22 Dec String Operations on Text Data in Pandas
Pandas exposes vectorized string methods through the str accessor of a Series, so a single call applies the operation to every element. In this lesson, we will see how to perform the following operations on text data in a Pandas Series:
- lower(): Convert text data to lowercase
- upper(): Convert text data to uppercase
- title(): Convert text data to title case
- len(): Get the length of each element in the Series
- count(): Count the non-missing (non-NaN) values in the Series
- contains(): Check whether each element contains a given value
lower() method
To lowercase text data, use the lower() method in Pandas. Let us see an example:
import pandas as pd
# Data to be stored in the Pandas Series
data = ['Jacob', 'Amit', 'TRENT', 'Nathan', 'MaRtIN']
# Create a Series using the Series() method
s = pd.Series(data)
# Display the Series
print("Series: \n", s)
# Convert the text data to lowercase
print("\nLowercase data:\n",s.str.lower())
Output
Series:
0     Jacob
1      Amit
2     TRENT
3    Nathan
4    MaRtIN

Lowercase data:
0     jacob
1      amit
2     trent
3    nathan
4    martin
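A common reason to lowercase text is to compare values without regard to case. The following is a minimal sketch of that idea and is not part of the original example; the Series name `names` and the search value 'trent' are chosen here purely for illustration.

```python
import pandas as pd

# Illustrative data with inconsistent casing (an assumption for this sketch)
names = pd.Series(['Jacob', 'TRENT', 'MaRtIN'])

# Lowercase every element first, then compare against a lowercase target
matches = names.str.lower() == 'trent'

# Display which elements match, ignoring case
print(matches.tolist())
```

Normalizing case before comparing avoids missing values such as 'TRENT' or 'Trent' that differ from the target only in capitalization.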
upper() method
To uppercase text data, use the upper() method in Pandas. Let us see an example:
import pandas as pd
# Data to be stored in the Pandas Series
data = ['jaCoB', 'Amit', 'trent', 'Nathan', 'MaRtIN']
# Create a Series using the Series() method
s = pd.Series(data)
# Display the Series
print("Series: \n", s)
# Convert the text data to uppercase
print("\nUppercase data:\n",s.str.upper())
Output
Series:
0     jaCoB
1      Amit
2     trent
3    Nathan
4    MaRtIN

Uppercase data:
0     JACOB
1      AMIT
2     TRENT
3    NATHAN
4    MARTIN
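String methods on the str accessor return a new Series, so they can be chained. As a rough sketch, assuming some hypothetical country codes with stray spaces (the `codes` data is invented for this illustration), str.strip() and str.upper() can be combined to clean them up:

```python
import pandas as pd

# Hypothetical, untidy codes: mixed case and surrounding whitespace
codes = pd.Series([' in', 'us ', ' Uk '])

# Remove surrounding whitespace, then convert to uppercase
cleaned = codes.str.strip().str.upper()

# Display the cleaned values
print(cleaned.tolist())
```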
title() method
To convert the text data to title case, where the first letter of each word is capitalized and the rest are lowercased, use the title() method in Pandas. Let us see an example:
import pandas as pd
# Data to be stored in the Pandas Series
data = ['jaCoB', 'Amit', 'trent', 'NATHan', 'MaRtIN']
# Create a Series using the Series() method
s = pd.Series(data)
# Display the Series
print("Series: \n", s)
# Convert the text data to title case
print("\nTitle case data:\n",s.str.title())
Output
Series:
0     jaCoB
1      Amit
2     trent
3    NATHan
4    MaRtIN

Title case data:
0     Jacob
1      Amit
2     Trent
3    Nathan
4    Martin
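With single-word values, title() looks the same as simply capitalizing the first letter. The difference shows up with multi-word strings. The sketch below, which uses made-up two-word names, compares str.title() with str.capitalize():

```python
import pandas as pd

# Illustrative two-word names (an assumption for this sketch)
full_names = pd.Series(['jacob oram', 'NATHAN lyon'])

# title() capitalizes the first letter of every word
titled = full_names.str.title()

# capitalize() capitalizes only the first letter of the whole string
capitalized = full_names.str.capitalize()

print(titled.tolist())
print(capitalized.tolist())
```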
len() method
To get the length of each element in the Series, use the len() method in Pandas. Let us see an example:
import pandas as pd
# Data to be stored in the Pandas Series
data = ['Jacob Oram', 'Amit', 'Trent', 'Nathan Lyon', 'Martin']
# Create a Series using the Series() method
s = pd.Series(data)
# Display the Series
print("Series: \n", s)
# Get the length of each element
print("\nLength:\n",s.str.len())
Output
Series:
0     Jacob Oram
1           Amit
2          Trent
3    Nathan Lyon
4         Martin

Length:
0    10
1     4
2     5
3    11
4     6
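The lengths returned by str.len() form an ordinary numeric Series, so they can be used in a condition to filter the data. A minimal sketch, reusing the same names and an arbitrary threshold of 5 characters chosen only for illustration:

```python
import pandas as pd

# Same data as in the example above
s = pd.Series(['Jacob Oram', 'Amit', 'Trent', 'Nathan Lyon', 'Martin'])

# Keep only the elements longer than 5 characters (threshold is arbitrary)
long_names = s[s.str.len() > 5]

print(long_names.tolist())
```

Note that spaces count toward the length, which is why 'Jacob Oram' has a length of 10.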
count() method
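To count the non-missing values in a Series, use the count() method. Unlike the methods above, count() is called directly on the Series rather than through the str accessor, and it skips NaN values. Let us see an example. We have stored the data in the Series with some NaN values: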
import numpy as np
import pandas as pd
# Data to be stored in the Pandas Series
data = [np.nan, "Amit Diwan", "Trent", "Nathan Lyon", np.nan]
# Create a Series using the Series() method
series = pd.Series(data)
# Display the Series
print("Series:\n", series)
# Get the count
print("\nCount:\n", series.count())
Output
Series:
0            NaN
1     Amit Diwan
2          Trent
3    Nathan Lyon
4            NaN
dtype: object

Count:
3
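Series.count() should not be confused with str.count(), which is a separate method on the str accessor that counts how many times a pattern occurs inside each element. The sketch below contrasts the two on the same data; the pattern 'a' is an arbitrary choice for illustration, and missing values are dropped first to keep the result simple.

```python
import numpy as np
import pandas as pd

# Same data as in the example above
series = pd.Series([np.nan, "Amit Diwan", "Trent", "Nathan Lyon", np.nan])

# Series.count(): number of non-missing values in the whole Series
non_missing = series.count()

# str.count(): occurrences of the pattern 'a' in each remaining element
# (the pattern is treated as a regular expression and is case-sensitive)
a_counts = series.dropna().str.count('a')

print(non_missing)
print(a_counts.tolist())
```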
contains() method
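The contains() method, called through the str accessor, checks whether each element of a Series contains a given substring or regular expression and returns a Boolean value for each element. Let us see an example: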
import pandas as pd
# Data to be stored in the Pandas Series
data = ['Jacob Oram', 'Amit', 'Trent', 'Nathan Lyon', 'Martin']
# Create a Series using the Series() method
s = pd.Series(data)
# Display the Series
print("Series: \n", s)
# Search for a specific value
print("\nDoes the specific value exist?\n",s.str.contains('Amit'))
Output
Series:
0     Jacob Oram
1           Amit
2          Trent
3    Nathan Lyon
4         Martin

Does the specific value exist?
0    False
1     True
2    False
3    False
4    False
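The Boolean result of str.contains() is often used as a mask to select the matching rows. The following is a sketch under a few assumptions: the data includes a missing value, the search text 'am' is arbitrary, and the optional parameters case, na, and regex are set explicitly to make the match case-insensitive, treat NaN as no match, and search for plain text rather than a regular expression.

```python
import numpy as np
import pandas as pd

# Illustrative data, including one missing value
s = pd.Series(['Jacob Oram', 'Amit', np.nan, 'Nathan Lyon'])

# case=False: ignore case; na=False: treat missing values as "no match";
# regex=False: treat 'am' as literal text, not a regular expression
mask = s.str.contains('am', case=False, na=False, regex=False)

# Use the Boolean mask to keep only the matching elements
print(s[mask].tolist())
```

Passing na=False keeps the mask strictly Boolean, so it can be used for indexing even when the Series contains missing values.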