24 Jan Categorical Data in Pandas
In this lesson, we will learn how to work with categorical data in Pandas. The category data type in Pandas corresponds to categorical variables in statistics. A categorical variable takes on a limited, and usually fixed, number of possible values. Examples are gender, blood type, country affiliation, and rating.
Let us see two examples:
- Create Categorical Series in Pandas
- Create Categorical DataFrame in Pandas
Create Categorical Series
Pass dtype="category" when creating a Series to make it a Categorical Series. Let us see an example:
import pandas as pd
# Creating a Categorical Series
s = pd.Series(["p", "q", "r", "s", "q"], dtype="category")
# Display the Series
print("Series = \n", s)
Output
Series = 
 0    p
1    q
2    r
3    s
4    q
dtype: category
Categories (4, object): [p, q, r, s]
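To see what the category dtype stores, the short sketch below inspects the same Series through the .cat accessor. The variable names categories and codes are only illustrative and are not part of the original example.

```python
import pandas as pd

# Same Categorical Series as in the example above
s = pd.Series(["p", "q", "r", "s", "q"], dtype="category")

# The distinct category labels, each stored only once
categories = list(s.cat.categories)

# One integer code per row, pointing into the category labels
codes = s.cat.codes.tolist()

print("Categories =", categories)
print("Codes =", codes)
```

Each row holds a small integer code rather than the label itself, which is why the repeated value "q" does not add a new category.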
Create Categorical DataFrame
Pass dtype="category" when creating a DataFrame to make every column categorical. Let us see an example. We have created three categorical columns here:
import pandas as pd
# Creating a Categorical DataFrame
df = pd.DataFrame({"Cat1": list("pqrs"), "Cat2": list("pqrp"), "Cat3": list("qrrr")}, dtype="category")
# Display the DataFrame
print("DataFrame = \n", df)
# Display the datatypes
print("\nDataType of each column = \n", df.dtypes)
Output
DataFrame = 
   Cat1 Cat2 Cat3
0    p    p    q
1    q    q    r
2    r    r    r
3    s    p    r

DataType of each column = 
 Cat1    category
Cat2    category
Cat3    category
dtype: object
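Each column of the DataFrame above gets its own set of categories, based on the values in that column. The sketch below shows this, and also converts a single column of an existing DataFrame with astype("category"). The second DataFrame (df2) and its Score column are made up for illustration.

```python
import pandas as pd

# Same Categorical DataFrame as in the example above
df = pd.DataFrame({"Cat1": list("pqrs"), "Cat2": list("pqrp"), "Cat3": list("qrrr")}, dtype="category")

# Collect the categories of each column separately
per_column = {col: list(df[col].cat.categories) for col in df.columns}
print("Categories per column =", per_column)

# Hypothetical DataFrame where only one column should be categorical
df2 = pd.DataFrame({"Cat1": list("pqrs"), "Score": [10, 20, 30, 40]})
df2["Cat1"] = df2["Cat1"].astype("category")
print("\nDataType of each column = \n", df2.dtypes)
```

Converting with astype("category") is useful when the data is already loaded and only some of the columns hold categorical values.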