Categorical Data in Pandas

In this lesson, we will learn how to work with categorical data in Pandas. Categorical is a Pandas data type that corresponds to categorical variables in statistics. A categorical variable takes on a limited, and usually fixed, number of possible values. Examples are gender, blood type, country affiliation, and rating.

Let us see two examples:

  • Create Categorical Series in Pandas
  • Create Categorical DataFrame in Pandas

Create Categorical Series

Pass dtype="category" while creating a Series to create a Categorical Series. Let us see an example:

import pandas as pd

# Creating a Categorical Series 
s = pd.Series(["p", "q", "r", "s", "q"], dtype="category")

# Display the Series
print("Series = \n", s)

Output

Series = 
0    p
1    q
2    r
3    s
4    q
dtype: category
Categories (4, object): [p, q, r, s]
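
To look inside a Categorical Series like the one above, the .cat accessor exposes the categories and the integer code Pandas stores for each value. The following is a minimal sketch that reuses the same data. The variable names categories and codes are chosen here for illustration only:

```python
import pandas as pd

# Same Categorical Series as in the example above
s = pd.Series(["p", "q", "r", "s", "q"], dtype="category")

# Unique categories, inferred from the data
categories = list(s.cat.categories)

# Integer code of each value, i.e. its position in the categories
codes = s.cat.codes.tolist()

print("Categories =", categories)  # Categories = ['p', 'q', 'r', 's']
print("Codes =", codes)            # Codes = [0, 1, 2, 3, 1]
```

The repeated value "q" gets the same code both times, which is why a categorical column can take less memory than plain strings when values repeat often.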

Create Categorical DataFrame

Pass dtype="category" while creating a DataFrame to make every column categorical. Let us see an example. We have created three categorical columns here:

import pandas as pd

# Creating a Categorical DataFrame 
df = pd.DataFrame({"Cat1": list("pqrs"), "Cat2": list("pqrp"), "Cat3": list("qrrr")}, dtype="category")

# Display the DataFrame
print("DataFrame = \n", df)

# Display the datatypes
print("\nDataType of each column = \n", df.dtypes)

Output

DataFrame = 
   Cat1 Cat2 Cat3
0    p    p    q
1    q    q    r
2    r    r    r
3    s    p    r

DataType of each column = 
Cat1    category
Cat2    category
Cat3    category
dtype: object
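
Passing dtype="category" to the constructor converts every column. When only some columns of an existing DataFrame should be categorical, astype("category") can be applied to a single column instead. The following is a rough sketch under that assumption. The two-column DataFrame and the helper variable names are made up for illustration:

```python
import pandas as pd

# A regular DataFrame; string columns are not categorical by default
df = pd.DataFrame({"Cat1": list("pqrs"), "Cat2": list("pqrp")})

# Convert only the Cat2 column to the category dtype
df["Cat2"] = df["Cat2"].astype("category")

# Check which columns are categorical now
cat1_is_categorical = isinstance(df["Cat1"].dtype, pd.CategoricalDtype)
cat2_is_categorical = isinstance(df["Cat2"].dtype, pd.CategoricalDtype)
cat2_categories = list(df["Cat2"].cat.categories)

print("Cat1 categorical?", cat1_is_categorical)  # False
print("Cat2 categorical?", cat2_is_categorical)  # True
print("Cat2 categories =", cat2_categories)      # ['p', 'q', 'r']
```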

Studyopedia Editorial Staff
contact@studyopedia.com
