18 Apr Group By and Aggregate in Pandas
The groupby() method splits a DataFrame into groups based on one or more columns, for example one group per distinct value of a column. Once the data is grouped, agg() or a direct method such as sum(), mean(), or count() computes one summary value per group.
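To make the split step visible, here is a minimal sketch that iterates over the groups before aggregating them. The sample data and the names grouped, group_sizes, and totals are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
df = pd.DataFrame({"Category": ["A", "A", "B"], "Value": [1, 2, 3]})

# Split: iterating over a GroupBy object yields (group key, sub-DataFrame) pairs
grouped = df.groupby("Category")
group_sizes = {}
for key, group in grouped:
    group_sizes[key] = len(group)
    print(key, group["Value"].tolist())

# Apply and combine: one aggregated value per group
totals = grouped["Value"].sum()
print(totals)
```

Each group is an ordinary DataFrame holding only the rows for that key, which is why any aggregation function can then be applied to it.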
Let us see two examples:
- Group by and aggregate to calculate the sum
- Group by and aggregate to calculate the mean
Group by and aggregate to calculate the sum
Let us see an example to group the DataFrame by Category and calculate the sum of Value for each group:
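# Groupby and aggregate (sum)
import pandas as pd

# Sample data: each row has a category label and a numeric value
data = {'Category': ['A', 'A', 'B', 'C', 'A'],
        'Value': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)
print(df)

# Group rows by Category and sum the Value column within each group
print(df.groupby('Category').agg({'Value': 'sum'}))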
Output
  Category  Value
0        A     10
1        A     20
2        B     30
3        C     40
4        A     50
          Value
Category       
A            80
B            30
C            40
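The same agg() call can compute several statistics at once. The sketch below uses pandas named aggregation on the same data; the output column names total, average, and rows are assumptions picked for readability, not names from the tutorial.

```python
import pandas as pd

df = pd.DataFrame({"Category": ["A", "A", "B", "C", "A"],
                   "Value": [10, 20, 30, 40, 50]})

# Named aggregation: new_column=(source_column, aggregation_function)
summary = df.groupby("Category").agg(
    total=("Value", "sum"),
    average=("Value", "mean"),
    rows=("Value", "count"),
)
print(summary)
```

Each keyword becomes a column in the result, so the summary stays flat and easy to read when more than one function is applied.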
Group by and aggregate to calculate the mean
Let us see an example to group the DataFrame by car brand and compute the average mileage for each brand.
# Groupby and aggregate (mean)
# Find the average mileage for each car brand
import pandas as pd
data = {
    'mileage': [15, 12, 17, 18, 20],
    'model': ['Sonet', 'Seltos', 'Creta', 'Carens', 'Amaze'],
    'car': ['Kia', 'Kia', 'Hyundai', 'Kia', 'Hyundai']
}
df = pd.DataFrame(data)
print(df)
# Select the numeric mileage column before averaging; the text column
# 'model' has no meaningful mean
print(df.groupby("car")["mileage"].mean())
Output
   mileage   model      car
0       15   Sonet      Kia
1       12  Seltos      Kia
2       17   Creta  Hyundai
3       18  Carens      Kia
4       20   Amaze  Hyundai
car
Hyundai    18.5
Kia        15.0
Name: mileage, dtype: float64
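An alternative to selecting the mileage column is to average the whole grouped DataFrame while skipping text columns. This is a sketch that assumes a pandas version where the groupby mean() accepts the numeric_only parameter; on recent versions, calling mean() without it may raise an error because the model column holds strings.

```python
import pandas as pd

df = pd.DataFrame({
    "mileage": [15, 12, 17, 18, 20],
    "model": ["Sonet", "Seltos", "Creta", "Carens", "Amaze"],
    "car": ["Kia", "Kia", "Hyundai", "Kia", "Hyundai"],
})

# numeric_only=True drops non-numeric columns such as 'model' before averaging
avg = df.groupby("car").mean(numeric_only=True)
print(avg)
```

The result here is a DataFrame with a single mileage column, whereas selecting ["mileage"] first returns a Series.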