18 Apr Create Pivot Table with Pandas
Pivot Table reshapes data to summarize values across two dimensions, like creating a matrix of averages or sums.
Create pivot tables with pivot_table() to summarize data by categories, applying aggregation functions.
The pivot_table() in pandas is one of the most powerful tools for summarizing and analyzing data. Think of it as an Excel-style pivot table but with the flexibility of Python. It aggregates data across categories.
Syntax
pd.pivot_table(
data,
values=None,
index=None,
columns=None,
aggfunc='mean',
fill_value=None,
margins=False,
margins_name='All',
dropna=True,
observed=False,
sort=True
)
Here,
- data: The DataFrame you want to pivot.
- values: Column(s) to aggregate (e.g., “Sales”).
- index: Column(s) to group by for rows.
- columns: Column(s) to group by for columns.
- aggfunc: Aggregation function(s). Default is ‘mean’. Can be ‘sum’, ‘count’, ‘max’, ‘min’, or even custom functions.
- fill_value: Replace missing values with a number (e.g., 0).
- margins: If True, adds totals (like Excel’s “Grand Total”).
- margins_name: Name for the totals row/column (default “All”).
- dropna: If True, drops columns where all values are NaN.
- observed: For categorical data, if True, only shows observed categories.
- sort: If True, sorts the resulting pivot table by index/columns.
Let us see the following examples to understand how to create a Pivot Table in Pandas:
- Create a basic pivot table
- Create a pivot table with multiple aggregations
- Create a pivot table with grand total using margins = True
- Create a pivot table with missing values replaced with 0
Create a basic pivot table
Let us see an example to create a pivot table from the DataFrame to summarize sales. It shows the total sales of each product grouped by region, making comparisons easier:
# A basic pivot table to display the total sales of each product by region
import pandas as pd
data = {
'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
'Product': ['A', 'A', 'B', 'B', 'A', 'B'],
'Sales': [100, 150, 200, 130, 120, 170]
}
df = pd.DataFrame(data)
print("DataFrame:\n",df)
pivot = pd.pivot_table(df, values='Sales', index='Region', columns='Product')
print("\nPivot Table:\n",pivot)
Output
DataFrame: Region Product Sales 0 North A 100 1 South A 150 2 East B 200 3 West B 130 4 North A 120 5 South B 170 Pivot Table: Product A B Region East NaN 200.0 North 110.0 NaN South 150.0 170.0 West NaN 130.0
Create a pivot table with multiple aggregations
Let us see an example to create a pivot table with multiple aggregation functions. It displays both the sum and mean of sales side by side:
# A pivot table with multiple aggregations (sum and mean side by side)
import pandas as pd
data = {
'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
'Product': ['A', 'A', 'B', 'B', 'A', 'B'],
'Sales': [100, 150, 200, 130, 120, 170]
}
df = pd.DataFrame(data)
print("DataFrame:\n",df)
pivot = pd.pivot_table(df, values='Sales', index='Region', columns='Product', aggfunc=['sum', 'mean'])
print("\nPivot Table:\n",pivot)
Output
DataFrame:
Region Product Sales
0 North A 100
1 South A 150
2 East B 200
3 West B 130
4 North A 120
5 South B 170
Pivot Table:
sum mean
Product A B A B
Region
East NaN 200.0 NaN 200.0
North 220.0 NaN 110.0 NaN
South 150.0 170.0 150.0 170.0
West NaN 130.0 NaN 130.0
Create a pivot table with grand total using margins = True
Let us see an example to create a pivot table with margins enabled. This adds a grand total row and column, showing overall sales across all regions and products:
# A pivot table with grand total using margins = True
import pandas as pd
data = {
'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
'Product': ['A', 'A', 'B', 'B', 'A', 'B'],
'Sales': [100, 150, 200, 130, 120, 170]
}
df = pd.DataFrame(data)
print("DataFrame:\n",df)
pivot = pd.pivot_table(df, values='Sales', index='Region', columns='Product', aggfunc=['sum'], margins=True)
print("\nPivot Table:\n",pivot)
Output
DataFrame:
Region Product Sales
0 North A 100
1 South A 150
2 East B 200
3 West B 130
4 North A 120
5 South B 170
Pivot Table:
sum
Product A B All
Region
East NaN 200.0 200
North 220.0 NaN 220
South 150.0 170.0 320
West NaN 130.0 130
All 370.0 500.0 870
Create a pivot table with missing values replaced with 0
Let us see an example to create a pivot table with missing values replaced by 0. This ensures empty cells are filled, while also showing grand totals for regions and products:
# A pivot table with missing values replaced with 0
import pandas as pd
data = {
'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
'Product': ['A', 'A', 'B', 'B', 'A', 'B'],
'Sales': [100, 150, 200, 130, 120, 170]
}
df = pd.DataFrame(data)
print("DataFrame:\n",df)
pivot = pd.pivot_table(df, values='Sales', index='Region', columns='Product', aggfunc=['sum'], fill_value= 0,margins=True, margins_name='Total')
# pivot2 = pivot.fillna(0)
# print("\nPivot Table:\n",pivot2)
print("\nPivot Table:\n",pivot)
Output
DataFrame:
Region Product Sales
0 North A 100
1 South A 150
2 East B 200
3 West B 130
4 North A 120
5 South B 170
Pivot Table:
sum
Product A B Total
Region
East 0 200 200
North 220 0 220
South 150 170 320
West 0 130 130
Total 370 500 870
If you liked the tutorial, spread the word and share the link and our website, Studyopedia, with others:
For Videos, Join Our YouTube Channel: Join Now
Read More:
No Comments