18 Apr The apply() Function for Data Transformation in Pandas
The apply() function lets you run a custom function on your data, enabling flexible transformations. On a Series (a single column), apply() calls the function on each element. On a DataFrame, it calls the function once per column (axis=0, the default) or once per row (axis=1). Let us see some examples to:
- Implement the apply() function in Pandas
- Get the sum of each row by applying a function using apply()
- Get the sum of each row by using Pandas apply() with NumPy
Implement the apply() function in Pandas
Let us see an example to show how to transform data in a pandas DataFrame by applying a function to a column:
# The apply() function in Pandas
import pandas as pd
data = {'num': [5, 9, 15]}
df = pd.DataFrame(data)
print(df)
df['square'] = df['num'].apply(lambda x: x**2)
print(df)
Output
num
0 5
1 9
2 15
num square
0 5 25
1 9 81
2 15 225
Here, the apply() method calls a lambda function on each value in the ‘num’ column, squares it, and stores the results in a new ‘square’ column. The original ‘num’ column is left unchanged.
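The same pattern works with a named function instead of a lambda. The following is a minimal sketch, not part of the original example: the grade() helper and the ‘level’ column are hypothetical names chosen for illustration. It also shows that simple arithmetic such as squaring can be written directly on the column, without apply():

```python
import pandas as pd

def grade(num):
    # Hypothetical helper: label a value as "high" or "low"
    return "high" if num >= 10 else "low"

df = pd.DataFrame({"num": [5, 9, 15]})

# apply() calls grade() once for every value in the 'num' column
df["level"] = df["num"].apply(grade)

# For plain arithmetic, operating on the whole column gives the same
# result as df["num"].apply(lambda x: x**2)
df["square"] = df["num"] ** 2

print(df)
```

A named function is usually easier to read and reuse than a lambda once the logic grows beyond a single expression.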
Get the sum of each row by applying a function using apply()
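Let us see an example that passes a custom function to apply() on a DataFrame. By default (axis=0), the function receives each column in turn, so it returns the sum of each column. With axis=1, it receives each row instead and returns the sum of each row. The code below uses the default, and the commented-out line shows the row-wise call: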
# Get the sum of each row by applying a function using apply()
import pandas as pd
# res is one column (axis=0) or one row (axis=1), passed in as a Series
def demo(res):
    return res.sum()
data = {
"Maths": [92, 90, 88],
"Science": [80, 80, 99]
}
df = pd.DataFrame(data)
# axis = 0 is the default
res = df.apply(demo)
# res = df.apply(demo, axis = 1)
print(res)
Output
Maths 270
Science 259
dtype: int64
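To get the sum of each row rather than each column, pass axis=1. The sketch below reuses the same data and the same demo() function. The parameter name values and the variable name row_totals are assumptions made here for readability:

```python
import pandas as pd

def demo(values):
    # With axis=1, values is one row of the DataFrame, passed as a Series
    return values.sum()

df = pd.DataFrame({
    "Maths": [92, 90, 88],
    "Science": [80, 80, 99]
})

# axis=1 applies demo() to each row instead of each column
row_totals = df.apply(demo, axis=1)
print(row_totals)  # row totals are 172, 170 and 187
```

The result is a Series indexed by the original row labels (0, 1, 2), with one total per row.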
Get the sum of each row by using Pandas apply() with NumPy
Let us see an example to apply NumPy’s sum() function across the DataFrame using Pandas apply().
This calculates the column-wise totals (Maths and Science) instead of row sums, since axis=0 is the default.
# Get the sum of each row by using Pandas apply() with NumPy
import pandas as pd
import numpy as np
data = {
"Maths": [92, 90, 88],
"Science": [80, 80, 99]
}
df = pd.DataFrame(data)
# axis = 0 is the default
res = df.apply(np.sum)
print(res)
Output
Maths 270
Science 259
dtype: int64
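For row sums with NumPy, the same call takes axis=1. The following sketch assumes the same data as above, and the variable names row_totals and builtin_totals are illustrative. It also compares the result with the built-in DataFrame sum() method, which is generally the simpler choice for plain totals:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Maths": [92, 90, 88],
    "Science": [80, 80, 99]
})

# axis=1 passes each row to np.sum, giving one total per row
row_totals = df.apply(np.sum, axis=1)

# The built-in sum() method computes the same row totals without apply()
builtin_totals = df.sum(axis=1)

print(row_totals)
print(builtin_totals)
```

Both approaches give the totals 172, 170 and 187. apply() is most useful when the operation is a custom function that has no built-in equivalent.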