The apply() Function for Data Transformation in Pandas

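The apply() function lets you run a custom function over your data, which makes it useful for transformations that the built-in methods do not cover. On a Series (a single column), apply() calls the function once for each element. On a DataFrame, it calls the function once for each column by default (axis=0) or once for each row (axis=1), passing that column or row as a Series. Let us see some examples to: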

  • Implement the apply() function in Pandas
  • Get the sum of each column or row by applying a function using apply()
  • Get the sum of each column by using Pandas apply() with NumPy
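
Before the examples, a quick way to see what apply() passes to your function is to return the type name of the argument it receives. The following is only a minimal sketch: the small DataFrame and its column names "a" and "b" are made up for illustration and are not part of the examples below.

```python
import pandas as pd

# Hypothetical data, used only to inspect what apply() passes along
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Series.apply(): the function is called once per element
element_types = df["a"].apply(lambda x: type(x).__name__)

# DataFrame.apply(): the function is called once per column (axis=0),
# and each column arrives as a pandas Series
column_types = df.apply(lambda col: type(col).__name__)

print(element_types.tolist())
print(column_types.tolist())
```

The first list should contain scalar type names, while the second should contain "Series" once for each column.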

Implement the apply() function in Pandas

Let us see an example to show how to transform data in a pandas DataFrame by applying a function to a column:

# The apply() function in Pandas

import pandas as pd

data = {'num': [5, 9, 15]}

df = pd.DataFrame(data)
print(df)

# Square each value in 'num' and store the result in a new column
df['square'] = df['num'].apply(lambda x: x ** 2)

print(df)

Output

   num
0    5
1    9
2   15
   num  square
0    5      25
1    9      81
2   15     225

Here, the apply() method calls the lambda function on each value in the ‘num’ column and stores the squared results in a new ‘square’ column. The original ‘num’ column is left unchanged.
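
The lambda can also be replaced by a named function, and for simple arithmetic like this the same result can usually be obtained without apply() by operating on the whole column at once. Here is a minimal sketch under those assumptions. The helper name square and the column names square_apply and square_vectorized are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"num": [5, 9, 15]})

def square(x):
    """Hypothetical helper: return the square of a single value."""
    return x ** 2

# Same transformation as the lambda version, using a named function
df["square_apply"] = df["num"].apply(square)

# Column-wise arithmetic gives the same values without calling apply()
df["square_vectorized"] = df["num"] ** 2

print(df)
```

Both new columns hold 25, 81 and 225. A named function is easier to reuse and test, while the column-wise form is often faster on large data.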

Get the sum of each column or row by applying a function using apply()

Let us see an example that passes a custom function to apply() on a DataFrame. By default (axis=0), the function receives each column, so it returns the sum of each column. With axis=1, it receives each row and returns the sum of each row:

# Get the sum of each column (or row) by applying a function using apply()

import pandas as pd

def demo(res):
    return res.sum()

data = {
    "Maths": [92, 90, 88],
    "Science": [80, 80, 99]
}

df = pd.DataFrame(data)

# axis=0 is the default: demo() receives each column
res = df.apply(demo)

# To sum each row instead, pass axis=1:
# res = df.apply(demo, axis=1)

print(res)

Output

Maths      270
Science    259
dtype: int64
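
The output above shows the column totals. To see the row totals, the commented-out axis=1 call can be run instead. The following sketch reuses the same data and the same demo() function. The variable name row_totals is only chosen for illustration:

```python
import pandas as pd

def demo(res):
    # With axis=1, res is one row of the DataFrame as a Series
    return res.sum()

data = {
    "Maths": [92, 90, 88],
    "Science": [80, 80, 99]
}

df = pd.DataFrame(data)

# axis=1 applies demo() to each row instead of each column
row_totals = df.apply(demo, axis=1)

print(row_totals)
```

This gives 172, 170 and 187 for rows 0, 1 and 2, that is, the Maths and Science marks added together for each row.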

Get the sum of each column by using Pandas apply() with NumPy

Let us see an example that passes NumPy’s sum() function to Pandas apply(). Because axis=0 is the default, this calculates the column-wise totals (Maths and Science) rather than the row sums. Pass axis=1 to sum each row instead:

# Get the sum of each column by using Pandas apply() with NumPy

import pandas as pd
import numpy as np

data = {
    "Maths": [92, 90, 88],
    "Science": [80, 80, 99]
}

df = pd.DataFrame(data)

# axis = 0 is the default
res = df.apply(np.sum)

print(res)

Output

Maths      270
Science    259
dtype: int64
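
For a plain sum, the same totals can also be obtained with the built-in sum() method of the DataFrame, without apply(). The following sketch compares the two approaches on the same data for both axes. The variable names are only illustrative:

```python
import numpy as np
import pandas as pd

data = {
    "Maths": [92, 90, 88],
    "Science": [80, 80, 99]
}

df = pd.DataFrame(data)

# Column totals (axis=0 is the default) and row totals (axis=1) with apply()
col_totals = df.apply(np.sum)
row_totals = df.apply(np.sum, axis=1)

# The same totals using the DataFrame's own sum() method
col_totals_builtin = df.sum()
row_totals_builtin = df.sum(axis=1)

print(col_totals.equals(col_totals_builtin))
print(row_totals.equals(row_totals_builtin))
```

apply() is more useful when the function does something the built-in methods do not offer. For a simple sum, df.sum() is shorter to write.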

Studyopedia Editorial Staff
contact@studyopedia.com
