Find and Remove Duplicate Rows in Pandas

To find duplicate rows in a Pandas DataFrame (or duplicate values in a Series), use the duplicated() method. To remove them, use the drop_duplicates() method.


Find Duplicates

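To find duplicate rows in a Pandas DataFrame or Series, use the duplicated() method. It returns a boolean Series with one entry per row: True if the row repeats an earlier row, False otherwise. By default, the first occurrence of a row is treated as the original and marked False; only the later copies are marked True.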

Let us see an example:

import pandas as pd

# Dataset
data = {
    'student': ["Amit", "John", "Amit", "David", "Steve"],
    'rank': [1, 4, 1, 5, 3],
    'marks': [95, 70, 95, 60, 90]
}

df = pd.DataFrame(data)

print("Student Records\n\n", df)

# Find duplicates
res = df.duplicated()
print("\nDescribing Duplicates:\n", res)

Output

Student Records

   student  rank  marks
0    Amit     1     95
1    John     4     70
2    Amit     1     95
3   David     5     60
4   Steve     3     90

Describing Duplicates:
0    False
1    False
2     True
3    False
4    False
dtype: bool
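
The default behavior can be adjusted. The following is a minimal sketch, reusing the student data from the example above, of the keep and subset parameters of duplicated(). The variable names all_copies and by_student are only illustrative and are not part of the original example.

```python
import pandas as pd

# Same dataset as in the example above
data = {
    'student': ["Amit", "John", "Amit", "David", "Steve"],
    'rank': [1, 4, 1, 5, 3],
    'marks': [95, 70, 95, 60, 90]
}
df = pd.DataFrame(data)

# keep=False marks every copy of a repeated row, including the first one
all_copies = df.duplicated(keep=False)
print(all_copies.tolist())   # [True, False, True, False, False]

# subset restricts the comparison to the listed columns
by_student = df.duplicated(subset=['student'])
print(by_student.tolist())   # [False, False, True, False, False]

# The boolean Series can be used as a mask to display the repeated rows
print(df[all_copies])

# Summing the default result counts the rows that repeat an earlier row
print(df.duplicated().sum())  # 1
```

Using keep=False is handy when you want to inspect all copies of a record before deciding which one to keep.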

Remove Duplicates

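To remove duplicate rows from a Pandas DataFrame or Series, use the drop_duplicates() method. It returns a new object with the repeated rows removed and leaves the original unchanged. By default, the first occurrence of each row is kept and the remaining rows retain their original index labels. Let us see an example: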

import pandas as pd

# Dataset
data = {
    'student': ["Amit", "John", "Amit", "David", "Steve"],
    'rank': [1, 4, 1, 5, 3],
    'marks': [95, 70, 95, 60, 90]
}

df = pd.DataFrame(data)

print("Student Records\n\n", df)

# Delete duplicates using drop_duplicates()
res = df.drop_duplicates()
print("\nNew DataFrame after deleting duplicates:\n", res)

Output

Student Records

   student  rank  marks
0    Amit     1     95
1    John     4     70
2    Amit     1     95
3   David     5     60
4   Steve     3     90

New DataFrame after deleting duplicates:
   student  rank  marks
0    Amit     1     95
1    John     4     70
3   David     5     60
4   Steve     3     90
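
Note that row 2 is gone from the result, so the index now jumps from 1 to 3. The sketch below, again using the same student data, shows a few common variations with the keep, subset and ignore_index parameters of drop_duplicates(), as well as the Series version. It assumes pandas 1.0 or later, where ignore_index is available; the variable names last_kept, renumbered and names are only illustrative.

```python
import pandas as pd

# Same dataset as in the example above
data = {
    'student': ["Amit", "John", "Amit", "David", "Steve"],
    'rank': [1, 4, 1, 5, 3],
    'marks': [95, 70, 95, 60, 90]
}
df = pd.DataFrame(data)

# keep='last' keeps the final occurrence of each repeated row
last_kept = df.drop_duplicates(keep='last')
print(last_kept.index.tolist())   # [1, 2, 3, 4]

# subset compares only the 'student' column;
# ignore_index=True renumbers the remaining rows from 0
renumbered = df.drop_duplicates(subset=['student'], ignore_index=True)
print(renumbered)

# drop_duplicates() also works on a Series
names = df['student'].drop_duplicates()
print(names.tolist())             # ['Amit', 'John', 'David', 'Steve']
```

Because drop_duplicates() returns a new object, assign the result to a variable (or back to df) if you want to keep the change.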

Studyopedia Editorial Staff
contact@studyopedia.com
