22 Dec
DataFrames in Pandas
A Pandas DataFrame is a two-dimensional tabular data structure, i.e. a table with rows and columns. The pandas.DataFrame() constructor creates one and accepts the following parameters:
- data: The data to be stored in the Pandas DataFrame
- index: The index values (row labels) for the resulting frame
- columns: The column labels for the resulting frame, used when the data does not already supply them
- dtype: The data type to apply to the frame; only a single type is allowed
- copy: Whether to copy the input data
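As a rough illustration of how these parameters fit together, the sketch below builds a small frame from a NumPy array. The sample values and the variable name raw are assumptions made up for this example, not part of the lesson's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values: one row per student, holding rank and marks
raw = np.array([[1, 95], [4, 70], [3, 80]])

df = pd.DataFrame(
    raw,                              # data: the values to store
    index=["Amit", "John", "Jacob"],  # index: the row labels
    columns=["rank", "marks"],        # columns: a bare array carries no labels
    dtype="float64",                  # dtype: one type applied to every column
    copy=True,                        # copy: keep the frame independent of raw
)

# Changing the source array afterwards does not touch the copied frame
raw[0, 0] = 99

print(df)
print(df.dtypes)
```

Because a plain array has no labels of its own, index and columns supply them here. With a dictionary as data, the keys already provide the column labels.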
In this lesson, we will see some examples to:
- Create a Pandas DataFrame
- Access a group of rows or columns in a Pandas DataFrame
- Access a group of rows or columns by integer positions in a Pandas DataFrame
- Name your own indexes in a Pandas DataFrame
- Iterate a DataFrame
Create a Pandas DataFrame
To create a DataFrame in pandas, use the pandas.DataFrame() constructor. Let us see an example with student records:
```python
import pandas as pd

# Dataset
data = {
    'student': ["Amit", "John", "Jacob", "David", "Steve"],
    'rank': [1, 4, 3, 5, 2],
    'marks': [95, 70, 80, 60, 90]
}

df = pd.DataFrame(data)

print("Student Records\n\n", df)
```
Output
```
Student Records

   student  rank  marks
0    Amit     1     95
1    John     4     70
2   Jacob     3     80
3   David     5     60
4   Steve     2     90
```
The values 0, 1, 2, and so on are the default index labels that pandas adds automatically when no index is supplied.
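To inspect that automatically generated index directly, a minimal sketch along these lines can help. It reuses the student dataset from the example above and only reads standard DataFrame attributes.

```python
import pandas as pd

# Same student dataset as in the example above
data = {
    'student': ["Amit", "John", "Jacob", "David", "Steve"],
    'rank': [1, 4, 3, 5, 2],
    'marks': [95, 70, 80, 60, 90]
}

df = pd.DataFrame(data)

# With no index argument, pandas generates integer labels starting at 0
print(df.index)        # RangeIndex(start=0, stop=5, step=1)
print(list(df.index))  # [0, 1, 2, 3, 4]

# shape is (number of rows, number of columns)
print(df.shape)        # (5, 3)
```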
Access a group of rows or columns in a Pandas DataFrame
The DataFrame.loc property accesses a group of rows or columns by their labels. Let us see an example:
```python
import pandas as pd

# Dataset
data = {
    'Student': ["Amit", "John", "Jacob", "David", "Steve"],
    'Rank': [1, 4, 3, 5, 2],
    'Marks': [95, 70, 80, 60, 90]
}

# Create a DataFrame using the DataFrame() constructor with custom row labels
df = pd.DataFrame(data, index=['RowA', 'RowB', 'RowC', 'RowD', 'RowE'])

print("Student Records\n\n", df)

# Access the value in the Student column for the row labelled RowA
print("\nValue = ", df.loc['RowA', 'Student'])
```
Output
```
Student Records

       Student  Rank  Marks
RowA    Amit     1     95
RowB    John     4     70
RowC   Jacob     3     80
RowD   David     5     60
RowE   Steve     2     90

Value =  Amit
```
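The example above reads a single value, but loc also accepts lists of labels, label slices, and boolean conditions. The following is a minimal sketch of those forms on the same dataset. The variable names subset, sliced, and high_scorers are chosen only for illustration.

```python
import pandas as pd

data = {
    'Student': ["Amit", "John", "Jacob", "David", "Steve"],
    'Rank': [1, 4, 3, 5, 2],
    'Marks': [95, 70, 80, 60, 90]
}
df = pd.DataFrame(data, index=['RowA', 'RowB', 'RowC', 'RowD', 'RowE'])

# Lists of labels select several rows and columns at once
subset = df.loc[['RowB', 'RowD'], ['Student', 'Marks']]
print(subset)

# A label slice includes BOTH endpoints, unlike ordinary Python slicing
sliced = df.loc['RowB':'RowD']
print(sliced)

# A boolean condition selects the rows where it holds
high_scorers = df.loc[df['Marks'] >= 80, 'Student']
print(high_scorers.tolist())  # ['Amit', 'Jacob', 'Steve']
```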
Access a group of rows or columns by integer positions in a Pandas DataFrame
The DataFrame.iloc property accesses a group of rows or columns by integer position, counting from 0. The DataFrame in this example also has custom row labels, but iloc ignores them and selects purely by position. Let us see an example:
```python
import pandas as pd

# Dataset
data = {
    'Student': ["Amit", "John", "Jacob", "David", "Steve"],
    'Rank': [1, 4, 3, 5, 2],
    'Marks': [95, 70, 80, 60, 90]
}

# Create a DataFrame using the DataFrame() constructor with custom row labels
df = pd.DataFrame(data, index=['RowA', 'RowB', 'RowC', 'RowD', 'RowE'])

print("Student Records\n\n", df)

# Access the rows at integer positions 1 and 2 (the second and third rows)
print("\nValue = \n", df.iloc[[1, 2]])
```
Output
```
Student Records

       Student  Rank  Marks
RowA    Amit     1     95
RowB    John     4     70
RowC   Jacob     3     80
RowD   David     5     60
RowE   Steve     2     90

Value =
       Student  Rank  Marks
RowB    John     4     70
RowC   Jacob     3     80
```
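The example above selects whole rows. As a further sketch on the same dataset, iloc can also take a row position together with a column position, a position slice, or a negative position. The variable names below are illustrative only.

```python
import pandas as pd

data = {
    'Student': ["Amit", "John", "Jacob", "David", "Steve"],
    'Rank': [1, 4, 3, 5, 2],
    'Marks': [95, 70, 80, 60, 90]
}
df = pd.DataFrame(data, index=['RowA', 'RowB', 'RowC', 'RowD', 'RowE'])

# Row position 0, column position 0: a single value
first_cell = df.iloc[0, 0]
print(first_cell)  # Amit

# A position slice excludes its end, like ordinary Python slicing:
# rows at positions 1 and 2, columns at positions 0 and 2
block = df.iloc[1:3, [0, 2]]
print(block)

# A negative position counts from the end; this is the last row
last_row = df.iloc[-1]
print(last_row['Student'])  # Steve
```

Note the contrast with loc: a label slice such as 'RowB':'RowD' includes its end label, while the position slice 1:3 stops before position 3.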
Name your indexes in a Pandas DataFrame
The index argument of DataFrame() replaces the default 0, 1, 2 row labels with names of your own. Let us see an example:
```python
import pandas as pd

# Dataset
data = {
    'Student': ["Amit", "John", "Jacob", "David", "Steve"],
    'Rank': [1, 4, 3, 5, 2],
    'Marks': [95, 70, 80, 60, 90]
}

# Create a DataFrame using the DataFrame() constructor
# The index argument is used to set the index
df = pd.DataFrame(data, index=['Student1', 'Student2', 'Student3', 'Student4', 'Student5'])

print("Student Records\n\n", df)
```
Output
```
Student Records

           Student  Rank  Marks
Student1    Amit     1     95
Student2    John     4     70
Student3   Jacob     3     80
Student4   David     5     60
Student5   Steve     2     90
```
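Row labels can also be set or changed after the DataFrame already exists. The sketch below shows one possible way, using the same dataset. The index name 'ID' and the replacement label 'Topper' are made up for illustration.

```python
import pandas as pd

data = {
    'Student': ["Amit", "John", "Jacob", "David", "Steve"],
    'Rank': [1, 4, 3, 5, 2],
    'Marks': [95, 70, 80, 60, 90]
}

# Start with the default 0..4 index
df = pd.DataFrame(data)

# Assign new row labels; the list length must match the number of rows
df.index = ['Student1', 'Student2', 'Student3', 'Student4', 'Student5']

# Optionally give the index itself a name (hypothetical name 'ID')
df.index.name = 'ID'

# rename() returns a new DataFrame with selected labels changed
renamed = df.rename(index={'Student1': 'Topper'})

print(df)
print(renamed)
```

Since rename() returns a new object here, df keeps its original labels and only renamed carries the changed one.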
Iterate a DataFrame
Iterating over a DataFrame with a for loop yields its column names, not its rows, as in the example below:
```python
import pandas as pd

# Dataset
data = {
    'Student': ["Amit", "John", "Jacob", "David", "Steve"],
    'Rank': [1, 4, 3, 5, 2],
    'Marks': [95, 70, 80, 60, 90]
}

# Create a DataFrame using the DataFrame() constructor
# The index argument is used to set the index
df = pd.DataFrame(data, index=['Student1', 'Student2', 'Student3', 'Student4', 'Student5'])

print("Student Records\n\n", df)

# Iterating to display the columns
print("\nDisplaying the columns:")
for col in df:
    print(col)
```
Output
```
Student Records

           Student  Rank  Marks
Student1    Amit     1     95
Student2    John     4     70
Student3   Jacob     3     80
Student4   David     5     60
Student5   Steve     2     90

Displaying the columns:
Student
Rank
Marks
```
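Since a plain for loop only yields column names, visiting the rows or the column contents needs one of the DataFrame's iteration methods. The following is a minimal sketch of iterrows(), itertuples(), and items() on the same dataset.

```python
import pandas as pd

data = {
    'Student': ["Amit", "John", "Jacob", "David", "Steve"],
    'Rank': [1, 4, 3, 5, 2],
    'Marks': [95, 70, 80, 60, 90]
}
df = pd.DataFrame(data, index=['Student1', 'Student2', 'Student3', 'Student4', 'Student5'])

# iterrows() yields (row label, row as a Series) pairs
print("Rows with iterrows():")
for label, row in df.iterrows():
    print(label, row['Student'], row['Marks'])

# itertuples() yields one named tuple per row; the row label is in .Index
print("\nRows with itertuples():")
for record in df.itertuples():
    print(record.Index, record.Student, record.Rank)

# items() yields (column name, column as a Series) pairs
print("\nColumns with items():")
for name, column in df.items():
    print(name, column.tolist())
```

For large frames, itertuples() is generally the faster of the two row-wise options, and column-wise operations that avoid explicit loops are usually faster still.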