14 Feb R Data Frames
A Data Frame stores data in tabular form, i.e. rows and columns. Row names must be unique, and column names can never be empty. A Data Frame can hold data of different types, but all values within a single column must share the same data type.
We will cover the following topics:
- Create a Data Frame in R
- Access Data Frame Items
- Count the Rows and Columns in a Data Frame
- Length of a Data Frame
- Summarize the Data in the Data Frame
Create a Data Frame in R
To create a Data Frame in the R programming language, use the data.frame() function. Each argument becomes one column, and all columns must have the same number of values. Let us see an example:
```r
# Create a Data Frame
# We have four columns in the Data Frame
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50)
)

# Display the DataFrame
my_DataFrame
```
Output
```
   ID  Name Marks Points
1 S01  Amit    99  150.4
2 S02  John    90  120.3
3 S03 David    85  105.5
4 S04 Virat    97  135.6
5 S05 Jacob    78  100.5
```
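The rules stated in the introduction can be checked directly. The following is a minimal sketch, not part of the original example: it rebuilds the same Data Frame, lists the class of each column, and shows that assigning duplicate row names is rejected. The stringsAsFactors = FALSE argument is an assumption added here so that the result does not depend on the R version, and the names col_types and dup_result are hypothetical.

```r
# Rebuild the Data Frame from the example above.
# stringsAsFactors = FALSE is an added assumption: it keeps text columns
# as character vectors regardless of the R version in use.
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50),
  stringsAsFactors = FALSE
)

# Each column holds exactly one data type
col_types <- sapply(my_DataFrame, class)
print(col_types)

# Row names must be unique: assigning duplicates raises an error,
# which is caught here so the script keeps running
dup_result <- tryCatch(
  {
    row.names(my_DataFrame) <- c("a", "a", "b", "c", "d")
    "accepted"
  },
  error = function(e) "rejected"
)
print(dup_result)
```

With these settings, ID and Name should be reported as character and Marks and Points as numeric, while the duplicate row names are rejected.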
Access Data Frame Items
To access data frame items in the R programming language, use any of the following:
- Single Bracket
- Double Brackets
- Dollar sign
Let us see examples of all three ways mentioned above, beginning with the single bracket.
Access items using a Single Bracket
Let us see an example to access data frame items using [ ]. Inside the brackets, specify the number of the column you want to access. The result is itself a Data Frame containing only that column:
```r
# Create a Data Frame
# We have four columns in the Data Frame
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50)
)

# Display the Data Frame
my_DataFrame

# Access items using a single bracket
# Mention the column number in the single bracket
my_DataFrame[2]
```
Output
```
   ID  Name Marks Points
1 S01  Amit    99  150.4
2 S02  John    90  120.3
3 S03 David    85  105.5
4 S04 Virat    97  135.6
5 S05 Jacob    78  100.5
   Name
1  Amit
2  John
3 David
4 Virat
5 Jacob
```
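Single brackets accept more than a single column number. The sketch below is an illustration added to the tutorial's example, with hypothetical variable names (name_col, same_col, subset_cols, one_value) and an explicit stringsAsFactors = FALSE that the original code does not use. It selects a column by name, selects several columns at once, and picks one cell with the [row, column] form.

```r
# Same Data Frame as above; stringsAsFactors = FALSE is an added assumption
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50),
  stringsAsFactors = FALSE
)

# A single index returns a one-column Data Frame
name_col <- my_DataFrame[2]
print(class(name_col))   # "data.frame"

# The same column selected by name instead of position
same_col <- my_DataFrame["Name"]

# Several columns at once, still a Data Frame
subset_cols <- my_DataFrame[c("ID", "Marks")]
print(subset_cols)

# [row, column] returns a single cell: row 2 of the Marks column
one_value <- my_DataFrame[2, 3]
print(one_value)         # 90
```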
Access items using Double Brackets
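Let us see an example to access data frame items using [[ ]]. Inside the double brackets, specify the name of the column you want to access. Unlike a single bracket, double brackets return the column's values as a vector rather than as a Data Frame: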
```r
# Create a Data Frame
# We have four columns in the Data Frame
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50)
)

# Display the DataFrame
my_DataFrame

# Access items using double brackets
# Mention the column name in the double brackets
my_DataFrame[["Name"]]
```
Output
```
   ID  Name Marks Points
1 S01  Amit    99  150.4
2 S02  John    90  120.3
3 S03 David    85  105.5
4 S04 Virat    97  135.6
5 S05 Jacob    78  100.5
[1] Amit  John  David Virat Jacob
Levels: Amit David Jacob John Virat
```
Note: the Levels line appears because this output comes from a version of R earlier than 4.0.0, where data.frame() converts strings to factors by default. In R 4.0.0 and later, the same code prints a quoted character vector without a Levels line.
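To make the difference between single and double brackets concrete, here is a short sketch added for illustration. It assumes stringsAsFactors = FALSE (not used in the original code) so that the Name column is a plain character vector on any R version, and the variable names are hypothetical.

```r
# Same Data Frame as above; stringsAsFactors = FALSE is an added assumption
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50),
  stringsAsFactors = FALSE
)

# Double brackets return the column itself as a vector
name_vec <- my_DataFrame[["Name"]]
print(class(name_vec))        # "character"

# A column position works as well as a column name
by_position <- my_DataFrame[[2]]
print(identical(name_vec, by_position))

# Because the result is a vector, it can be indexed directly
third_name <- my_DataFrame[["Name"]][3]
print(third_name)             # "David"

# For comparison, a single bracket still returns a Data Frame
print(is.data.frame(my_DataFrame["Name"]))
```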
Access items using $
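Let us see an example to access data frame items using $. Write the Data Frame name, the $ sign, and then the column name. Like double brackets, $ returns the column's values as a vector: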
```r
# Create a Data Frame
# We have four columns in the Data Frame
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50)
)

# Display the DataFrame
my_DataFrame

# Access items using $
# Mention the column name preceded by the $ sign
my_DataFrame$Marks
```
Output
```
   ID  Name Marks Points
1 S01  Amit    99  150.4
2 S02  John    90  120.3
3 S03 David    85  105.5
4 S04 Virat    97  135.6
5 S05 Jacob    78  100.5
[1] 99 90 85 97 78
```
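Since $ gives back an ordinary vector, the result can be passed straight to other functions. The following sketch is an added illustration with hypothetical variable names (avg_marks, top_names); stringsAsFactors = FALSE is an assumption not present in the original code.

```r
# Same Data Frame as above; stringsAsFactors = FALSE is an added assumption
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50),
  stringsAsFactors = FALSE
)

# Compute on a column extracted with $
avg_marks <- mean(my_DataFrame$Marks)
print(avg_marks)    # 89.8

# Use one column to filter another:
# names of the students who scored more than 90 marks
top_names <- my_DataFrame$Name[my_DataFrame$Marks > 90]
print(top_names)    # "Amit" "Virat"
```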
Count the Rows and Columns in a Data Frame
The dim() function returns the dimensions of a Data Frame as two numbers: the count of rows followed by the count of columns. Let us see an example:
```r
# Create a Data Frame
# We have four columns in the Data Frame
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50)
)

# Display the DataFrame
my_DataFrame

# Get the count of rows and columns using the dim()
dim(my_DataFrame)
```
Output
```
   ID  Name Marks Points
1 S01  Amit    99  150.4
2 S02  John    90  120.3
3 S03 David    85  105.5
4 S04 Virat    97  135.6
5 S05 Jacob    78  100.5
[1] 5 4
```
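When only one of the two counts is needed, base R also provides nrow() and ncol(). The sketch below is an added illustration showing that they agree with the two elements of dim(); the variable names are hypothetical.

```r
# Same Data Frame as above
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50)
)

# dim() returns rows first, then columns
dims <- dim(my_DataFrame)

# nrow() and ncol() return each count on its own
row_count <- nrow(my_DataFrame)
col_count <- ncol(my_DataFrame)

print(row_count)   # 5
print(col_count)   # 4

# Both approaches give the same numbers
print(dims[1] == row_count && dims[2] == col_count)
```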
Length of a Data Frame
To get the length of a Data Frame, use the length() function in the R programming language. For a Data Frame, length() returns the number of columns, not the number of rows. Let us see an example:
```r
# Create a Data Frame
# We have four columns in the Data Frame
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50)
)

# Display the Data Frame
my_DataFrame

# Get the length of the Data Frame
# The length gets the count of columns of a Data Frame
length(my_DataFrame)
```
Output
```
   ID  Name Marks Points
1 S01  Amit    99  150.4
2 S02  John    90  120.3
3 S03 David    85  105.5
4 S04 Virat    97  135.6
5 S05 Jacob    78  100.5
[1] 4
```
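A common point of confusion is that length() counts columns when applied to the whole Data Frame but counts values when applied to a single column. The following added sketch, with hypothetical variable names, shows both cases side by side.

```r
# Same Data Frame as above
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50)
)

# On the whole Data Frame, length() counts columns
frame_length <- length(my_DataFrame)
print(frame_length)    # 4

# On a single column, length() counts the values, i.e. the rows
column_length <- length(my_DataFrame$Marks)
print(column_length)   # 5

# The two results match ncol() and nrow() respectively
print(frame_length == ncol(my_DataFrame))
print(column_length == nrow(my_DataFrame))
```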
Summarize the Data in the Data Frame
The summary() function summarizes the data in each column of an R Data Frame. For numeric columns, it displays the minimum, first quartile, median, mean, third quartile, and maximum. Let us see an example:
```r
# Create a Data Frame
# We have four columns in the Data Frame
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50)
)

# Display the DataFrame
my_DataFrame

# Get the summary of the Data Frame
summary(my_DataFrame)
```
Output
```
   ID  Name Marks Points
1 S01  Amit    99  150.4
2 S02  John    90  120.3
3 S03 David    85  105.5
4 S04 Virat    97  135.6
5 S05 Jacob    78  100.5
   ID       Name       Marks          Points
 S01:1   Amit :1   Min.   :78.0   Min.   :100.5
 S02:1   David:1   1st Qu.:85.0   1st Qu.:105.5
 S03:1   Jacob:1   Median :90.0   Median :120.3
 S04:1   John :1   Mean   :89.8   Mean   :122.5
 S05:1   Virat:1   3rd Qu.:97.0   3rd Qu.:135.6
                   Max.   :99.0   Max.   :150.4
```
Note: the per-value counts shown for ID and Name appear because this output comes from a version of R earlier than 4.0.0, where strings are converted to factors by default. In R 4.0.0 and later, summary() reports only the length, class, and mode for these two columns.
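The summary() function also works on a single column, and the individual statistics can then be read by name. The sketch below is an added illustration; marks_summary and the other variable names are hypothetical.

```r
# Same Data Frame as above
my_DataFrame <- data.frame(
  ID = c("S01", "S02", "S03", "S04", "S05"),
  Name = c("Amit", "John", "David", "Virat", "Jacob"),
  Marks = c(99, 90, 85, 97, 78),
  Points = c(150.40, 120.30, 105.50, 135.60, 100.50)
)

# Summarize only the Marks column
marks_summary <- summary(my_DataFrame$Marks)
print(marks_summary)

# The result is a named set of values, so each statistic
# can be extracted with double brackets and its name
marks_mean <- marks_summary[["Mean"]]
marks_median <- marks_summary[["Median"]]
marks_max <- marks_summary[["Max."]]

print(marks_mean)     # 89.8
print(marks_median)   # 90
print(marks_max)      # 99
```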