Hi guys! Today we will share a guide on how to merge data frames in R. This is one of the most basic skills when you work with data in R, so take a look at the examples and explanations below for two approaches: using full_join() and using merge().
Merge data frames in R
There are different ways to merge data frames in R. Two of the most common are the full_join() function from the dplyr package and the built-in merge() function.
To merge two data frames in R, you can use the full_join() function. However, it is not a built-in function, so you first have to install the dplyr package and load it with library(dplyr). Its basic form is full_join(x, y), where:
- x: The first data frame (or data-frame-like object) to be merged.
- y: The second data frame (or data-frame-like object) to be merged.
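To make the two parameters concrete, here is a minimal sketch with two small data frames. The names left, right, id, address, and age are hypothetical and chosen only for illustration; the example assumes dplyr is already installed.

```r
library(dplyr)

# Hypothetical example data: 'id' is the column shared by both data frames
left  <- data.frame(id = c(1, 2, 3), address = c("a", "b", "c"))
right <- data.frame(id = c(2, 3, 4), age = c(32, 27, 55))

# A full join keeps every row from both inputs;
# cells with no match in the other data frame are filled with NA
result <- full_join(left, right, by = "id")
print(result)
```

If by is left out, full_join() joins on all columns that have the same name in both data frames and prints a message telling you which ones it used.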
Suppose you have two tables stored in two CSV files, addresses.csv and ages.csv.
addresses.csv:
Full Name, Address
Ole Grant, 2555 Hammes Ways
Jayme Rippin, 754 Jerrod
Jeffery Bashirian, 4253 Paucek
Carolyne Krajcik, 8927 Predovic
Blankman, 730 Mekhi
Dr. Lydia Konopelski MD, 30005 Skiles
ages.csv:
Full Name, Age
Ole Grant, 22
Jayme Rippin, 32
Jeffery Bashirian, 27
Carolyne Krajcik, 31
Blankman, 55
Dr. Lydia Konopelski MD, 87
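If you want to try this yourself, the two files can be recreated from R. The sketch below assumes the files hold one record per line with a space after each comma, and it writes them to a temporary folder rather than your working directory. The strip.white = TRUE argument is our addition for tidiness and is not part of the code in this guide.

```r
# Write the two sample tables to a temporary folder (hypothetical location)
dir <- tempdir()
addresses_path <- file.path(dir, "addresses.csv")
ages_path      <- file.path(dir, "ages.csv")

writeLines(c(
  "Full Name, Address",
  "Ole Grant, 2555 Hammes Ways",
  "Jayme Rippin, 754 Jerrod",
  "Jeffery Bashirian, 4253 Paucek",
  "Carolyne Krajcik, 8927 Predovic",
  "Blankman, 730 Mekhi",
  "Dr. Lydia Konopelski MD, 30005 Skiles"
), addresses_path)

writeLines(c(
  "Full Name, Age",
  "Ole Grant, 22",
  "Jayme Rippin, 32",
  "Jeffery Bashirian, 27",
  "Carolyne Krajcik, 31",
  "Blankman, 55",
  "Dr. Lydia Konopelski MD, 87"
), ages_path)

# strip.white = TRUE drops the space that follows each comma in text fields
addresses <- read.csv(addresses_path, strip.white = TRUE)
ages      <- read.csv(ages_path, strip.white = TRUE)

# With the default check.names = TRUE, the header "Full Name"
# is turned into the syntactic column name "Full.Name"
names(addresses)
names(ages)
```

Keep that renaming in mind when you refer to the column in code: after read.csv() with default settings, the key column is Full.Name, not Full Name.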
Now you have to merge them to get the full information:
library(dplyr)

# Read file addresses.csv and assign to data frame 'addresses'
addresses <- read.csv(file = 'addresses.csv', sep = ',', header = TRUE)
addresses

# Read file ages.csv and assign to data frame 'ages'
ages <- read.csv(file = 'ages.csv', sep = ',', header = TRUE)
ages

# Merge two data frames
full_join(addresses, ages)
                Full Name          Address Age
1               Ole Grant 2555 Hammes Ways  22
2            Jayme Rippin       754 Jerrod  32
3       Jeffery Bashirian      4253 Paucek  27
4        Carolyne Krajcik    8927 Predovic  31
5                Blankman        730 Mekhi  55
6 Dr. Lydia Konopelski MD     30005 Skiles  87
First, we read addresses.csv into a data frame named addresses, which contains each person's full name and address.
Next, we read ages.csv into a data frame named ages, which contains each person's age.
Finally, we call the full_join() function to merge the two data frames.
As you can see, after merging, we receive a table with three columns: the full name, the corresponding address, and the corresponding age.
There is another way to merge two data frames in R, which is using the R merge() function. Its syntax and usage are described in its help page, which you can open with ?merge. We can reuse the data frames from the above solution to make an example for this one:
# Read file addresses.csv and assign to data frame 'addresses'
addresses <- read.csv(file = 'addresses.csv', sep = ',', header = TRUE)
addresses

# Read file ages.csv and assign to data frame 'ages'
ages <- read.csv(file = 'ages.csv', sep = ',', header = TRUE)
ages

# Merge two data frames
merge(addresses, ages)
                Full Name          Address Age
1                Blankman        730 Mekhi  55
2        Carolyne Krajcik    8927 Predovic  31
3 Dr. Lydia Konopelski MD     30005 Skiles  87
4            Jayme Rippin       754 Jerrod  32
5       Jeffery Bashirian      4253 Paucek  27
6               Ole Grant 2555 Hammes Ways  22
Do you see any difference between the two outputs? The rows are in a different order: by default, merge() sorts the result by the key column (sort = TRUE), while full_join() keeps the rows in the order of the first data frame. However, the information attached to each full name doesn’t change, so this solution also works correctly.
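The row ordering of merge() can be checked with a small base R sketch. It reuses three names from the tables above; the data frame names left and right and the column names are hypothetical. Note that the R documentation describes the row order under sort = FALSE as unspecified, so the sketch restores the original order explicitly with match() instead of relying on it.

```r
# Three rows from the example tables, deliberately not in alphabetical order
left <- data.frame(
  name    = c("Ole Grant", "Jayme Rippin", "Blankman"),
  address = c("2555 Hammes Ways", "754 Jerrod", "730 Mekhi"),
  stringsAsFactors = FALSE
)
right <- data.frame(
  name = c("Ole Grant", "Jayme Rippin", "Blankman"),
  age  = c(22, 32, 55),
  stringsAsFactors = FALSE
)

# Default sort = TRUE: rows come back ordered by the key column 'name'
sorted <- merge(left, right)
sorted$name

# Put the rows back in the order of 'left' by looking up each key
restored <- sorted[match(left$name, sorted$name), ]
restored$name
```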
You can use whichever function you prefer. However, merge() is built into R, so this solution doesn’t require you to install or load any package.
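One thing to watch when swapping one function for the other: merge() keeps only the rows whose key appears in both data frames unless you pass all = TRUE, while full_join() always keeps every row. In the example above this makes no difference because both files list the same six names. The sketch below uses hypothetical data frames (left, right, with a shared id column) where the keys only partly overlap.

```r
# Hypothetical data: ids 2 and 3 appear in both data frames, 1 and 4 in only one
left  <- data.frame(id = c(1, 2, 3), address = c("a", "b", "c"),
                    stringsAsFactors = FALSE)
right <- data.frame(id = c(2, 3, 4), age = c(32, 27, 55))

# Default behaviour: only the matching ids (2 and 3) are kept
inner <- merge(left, right, by = "id")

# all = TRUE keeps every id from both sides and fills the gaps with NA,
# which is the same set of rows a full join returns
outer <- merge(left, right, by = "id", all = TRUE)

print(inner)
print(outer)
```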
We have learned how to merge data frames in R by using two different methods. Keep in mind that the second approach has one advantage over the first: it needs no extra package. You can learn more about R with us.
Name of the university: HCMUT