Hi guys! Today we will share a guide on how to merge data frames in R with the merge() function. Merging is a basic skill when you work with data in R, so take a look at the syntax and usage below.
What is merge in R
Have you ever wanted to combine two data frames that have some columns or row names in common? That is when you should consider the R merge() function. It can perform the different join operations known from databases (inner, left, right, and full outer joins). Base R itself has no join() function, but merge() plays a role similar to the join functions found in packages, such as plyr's join() or dplyr's *_join() family.
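Merging on row names rather than on a column can be sketched as follows. This is a minimal illustration with made-up data: the data frames heights and weights and their values are assumptions, not part of the original guide. It relies on the fact that passing by = "row.names" (or by = 0) tells merge() to match on row names.

```r
# Hypothetical data: two data frames that share row names but no key column.
heights <- data.frame(Height = c(170, 165), row.names = c("ann", "bob"))
weights <- data.frame(Weight = c(60, 72), row.names = c("bob", "ann"))

# by = "row.names" matches rows on their row names; the result gains
# an extra "Row.names" column holding the matched names.
combined <- merge(heights, weights, by = "row.names")
print(combined)
```

Note that the matched names end up in an ordinary column called Row.names, so the result no longer uses them as row names.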
Syntax of merge() in R
merge(x, y, by = intersect(names(x), names(y)), by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all, sort = TRUE, suffixes = c(".x", ".y"), no.dups = TRUE, incomparables = NULL, ...)
Parameters:
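x, y | Data frames (or objects that can be coerced to data frames) to be merged |
by, by.x, by.y | Columns used for matching; by.x and by.y allow the key columns to have different names in x and y |
all | Logical; all = TRUE is shorthand for all.x = TRUE and all.y = TRUE (full outer join). Default FALSE |
all.x | Logical; if TRUE, rows of x with no match in y are kept, with NA in the columns that come from y (left join) |
all.y | Logical; analogous to all.x, keeping the unmatched rows of y (right join) |
sort | Logical; whether the result is sorted on the by columns. Default TRUE |
suffixes | A character vector of length 2 with the suffixes appended to non-key column names that occur in both x and y |
no.dups | Logical; whether suffixes are appended in more cases to avoid duplicated column names in the result |
incomparables | Values that cannot be matched |
... | Arguments passed to or from methods |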
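To make by.x, by.y, and suffixes more concrete, here is a small sketch. The data frames left and right, the column names Writer and Score, and the suffixes ".left" and ".right" are all made up for illustration.

```r
# Hypothetical data: the key column has a different name in each table,
# and both tables carry a non-key column called "Score".
left <- data.frame(Author = c("A", "B", "C"), Score = c(10, 20, 30))
right <- data.frame(Writer = c("B", "C", "D"), Score = c(2, 3, 4))

# Match left$Author against right$Writer, and use custom suffixes
# to tell the two clashing Score columns apart.
merged <- merge(left, right,
                by.x = "Author", by.y = "Writer",
                suffixes = c(".left", ".right"))
print(merged)
```

The key column in the result takes its name from x (Author here), and only the authors present in both tables are kept because all is left at its default of FALSE.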
merge() function in R example
You can merge datasets with the merge() function and its optional parameters. Take a look at the following examples.
Example 1
Suppose an accountant is compiling statistics on the income of the top 3 authors in a company. She has a table with the rating points of all authors and a table with the bonus earned by the 3 highest-scoring authors. We can view the points and bonus of the top 3 side by side by merging the two tables:
# Create the points table
points = data.frame(
  Author = c(
    "Karley Crooks MD",
    "Rene Jacobs",
    "Stephen Haag",
    "Prof. Juvenal Ritchie",
    "Sonia Koepp III"
  ),
  Points = c("35.87", "32", "28.87", "27.71", "27.42")
)
cat('points\n')
points

# Create the bonus table
bonus = data.frame(
  Author = c("Karley Crooks MD", "Stephen Haag", "Rene Jacobs"),
  Bonus = c(1000, 800, 500)
)
cat('bonus\n')
bonus

# Merge the two tables
statistic = merge(points, bonus)
cat('statistic\n')
statistic
Output
points
Author Points
1 Karley Crooks MD 35.87
2 Rene Jacobs 32
3 Stephen Haag 28.87
4 Prof. Juvenal Ritchie 27.71
5 Sonia Koepp III 27.42
bonus
Author Bonus
1 Karley Crooks MD 1000
2 Stephen Haag 800
3 Rene Jacobs 500
statistic
Author Points Bonus
1 Karley Crooks MD 35.87 1000
2 Rene Jacobs 32 500
3 Stephen Haag 28.87 800
First, we define a data frame named points that holds each author and their corresponding points.
Next, we define a data frame named bonus that holds the bonus money for the top 3 authors with the highest points.
Finally, we call merge() to build the statistic table. With no other arguments, merge() matches on the common column Author and keeps only the authors present in both tables (a natural, or inner, join).
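Because the default by uses every column name the two tables share, it can be safer to name the key explicitly. The sketch below assumes a hypothetical extra column, Note, that exists in both tables with unrelated contents; it is not part of the original data.

```r
# Hypothetical variant of the tables: both also contain a "Note" column.
points <- data.frame(
  Author = c("Rene Jacobs", "Stephen Haag"),
  Points = c("32", "28.87"),
  Note = c("senior", "junior")
)
bonus <- data.frame(
  Author = c("Rene Jacobs", "Stephen Haag"),
  Bonus = c(500, 800),
  Note = c("paid", "pending")
)

# Default: by = c("Author", "Note"), so no row matches on both columns.
implicit <- merge(points, bonus)

# Explicit key: match on Author only; the two Note columns are kept
# side by side as Note.x and Note.y.
explicit <- merge(points, bonus, by = "Author")

print(nrow(implicit))
print(explicit)
```

In the article's own example the tables share only Author, so merge(points, bonus) and merge(points, bonus, by = "Author") behave the same there.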
'Prof. Juvenal Ritchie' and 'Sonia Koepp III' appear only in the points table, not in bonus, so they are missing from the merged output. If the accountant wants a table of all top 5 authors and does not mind that the last 2 have no bonus, the next example shows how to do it.
Example 2
Suppose the context is the same as in the previous example, but this time we want to view all top 5 authors. We can do that by merging the two tables with all = TRUE:
# Create the points table
points = data.frame(
  Author = c(
    "Karley Crooks MD",
    "Rene Jacobs",
    "Stephen Haag",
    "Prof. Juvenal Ritchie",
    "Sonia Koepp III"
  ),
  Points = c("35.87", "32", "28.87", "27.71", "27.42")
)
cat("points\n")
points

# Create the bonus table
bonus = data.frame(
  Author = c("Karley Crooks MD", "Stephen Haag", "Rene Jacobs"),
  Bonus = c(1000, 800, 500)
)
cat("bonus\n")
bonus

# Merge the two tables, keeping unmatched rows from both
statistic = merge(points, bonus, all = TRUE)
cat("statistic\n")
statistic
Output
points
Author Points
1 Karley Crooks MD 35.87
2 Rene Jacobs 32
3 Stephen Haag 28.87
4 Prof. Juvenal Ritchie 27.71
5 Sonia Koepp III 27.42
bonus
Author Bonus
1 Karley Crooks MD 1000
2 Stephen Haag 800
3 Rene Jacobs 500
statistic
Author Points Bonus
1 Karley Crooks MD 35.87 1000
2 Prof. Juvenal Ritchie 27.71 NA
3 Rene Jacobs 32 500
4 Sonia Koepp III 27.42 NA
5 Stephen Haag 28.87 800
Did you notice the difference from the previous output? It changed because the merge() call now has a third argument, all = TRUE, which performs a full (outer) join between the two datasets.
With this approach, the merged result also contains the rows that have no match in the other data frame, with NA filling the missing values. Just remember to pass the argument all = TRUE when calling the R merge function.
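If you only want to keep the unmatched rows from one side, all.x and all.y give a left or right join instead of a full join. The following is a minimal sketch with shortened tables; the entry "New Author" and its bonus are made up for illustration.

```r
# Shortened, partly hypothetical tables: one author appears in both,
# one only in points, and one ("New Author") only in bonus.
points <- data.frame(
  Author = c("Karley Crooks MD", "Sonia Koepp III"),
  Points = c("35.87", "27.42")
)
bonus <- data.frame(
  Author = c("Karley Crooks MD", "New Author"),
  Bonus = c(1000, 300)
)

# Left join: keep every row of points; Bonus is NA where there is no match.
left_join <- merge(points, bonus, all.x = TRUE)

# Right join: keep every row of bonus; Points is NA where there is no match.
right_join <- merge(points, bonus, all.y = TRUE)

print(left_join)
print(right_join)
```

In the accountant's scenario, all.x = TRUE would be enough, since every author in bonus already appears in points.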
Summary
We have learned how to merge data frames in R with the merge() function, so you can now do this task easily. If you have any questions, feel free to leave a comment below. We also have many more tutorials about R for you to explore.
I’m Edward Anderson. I currently work as a programmer. My major is information technology, and I have 5 years of programming experience. Python, C, C++, Javascript, Java, HTML, CSS, and R are my strong suits. Let me know if you have any questions about these programming languages.