How To Merge By Two Columns In R

r merge by two columns

Hi guys! Today we will show you how to merge by two columns in R. When working with this language, you often have a need to do this, so you should pay attention to it.

How to merge by two columns in R

Have you ever wanted to perform an R merge by two columns in common of two data frames? If that is what you want, then this tutorial can help you to achieve the goal. There are some functions to merge by two collumns in R. Most programmers will choose between those two methods: using inner_join() and using merge().

Using inner_join() 

To merge by two columns in R, you can use the inner_join() function. But before using it, you will need to import tidyverse (make sure to install this package first) to use:

Syntax:

inner_join(x, y, by = c(col1,col2))

Parameters:

  • x: The first data frame.
  • y: The second data frame.
  • col1: The first column.
  • col2: The second column.

Suppose you have two data frames stored in df1, df2.

# Create a data frame 1
df1 = data.frame(Name=c("John","Kay","James"),
                 Gender=c("male","male","female"),
                 Salary=c(123,234,345),
                 Company=c("ABC","XYZ","OOO"))
 
# Display df1
df1
 
# Create a data frame 2
df2 = data.frame(Name=c("John","Kile","James"),
                 Gender=c("male","male","female"),
                 Age=c(23,34,45),
                 Company=c("ABC","XYZ","LEARNSHAREIT"))
 
# Display df2
df2 

Now you have to merge them by the first two columns using inner_join:

library (tidyverse) 
 
# Merge by preceding two columns 
inner_join(df1,df2,by=c("Name","Gender"))

Output

> df1
   Name Gender Salary Company
1  John   male    123     ABC
2   Kay   male    234     XYZ
3 James female    345     OOO
> 
> df2
   Name Gender Age      Company
1  John   male  23          ABC
2  Kile   male  34          XYZ
3 James female  45 LEARNSHAREIT
> 
   Name Gender Salary Company.x Age    Company.y
1  John   male    123       ABC  23          ABC
2 James female    345       OOO  45 LEARNSHAREIT

As can be seen, after merging two data frames by two columns which are “Name” and “Gender”, we receive a table as a result above. If you want to merge by all columns, that means a full join without any “by” argument, then you can look at the following example:

library (tidyverse) 
 
# Merge by all columns
inner_join(df1,df2)

Output

Name Gender Salary Company Age
1 John   male    123     ABC  23

The example above shows that we have removed the by argument in the inner_join() function, it will automatically merge the data frames by keeping the data that has the same value in the same columns of each table. 

Using merge()

There is another way of r merge by two columns. That is using R merge() function. You can get its syntax here. We will use the same data frames of the previous solution to make example for this:

# Merge two data frames by preceding two columns 
df = merge(df1,df2,by=c("Name","Gender"))

Output

> df
   Name Gender Salary Company.x Age    Company.y
1 James female    345       OOO  45 LEARNSHAREIT
2  John   male    123       ABC  23          ABC

Can you see the difference between this merge() function in R and the previous solution. That is this R merge function doesn’t require a package. However, it still produces the same results as the first one. In addition, if you want to merge by more columns, just input it in the by argument: 

# Merge two data frames by three columns 
df = merge(df1,df2,by=c("Name","Gender","Company"))

Output

> df
  Name Gender Company Salary Age
1 John   male     ABC    123  23

Summary

We have learned how to merge by two columns in R using two different approaches. It would help if you remember that the second method is the most effective for two data frame’s cases. You can find more knowledge of R with us here.

Maybe you are interested:

Posted in R

Leave a Reply

Your email address will not be published. Required fields are marked *