How To Merge Rows In R

Today, we will learn how to merge rows in R, a basic skill when you work with data frames. Read on to learn more.

Solutions to merge rows in R

R has many built-in functions and packages that can merge rows. In this article, we show two commonly used methods: fill() and cbind().

Using fill() 

To merge rows in R, you can use the fill() function. It belongs to the tidyr package, so make sure the package is installed and loaded before you call it.
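A minimal setup sketch, assuming you install from CRAN. The mirror URL below is an assumption (any CRAN mirror should work); the snippet installs tidyr only if it is missing and then loads it:

```r
# Install tidyr only when it is not already available.
# The repos URL is an assumed CRAN mirror; replace it with your preferred one.
if (!requireNamespace("tidyr", quietly = TRUE)) {
  install.packages("tidyr", repos = "https://cloud.r-project.org")
}

# Load the package so that fill() can be called directly
library(tidyr)
```

The same pattern applies to dplyr, which is used later in this article.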

Syntax

fill(data, ..., .direction = c("down", "up", "downup", "updown"))

Parameters:

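  • data: A data frame.
  • ...: A selection of columns to fill. If empty, nothing happens.
  • .direction: Direction in which to fill missing values: "down" (the default), "up", "downup" (first down, then up), or "updown" (first up, then down).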
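To make the four directions concrete, here is a small sketch on a toy column. The data frame demo and the variable names are made up for illustration and are not part of the article's example:

```r
library(tidyr)

# Toy data: a single column with gaps at the start, middle, and end
demo <- data.frame(x = c(NA, 1, NA, 2, NA))

# "down" copies the last seen value downward; a leading NA stays NA
down <- fill(demo, x, .direction = "down")$x      # NA 1 1 2 2

# "up" copies the next value upward; a trailing NA stays NA
up <- fill(demo, x, .direction = "up")$x          # 1 1 2 2 NA

# "downup" fills down first, then fills what is left upward
downup <- fill(demo, x, .direction = "downup")$x  # 1 1 1 2 2

# "updown" fills up first, then fills what is left downward
updown <- fill(demo, x, .direction = "updown")$x  # 1 1 2 2 2
```

Only the two combined directions remove every NA here, which is why "downup" is used in the example that follows.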

Assume you have a table stored in df:

# Create a table
df <- data.frame(
    ID = c(11, 11, 22, 22, 33, 33, 44, 44),
    Salary = c(NA, 1000, NA, 2000, NA, 3000, NA, 4000),
    Name = c("Jack", "Jack", "Jane", "Jane", "John", "John", "Jay", "Jay"),
    Bonus = c(100, NA, 200, NA, 300, NA, 400, NA)
)

# Display the table
df

Output:

  ID Salary Name Bonus
1 11     NA Jack   100
2 11   1000 Jack    NA
3 22     NA Jane   200
4 22   2000 Jane    NA
5 33     NA John   300
6 33   3000 John    NA
7 44     NA  Jay   400
8 44   4000  Jay    NA

Each ID appears in two rows, and each row holds only part of the data, with NA in the other columns. Suppose you want to merge these rows and expect the following result:

     ID Salary Name  Bonus
1    11   1000 Jack    100
2    22   2000 Jane    200
3    33   3000 John    300
4    44   4000 Jay     400

Then we suggest grouping the rows with group_by() from the dplyr package first, and then applying fill():

library(tidyr)
library(dplyr)

# Merge rows in table df
df %>%
    group_by(ID) %>%                                  # one group per ID
    fill(everything(), .direction = "downup") %>%     # fill NA values within each group
    distinct()                                        # drop the now-identical rows

Output:

# A tibble: 4 x 4
# Groups:   ID [4]
     ID Salary Name  Bonus
  <dbl>  <dbl> <chr> <dbl>
1    11   1000 Jack    100
2    22   2000 Jane    200
3    33   3000 John    300
4    44   4000 Jay     400

In the example above, we first grouped the rows by ID, then filled the missing values in every column within each group, first downward and then upward (.direction = "downup"), and finally kept only the distinct rows with distinct().
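The grouping step matters because fill() otherwise carries values across ID boundaries. The sketch below uses a made-up table (demo is hypothetical, not the article's df) in which ID 22 has no salary at all:

```r
library(dplyr)
library(tidyr)

# Hypothetical data: ID 22 has no known salary
demo <- data.frame(
  ID = c(11, 11, 22, 22),
  Salary = c(1000, NA, NA, NA)
)

# Without grouping, the salary of ID 11 leaks into the rows of ID 22
ungrouped <- demo %>%
  fill(Salary, .direction = "downup")

# With grouping, filling stops at the group boundary, so ID 22 keeps NA
grouped <- demo %>%
  group_by(ID) %>%
  fill(Salary, .direction = "downup") %>%
  ungroup()

ungrouped$Salary  # 1000 1000 1000 1000
grouped$Salary    # 1000 1000   NA   NA
```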
The merged table matches the result we expected. To use this method, remember to install and load both dplyr and tidyr. If you prefer not to depend on packages, read the next solution.

Using cbind()

There is another way to merge rows in R. The cbind() function is part of base R, so you don't have to install any packages. Here is its syntax:

Syntax:

cbind(df1, df2)

Parameters:

  • df1: The first data frame (or matrix or vector)
  • df2: The second data frame (or matrix or vector)

This function combines vectors, matrices, or data frames by columns. If your table has exactly the same structure as the table in the previous example and you need a quick, one-off result, you can follow this example:

# Even rows of columns 1-2 (ID, Salary) + odd rows of columns 3-4 (Name, Bonus)
cbind(df[c(FALSE, TRUE), 1:2], df[c(TRUE, FALSE), 3:4])

Output:

  ID Salary Name Bonus
2 11   1000 Jack   100
4 22   2000 Jane   200
6 33   3000 John   300
8 44   4000  Jay   400

The logic behind this method is to take the even rows (2, 4, 6, 8) of columns 1 and 2 (ID and Salary) and bind them side by side with the odd rows (1, 3, 5, 7) of columns 3 and 4 (Name and Bonus). This only works when the rows strictly alternate: every NA in Salary must sit in an odd row and every NA in Bonus in an even row. If any row breaks that pattern, the result is wrong and R does not warn you. Use this approach only when you know the structure of your data well, need a quick result, and do not want to use external libraries.
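Because cbind() does not check the layout for you, it can help to verify the alternating pattern first. The following is a sketch in base R only; has_alternating_layout() is a hypothetical helper written for this article's table, not a built-in function:

```r
# Same table as in the article
df <- data.frame(
  ID = c(11, 11, 22, 22, 33, 33, 44, 44),
  Salary = c(NA, 1000, NA, 2000, NA, 3000, NA, 4000),
  Name = c("Jack", "Jack", "Jane", "Jane", "John", "John", "Jay", "Jay"),
  Bonus = c(100, NA, 200, NA, 300, NA, 400, NA)
)

odd <- c(TRUE, FALSE)   # recycled by R: selects rows 1, 3, 5, ...
even <- c(FALSE, TRUE)  # recycled by R: selects rows 2, 4, 6, ...

# Hypothetical helper: TRUE only if rows come in pairs with the same ID,
# Salary is NA exactly in odd rows, and Bonus is NA exactly in even rows
has_alternating_layout <- function(d) {
  nrow(d) %% 2 == 0 &&
    all(d$ID[odd] == d$ID[even]) &&
    all(is.na(d$Salary[odd])) && !anyNA(d$Salary[even]) &&
    !anyNA(d$Bonus[odd]) && all(is.na(d$Bonus[even]))
}

if (has_alternating_layout(df)) {
  merged <- cbind(df[even, 1:2], df[odd, 3:4])
} else {
  stop("Rows do not alternate as expected; cbind() would mix up values.")
}

merged
```

If the check fails, the fill() approach from the first solution is the safer choice.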

Summary

We have learned how to merge rows in R through two different approaches. We recommend the first solution because it does not depend on the order of the rows within each group. Good luck!
