Melting (and its counterpart – casting) is one of the most interesting capabilities in R. It allows you to reshape your data in various ways. In particular, this guide will show how the melt() function in R works.
melt() Function In R
As data engineers and scientists, you are likely to be familiar with the wide format of data structures. In this format, data is represented across many columns, with each of them corresponding to a specific variable.
Many R functions require you to stretch this data, making each participant occupy not one but multiple rows. This is called the long format, and melt() can help you transform your wide data into it.
This function belongs to the reshape package and its reboot, reshape2. Both were created by Hadley Wickham, the author of the ggplot2 package.
You will need to install and load reshape2 before using melt():
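A minimal sketch of that setup. The requireNamespace() guard is an addition made here so that the package is only downloaded when it is missing:

```r
# Install reshape2 from CRAN only if it is not already available
if (!requireNamespace("reshape2", quietly = TRUE)) {
  install.packages("reshape2")
}

# Attach the package so melt() can be called directly
library(reshape2)
```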
melt() is a generic function with melt.data.frame(), melt.array(), and melt.list() as its methods, so it can reshape arrays and lists as well as data frames.
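To illustrate the dispatch, the sketch below calls melt() on a small matrix and a small list. Both inputs are made up for this example, and the column names mentioned in the comments are reshape2's defaults:

```r
library(reshape2)

# A 2 x 2 matrix is handled by the array method
m <- matrix(1:4, nrow = 2)
long_m <- melt(m)   # columns Var1 and Var2 (row and column index) plus value

# A named list is handled by the list method
l <- list(a = 1:2, b = 3)
long_l <- melt(l)   # a value column plus L1, which holds the element name

long_m
long_l
```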
Let’s get started by creating a sample data frame from the built-in mtcars dataset. In this example, we take the first six rows with head() and keep only the first three variables, dropping the rest of the columns.
# Keep the first six cars and only the mpg, cyl, and disp columns
df <- subset(head(mtcars), select = c(mpg, cyl, disp))
df
                   mpg cyl disp
Mazda RX4         21.0   6  160
Mazda RX4 Wag     21.0   6  160
Datsun 710        22.8   4  108
Hornet 4 Drive    21.4   6  258
Hornet Sportabout 18.7   8  360
Valiant           18.1   6  225
In this data frame, each car model is represented by one row with three columns, which contain the miles per gallon, the number of cylinders, and the displacement.
You can “melt” this data frame and make it narrower with the melt() function. By default, it doesn’t need any arguments other than your data:
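The call that produces the output below can be sketched as follows, assuming reshape2 is loaded. The name long is only illustrative, and the wording of the informational message may differ between package versions:

```r
library(reshape2)

# Same sample data as above: first six cars, first three variables
df <- subset(head(mtcars), select = c(mpg, cyl, disp))

# No id variables given: every column is treated as a measured variable
long <- melt(df)
long
```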
Using as id variables
   variable value
1       mpg  21.0
2       mpg  21.0
3       mpg  22.8
4       mpg  21.4
5       mpg  18.7
6       mpg  18.1
7       cyl   6.0
8       cyl   6.0
9       cyl   4.0
10      cyl   6.0
11      cyl   8.0
12      cyl   6.0
13     disp 160.0
14     disp 160.0
15     disp 108.0
16     disp 258.0
17     disp 360.0
18     disp 225.0
As you can see, the output is a data frame with only two columns, variable and value, in which the specifications of the car models are stacked on top of each other. Because no id variables were given, melt() treats every column as a measured variable, and the row labels (the car model names) are not carried over.
Each row is now a single value. Your data has become longer: a car is no longer represented by one row but by three, one for each variable.
If you want to keep certain columns, you can specify them with the id.vars argument (it can be abbreviated to id, which R resolves by partial argument matching). For instance, this command retains the mpg column while reshaping the other two:
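A sketch of that call, assuming reshape2 is loaded and df is built as before. The name long_mpg is only illustrative:

```r
library(reshape2)

df <- subset(head(mtcars), select = c(mpg, cyl, disp))

# Keep mpg as an identifier; cyl and disp are stacked into variable/value
long_mpg <- melt(df, id.vars = "mpg")
long_mpg
```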
    mpg variable value
1  21.0      cyl     6
2  21.0      cyl     6
3  22.8      cyl     4
4  21.4      cyl     6
5  18.7      cyl     8
6  18.1      cyl     6
7  21.0     disp   160
8  21.0     disp   160
9  22.8     disp   108
10 21.4     disp   258
11 18.7     disp   360
12 18.1     disp   225
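After melting your data, you can convert it back to a wide shape by casting it. In reshape2 this is done with dcast(); the original reshape package calls the function cast(). It accepts the elongated data frame you have created with melt(), a formula describing the layout, and an aggregation function. Note that the result is not always identical to the original: mpg is the only id variable here, so the two cars that share an mpg of 21.0 are averaged into a single row, and the car names are not recovered:
# Melt with mpg as the identifier, then cast back to one column per variable
df2 <- melt(df, id = c("mpg"))
# Rows that share the same mpg are combined with mean()
dcast(df2, mpg ~ variable, mean)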
   mpg cyl disp
1 18.1   6  225
2 18.7   8  360
3 21.0   6  160
4 21.4   6  258
5 22.8   4  108
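If you need the round trip to bring back every row, one option is to move the car names into a regular column and use that column as the id variable. This is a sketch under that assumption: the column name model is made up for the example, and dcast() is reshape2's casting function for data frames.

```r
library(reshape2)

df <- subset(head(mtcars), select = c(mpg, cyl, disp))

# Store the row labels in a regular column so they survive melting
df$model <- rownames(df)

# Melt with the car name as the identifier
long <- melt(df, id.vars = "model")

# Cast back: one row per model, one column per original variable
wide <- dcast(long, model ~ variable, value.var = "value")
wide
```

Because every model name is unique, no aggregation function is needed and all six cars come back, although the row order may differ from the original.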
The melt() function in R reshapes wide data into a long format with fewer columns, which many R functions require as input.
My name is Robert. I have a degree in information technology and two years of experience in software development. I’m here to share my understanding of programming languages. I hope you find my articles interesting.