dcast Function In R: Reshape data.table

This tutorial explains how to use the dcast() function from the data.table package in the R programming language. The syntax and code examples below show how it works.

What is the dcast function in R?

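The dcast() function from the data.table package reshapes a data.table from long format to wide format. A formula determines which columns identify the rows of the result and which column's values become new columns. When several observations fall into the same cell, an aggregation function such as mean or sum summarizes them. The syntax below shows how to call the function.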

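The syntax of the dcast function in R (main arguments only):

dcast(data,
      formula,
      fun.aggregate = NULL,
      value.var = guess(data))

Parameters:

  • data: A data.table.
  • formula: A formula of the form LHS ~ RHS. The columns on the left-hand side identify the rows of the result, and the values of the columns on the right-hand side become the new columns. A lone . on the right-hand side means "no variable".
  • fun.aggregate: The function used to aggregate the values that fall into each cell. If the formula does not identify a single observation for each cell and no function is given, aggregation falls back to length with a message. Several functions can be passed as a list.
  • value.var: The name of the column whose values will be cast. If no column is specified, the function guess() attempts to guess it automatically. It is possible to cast several value.var columns simultaneously.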
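To make the role of the formula concrete, here is a minimal sketch of a long-to-wide reshape. The table long and its columns id, key, and value are hypothetical names chosen only for this illustration. Each (id, key) pair occurs once, so no aggregation function is needed.

```r
library(data.table)

# Hypothetical long-format table: one row per (id, key) pair
long <- data.table(
    id    = c(1, 1, 2, 2),
    key   = c("x", "y", "x", "y"),
    value = c(10, 20, 30, 40)
)

# Rows are identified by `id`; each distinct `key` becomes a column
# that holds the matching entries of `value`
wide <- dcast(long, id ~ key, value.var = "value")
print(wide)
```

The result has one row per id and the columns id, x, and y.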

How to use dcast in R?

First, install and load the data.table package, which provides the dcast() function used in this tutorial. You can run the code below.

# Install the package
install.packages("data.table")

# Load the package
library("data.table")          

After installing the data.table package, we will create a data table as follows:

# Load the package
library("data.table")

# Create a data table
dt <- setDT(data.frame(
    ID = c(1, 2, 1, 3, 1, 3, 1, 3, 1, 3, 2, 1, 3, 1, 2),
    Classes = c(
        "a", "b", "a", "a", "b", "b", "b",
        "c", "c", "a", "a", "b", "a", "a", "b"
    ),
    Test1 = c(8, 9, 5, 6, 5, 9, 6, 3, 2, 8, 10, 6, 3, 5, 4),
    Test2 = c(5, 6, 3, 6, 9, 8, 5, 4, 6, 3, 9, 7, 5, 1, 0)
))

# View a data table
head(dt)

Output

   ID Classes Test1 Test2
1:  1       a     8     5
2:  2       b     9     6
3:  1       a     5     3
4:  3       a     6     6
5:  1       b     5     9
6:  3       b     9     8

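Using the dcast function to compute group means

Here, we will use the dcast() function to compute the mean of Test1 for each combination of ID and Classes in the data table above. Because the right-hand side of the formula is a lone ., the result has a single value column named ".". Follow the code example below: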

# Load the package
library("data.table")

# Create a data table
dt <- setDT(data.frame(
    ID = c(1, 2, 1, 3, 1, 3, 1, 3, 1, 3, 2, 1, 3, 1, 2),
    Classes = c(
        "a", "b", "a", "a", "b", "b", "b",
        "c", "c", "a", "a", "b", "a", "a", "b"
    ),
    Test1 = c(8, 9, 5, 6, 5, 9, 6, 3, 2, 8, 10, 6, 3, 5, 4),
    Test2 = c(5, 6, 3, 6, 9, 8, 5, 4, 6, 3, 9, 7, 5, 1, 0)
))

dt1 <- dcast(
    dt, ID + Classes ~ .,
    fun.aggregate = mean,
    value.var = "Test1"
)

# View a data table
head(dt1)

Output

   ID Classes         .
1:  1       a  6.000000
2:  1       b  5.666667
3:  1       c  2.000000
4:  2       a 10.000000
5:  2       b  6.500000
6:  3       a  5.666667
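The formula above keeps both ID and Classes on the left-hand side, so the result still has one row per group. As a sketch of the wide layout that dcast() is typically used for, the variant below moves Classes to the right-hand side so that each class becomes its own column. The name wide is only illustrative.

```r
library(data.table)

# Same data as in the tutorial (only the columns used here)
dt <- data.table(
    ID = c(1, 2, 1, 3, 1, 3, 1, 3, 1, 3, 2, 1, 3, 1, 2),
    Classes = c(
        "a", "b", "a", "a", "b", "b", "b",
        "c", "c", "a", "a", "b", "a", "a", "b"
    ),
    Test1 = c(8, 9, 5, 6, 5, 9, 6, 3, 2, 8, 10, 6, 3, 5, 4)
)

# One row per ID, one column per class, each cell holding the mean of Test1
wide <- dcast(dt, ID ~ Classes, fun.aggregate = mean, value.var = "Test1")
print(wide)
```

ID 2 has no rows in class c, so that cell does not hold a real mean. dcast() fills it instead of dropping the column, and the fill argument can be used to choose the value that appears there.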

Using the dcast function with multiple functions

In this example, we pass a list of functions to fun.aggregate so that dcast() computes both the mean and the sum of Test1 for each group. Check out the code below:

# Load the package
library("data.table")

# Create a data table
dt <- setDT(data.frame(
    ID = c(1, 2, 1, 3, 1, 3, 1, 3, 1, 3, 2, 1, 3, 1, 2),
    Classes = c(
        "a", "b", "a", "a", "b", "b", "b",
        "c", "c", "a", "a", "b", "a", "a", "b"
    ),
    Test1 = c(8, 9, 5, 6, 5, 9, 6, 3, 2, 8, 10, 6, 3, 5, 4),
    Test2 = c(5, 6, 3, 6, 9, 8, 5, 4, 6, 3, 9, 7, 5, 1, 0)
))

dt1 <- dcast(
    dt, ID + Classes ~ .,
    fun.aggregate = list(mean, sum),
    value.var = "Test1"
)

# View a data table
head(dt1)

Output

   ID Classes Test1_mean Test1_sum
1:  1       a   6.000000        18
2:  1       b   5.666667        17
3:  1       c   2.000000         2
4:  2       a  10.000000        10
5:  2       b   6.500000        13
6:  3       a   5.666667        17
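As a cross-check on the numbers above, the same per-group mean and sum can be computed with data.table's grouped [ syntax. This is a sketch of an alternative, not a replacement for dcast(). The name agg is only illustrative, and the column names are set by hand here.

```r
library(data.table)

# Same data as in the tutorial (only the columns used here)
dt <- data.table(
    ID = c(1, 2, 1, 3, 1, 3, 1, 3, 1, 3, 2, 1, 3, 1, 2),
    Classes = c(
        "a", "b", "a", "a", "b", "b", "b",
        "c", "c", "a", "a", "b", "a", "a", "b"
    ),
    Test1 = c(8, 9, 5, 6, 5, 9, 6, 3, 2, 8, 10, 6, 3, 5, 4)
)

# Group by ID and Classes; keyby also sorts the result by those columns
agg <- dt[
    ,
    .(Test1_mean = mean(Test1), Test1_sum = sum(Test1)),
    keyby = .(ID, Classes)
]
print(agg)
```

Note that head() prints only the first six rows. This data has eight ID and Classes combinations, so print() shows two more rows (ID 3 with classes b and c) than the output above.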

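Using dcast with multiple value variables

Here, value.var names two numeric columns, Test1 and Test2, and fun.aggregate is a list of two functions, so dcast() returns one column for each combination of variable and function. See the code example below.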

# Load the package
library("data.table")

# Create a data table
dt <- setDT(data.frame(
    ID = c(1, 2, 1, 3, 1, 3, 1, 3, 1, 3, 2, 1, 3, 1, 2),
    Classes = c(
        "a", "b", "a", "a", "b", "b", "b",
        "c", "c", "a", "a", "b", "a", "a", "b"
    ),
    Test1 = c(8, 9, 5, 6, 5, 9, 6, 3, 2, 8, 10, 6, 3, 5, 4),
    Test2 = c(5, 6, 3, 6, 9, 8, 5, 4, 6, 3, 9, 7, 5, 1, 0)
))

dt1 <- dcast(
    dt, ID + Classes ~ .,
    fun.aggregate = list(mean, var),
    value.var = c("Test1", "Test2")
)

# View a data table
head(dt1)

Output

   ID Classes Test1_mean Test2_mean  Test1_var Test2_var
1:  1       a   6.000000   3.000000  3.0000000  4.000000
2:  1       b   5.666667   7.000000  0.3333333  4.000000
3:  1       c   2.000000   6.000000         NA        NA
4:  2       a  10.000000   9.000000         NA        NA
5:  2       b   6.500000   3.000000 12.5000000 18.000000
6:  3       a   5.666667   4.666667  6.3333333  2.333333
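The NA entries in the variance columns come from groups that contain a single observation, for which var() is not defined. One way to confirm this, sketched below, is to count the rows in each cell by using length as the aggregation function. The name counts is only illustrative.

```r
library(data.table)

# Same data as in the tutorial (only the columns used here)
dt <- data.table(
    ID = c(1, 2, 1, 3, 1, 3, 1, 3, 1, 3, 2, 1, 3, 1, 2),
    Classes = c(
        "a", "b", "a", "a", "b", "b", "b",
        "c", "c", "a", "a", "b", "a", "a", "b"
    ),
    Test1 = c(8, 9, 5, 6, 5, 9, 6, 3, 2, 8, 10, 6, 3, 5, 4)
)

# Number of observations for each ID (rows) and class (columns)
counts <- dcast(dt, ID ~ Classes, fun.aggregate = length, value.var = "Test1")
print(counts)

# The variance of a single value is NA
print(var(2))
```

The cells with a count of 1, such as ID 1 in class c and ID 2 in class a, are the groups whose variance is NA in the output above.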

Summary

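This tutorial demonstrated how to use the dcast() function in the R programming language to reshape a data table, compute group means, apply several aggregation functions at once, and cast several value columns together. Please leave a comment if you have any queries. Have a great day, and see you again!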
