The pipe operator in R lets you pass the result of one function straight into the next, so that several operations can be chained in sequence. If you are new to it, read on: we will show you what it is, where it is useful, and how to apply it in your R programs.
Pipe operator in R
What does the pipe operator do in R?
The %>% pipe operator in R comes from the ‘magrittr’ package and is also made available by ‘dplyr’. It takes the result on its left and passes it as the first argument to the function on its right, so you can apply several functions to a data frame in sequence. Using the pipe operator, your code will look simpler and easier to read. Let’s take a look at this example:
We will use the mtcars data frame, which is built into R, and store it in a variable named data:
data <- mtcars
Now, if we want to summarize this table, we would usually do the following:
summary(data)
      mpg             cyl             disp             hp
 Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0
 1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5
 Median :19.20   Median :6.000   Median :196.3   Median :123.0
 Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7
 3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0
 Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0
      drat             wt             qsec             vs
 Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000
 1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000
 Median :3.695   Median :3.325   Median :17.71   Median :0.0000
 Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375
 3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000
 Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000
       am              gear            carb
 Min.   :0.0000   Min.   :3.000   Min.   :1.000
 1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000
 Median :0.0000   Median :4.000   Median :2.000
 Mean   :0.4062   Mean   :3.688   Mean   :2.812
 3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000
 Max.   :1.0000   Max.   :5.000   Max.   :8.000
However, with the pipe operator, it would be:
data %>% summary()
You might think, “But this is not simpler to read!” That’s true when we have only one function. But what if we want to apply several functions to a data frame, one after another?
In this new example, we will first select only the hp and mpg columns of the data set, then keep only the cars whose hp is greater than 100, and finally summarize the resulting data set. Here is the code written with nested function calls (select() and filter() come from ‘dplyr’, so the package must be loaded first):
summary(filter(select(mtcars, mpg, hp), hp>100))
      mpg              hp
 Min.   :10.40   Min.   :105.0
 1st Qu.:15.10   1st Qu.:118.0
 Median :17.30   Median :175.0
 Mean   :17.45   Mean   :174.2
 3rd Qu.:19.45   3rd Qu.:210.0
 Max.   :30.40   Max.   :335.0
However, if we use the pipe operator:
mtcars %>%
  select(mpg, hp) %>%
  filter(hp > 100) %>%
  summary()
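As we can see, the code is much easier to read and understand: each step appears in the order in which it is carried out, instead of being nested from the inside out.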
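To confirm that the two styles really are equivalent, the short check below builds the same result both ways and compares them. It is only a minimal sketch that assumes the ‘dplyr’ package is installed; the variable names nested and piped are made up for illustration.

```r
library(dplyr)

# Nested form: has to be read from the inside out
nested <- summary(filter(select(mtcars, mpg, hp), hp > 100))

# Piped form: reads from top to bottom, one step per line
piped <- mtcars %>%
  select(mpg, hp) %>%
  filter(hp > 100) %>%
  summary()

# Compare the two summary tables
same_result <- identical(nested, piped)
print(same_result)
```

Only the way the code is written changes; the functions that run and the data they receive are the same in both versions.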
How to use the pipe operator in R?
Install the library
The %>% pipe operator is provided by the ‘dplyr’ and ‘magrittr’ packages, so we have to install and load one of them before using it. I will use the ‘dplyr’ package as an example in this tutorial.
To install the ‘dplyr’ package, simply type in:
install.packages("dplyr")
Next, load the package by typing in:
library("dplyr")
Now everything is set and ready to go.
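In a script that may be run more than once, it can be convenient to install the package only when it is missing. The following is one possible sketch of that pattern, not the only way to do it. It relies on the base R functions requireNamespace() and exists(), and the variable name pipe_available is just illustrative.

```r
# Install dplyr only if it is not already available
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr")
}

# Load the package so that %>% can be used
library(dplyr)

# Check that the pipe operator is now defined in the session
pipe_available <- exists("%>%")
print(pipe_available)
```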
Use the pipe operator
The pipe operator is written as %>%. Put the name of the data frame first, followed by the functions you want to apply, in the order in which they should run.
So for example:
library("dplyr")

# Use the built-in Orange data set
Orange %>%
  # Get the first 10 rows
  slice(1:10) %>%
  # Summarize the data
  summary()
 Tree       age          circumference
 3:0   Min.   : 118.0   Min.   : 30.00
 1:7   1st Qu.: 484.0   1st Qu.: 60.75
 5:0   Median : 664.0   Median : 99.00
 2:3   Mean   : 772.1   Mean   : 91.00
 4:0   3rd Qu.:1174.2   3rd Qu.:118.75
       Max.   :1582.0   Max.   :145.00
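The same rule applies to any value, not only to data frames: x %>% f(y) is evaluated as f(x, y), so the value on the left becomes the first argument of the function on the right. Here is a small sketch with a plain numeric vector. The vector values and the other variable names are made up for illustration, and ‘dplyr’ is assumed to be installed so that %>% is available.

```r
library(dplyr)

# A made-up vector, used only for illustration
values <- c(1, 4, 9, 16)

# Piped: take the square roots, then add them up
piped_total <- values %>% sqrt() %>% sum()

# The same computation written as nested calls
nested_total <- sum(sqrt(values))

# Extra arguments stay where they are: this call is round(3.14159, 2)
rounded <- 3.14159 %>% round(2)

print(piped_total)
print(nested_total)
print(rounded)
```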
In this tutorial, we helped you learn about the pipe operator in R and how to use it in your program. This is a very powerful tool that makes your code cleaner and more readable.
Name of the university: VinUni