The tidyr package provides the drop_na() function, which returns a new dataset containing only complete rows (rows with no missing values). In this article, we will learn about the syntax and usage of drop_na() in R.
What is drop_na() in R?
The drop_na() function drops rows that contain missing values in the specified columns.
Syntax:
drop_na(data, ...)
Parameters:
data: a data frame.
...: the columns to check for missing values, written as a tidyselect specification. If this argument is omitted, drop_na() checks all columns.
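To make the two parameters concrete, here is a minimal sketch on a toy data frame. The names df, x, and y are hypothetical and used only for illustration. When no columns are given, the result should match what base R's complete.cases() keeps.

```r
library(tidyr)

# Hypothetical toy data, for illustration only
df <- data.frame(
  x = c(1, NA, 3),
  y = c("a", "b", NA)
)

# All columns checked: only row 1 has no missing value
kept_tidyr <- drop_na(df)

# Base R equivalent for the "all columns" case
kept_base <- df[complete.cases(df), ]

# Only column x checked: rows 1 and 3 are kept
kept_x <- drop_na(df, x)
```

The base R version is shown only for comparison. drop_na() is usually more convenient when the check should be limited to some of the columns.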
How to use the drop_na() function in R
Suppose we have a data frame of several students' test scores. Some students did not take the test in some subjects, so those scores are missing (NA).
To use the drop_na() function, you must first install the tidyr package (once) and then load it:
install.packages("tidyr")  # only needed if tidyr is not installed yet
library(tidyr)
The following example shows how to use the drop_na() function in R to delete rows containing missing values.
Example:
# Create a data frame
test_scores <- data.frame(
  Name = c(
    "Carlos", "Patrick", "Evans", "Tucker", "Paul",
    "Nicholas", "Adam", "Stuart", "Murphy", "Eleanor"
  ),
  Maths = c(65, NA, 71, 88, 66, 54, NA, 49, 92, NA),
  Biological = c(NA, 44, 65, NA, 77, 65, 88, 58, 48, 72),
  Physics = c(93, 47, 55, 42, 49, 53, 71, 51, NA, 82),
  English = c(74, 66, 64, 70, 82, 44, 80, NA, 68, NA)
)

# Drop all rows containing missing values in any column
cat("Drop all rows containing missing values\n")
drop_na(test_scores)

# Drop all rows containing missing values in a specific column
cat("\nDrop all rows containing missing values in the 'Maths' column\n")
drop_na(test_scores, Maths)

# Drop all rows containing missing values in specific columns
cat("\nDrop all rows containing missing values in the 'Maths' and 'Biological' columns\n")
drop_na(test_scores, Maths, Biological)

# Check every column except 'Physics'
cat("\nExcept for the 'Physics' column\n")
drop_na(test_scores, -Physics)
Output:
Drop all rows containing missing values
      Name Maths Biological Physics English
1    Evans    71         65      55      64
2     Paul    66         77      49      82
3 Nicholas    54         65      53      44

Drop all rows containing missing values in the 'Maths' column
      Name Maths Biological Physics English
1   Carlos    65         NA      93      74
2    Evans    71         65      55      64
3   Tucker    88         NA      42      70
4     Paul    66         77      49      82
5 Nicholas    54         65      53      44
6   Stuart    49         58      51      NA
7   Murphy    92         48      NA      68

Drop all rows containing missing values in the 'Maths' and 'Biological' columns
      Name Maths Biological Physics English
1    Evans    71         65      55      64
2     Paul    66         77      49      82
3 Nicholas    54         65      53      44
4   Stuart    49         58      51      NA
5   Murphy    92         48      NA      68

Except for the 'Physics' column
      Name Maths Biological Physics English
1    Evans    71         65      55      64
2     Paul    66         77      49      82
3 Nicholas    54         65      53      44
4   Murphy    92         48      NA      68
The ... argument accepts tidyselect syntax, so you can name the columns to check, exclude a column with -, or use other selection operators. Changing this argument changes which missing values cause a row to be dropped.
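As a rough illustration of those selection operators, the sketch below uses a small hypothetical data frame (scores, with made-up values) and three common tidyselect forms: a range of adjacent columns, the starts_with() helper, and all_of() with a character vector of column names.

```r
library(tidyr)

# Small hypothetical data frame, used only for this illustration
scores <- data.frame(
  Name    = c("A", "B", "C", "D"),
  Maths   = c(65, NA, 71, 88),
  Physics = c(93, 47, NA, 42),
  English = c(74, 66, 64, NA)
)

# Range of adjacent columns: check Maths and Physics only
by_range <- drop_na(scores, Maths:Physics)

# Selection helper: check every column whose name starts with "M"
by_prefix <- drop_na(scores, starts_with("M"))

# Column names stored in a character vector
cols <- c("Physics", "English")
by_vector <- drop_na(scores, all_of(cols))
```

Here by_range keeps students A and D, by_prefix keeps A, C, and D, and by_vector keeps A and B. The all_of() form is handy when the column names are only known at run time.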
Summary
This article covered the syntax and usage of drop_na() in R. By selecting specific columns, you can control which rows are dropped for containing missing values. Thank you for reading.

Hello, my name’s Bruce Warren. You can call me Bruce. I’m interested in programming languages, so I am here to share my knowledge of programming languages with you, especially knowledge of C, C++, Java, JS, PHP.