The complete.cases() function in R

Data engineers spend much of their time on data processing, and handling missing data is a routine part of that work. There are two main ways to handle missing data: removing it, or creating new values based on the existing ones. This article shows how to remove missing data with the complete.cases() function in R.

What is the complete.cases() function in R?

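The complete.cases() function does not remove anything by itself. It returns a logical vector that marks which cases are complete, meaning they contain no missing values: one TRUE or FALSE per element of a vector, or per row of a matrix or dataframe. Indexing the original object with that logical vector keeps only the complete cases, so every row of a matrix or dataframe that has at least one missing value is dropped.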
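To make the return value concrete, here is a minimal sketch. The vector x and the names keep and x_clean are made up for illustration and do not come from the examples in this article.

```r
# Hypothetical example vector with one missing value
x <- c(10, NA, 30)

# complete.cases() returns one logical value per element
keep <- complete.cases(x)
print(keep)     # TRUE FALSE TRUE

# Indexing with the logical vector keeps only the complete entries
x_clean <- x[keep]
print(x_clean)  # 10 30
```

The function only reports which entries are complete; the actual removal happens in the indexing step.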

Syntax: 

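new_object = object[complete.cases(object)]      # vector
new_object = object[complete.cases(object), ]    # matrix or dataframe (note the comma)

Parameters: 

  • object: Vector, matrix, or dataframe that may contain missing values (NA)
  • new_object: The result, containing only the elements or rows of object that have no missing data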

How to use the complete.cases() function?

Now, we will show you a few examples of using the complete.cases() function with vectors, matrices, and dataframes, respectively.

Eliminate missing data in a vector

We start with a vector, the building block of matrices and dataframes. We declare a numeric vector with a few NA values, then apply the function to see the result.

Code:

# Create a raw vector with a few NA values
rawVect <- c(1, 4, 3, 5, NA, 6, 9, NA, 0)

# Eliminate NA values
cleanVect <- rawVect[complete.cases(rawVect)]

cat("The vector after handling missing values is:\n")
print(cleanVect)

Result:

The vector after handling missing values is:
[1] 1 4 3 5 6 9 0
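For a plain vector, complete.cases() marks the same elements as !is.na(). The short check below reuses rawVect from the example above; the names viaCompleteCases, viaIsNa, and same are introduced here only for illustration.

```r
# Same vector as in the example above
rawVect <- c(1, 4, 3, 5, NA, 6, 9, NA, 0)

# Two ways to drop the NA values from a vector
viaCompleteCases <- rawVect[complete.cases(rawVect)]
viaIsNa <- rawVect[!is.na(rawVect)]

# Both approaches give the same result
same <- identical(viaCompleteCases, viaIsNa)
print(same)  # TRUE
```

The difference only shows up with matrices and dataframes, where complete.cases() works row by row while is.na() works element by element.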

Eliminate missing data in a matrix

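For a matrix, complete.cases() returns one logical value per row. The example below indexes the matrix with that logical vector and no comma, m[complete.cases(m)]. R then treats the matrix as a plain vector and recycles the logical index over it, so the values in rows with missing data are dropped and the remaining values come back as a vector rather than a matrix.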

Code:

# Create a vector with a few NA values
vect <- c(1, 2, NA, NA, 5, 6, 7, 8, NA)

# Create a 3x3 matrix from the vector
m <- matrix(vect, nrow = 3, ncol = 3)

cat("The original matrix is:\n")
print(m)

# Keep only the values from rows with no missing data (returns a vector)
newVect <- m[complete.cases(m)]

cat("The result after eliminating missing values in the matrix is:\n")
print(newVect)

Result:

The original matrix is:
     [,1] [,2] [,3]
[1,]    1   NA    7
[2,]    2    5    8
[3,]   NA    6   NA
The result after eliminating missing values in the matrix is:
[1] 2 5 8

The first and third rows are removed because they have missing values.
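If the result should stay a matrix, one option is to index the rows explicitly with a comma and pass drop = FALSE, which stops R from collapsing a single remaining row into a vector. This is a sketch of that variant using the same matrix; the name cleanM is introduced here for illustration.

```r
# Same 3x3 matrix as in the example above
vect <- c(1, 2, NA, NA, 5, 6, 7, 8, NA)
m <- matrix(vect, nrow = 3, ncol = 3)

# Select complete rows; drop = FALSE keeps the matrix structure
cleanM <- m[complete.cases(m), , drop = FALSE]

print(dim(cleanM))  # 1 3
print(cleanM)
```

Without drop = FALSE, m[complete.cases(m), ] would also return a plain vector here, because only one row survives.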

Eliminate missing data in a dataframe

As with matrices, every row of a dataframe that contains a missing value is removed. Indexing with df[complete.cases(df), ] (note the comma) selects whole rows, so the result is still a dataframe.

Code:

# Declare vectors having missing values
v1 <- c(5, 17, NA, 12, 32, NA)
v2 <- c(12, 23, 19, NA, NA, 0)
v3 <- c(4, 32, 11, NA, 21, NA)

# Create a dataframe from vectors
df <- data.frame(v1, v2, v3)

# -> df
#   v1 v2 v3
# 1  5 12  4
# 2 17 23 32
# 3 NA 19 11
# 4 12 NA NA
# 5 32 NA 21
# 6 NA  0 NA

# Drop rows having missing values
df <- df[complete.cases(df), ]

cat("The dataframe after eliminating missing values\n")
print(df)

Result:

The dataframe after eliminating missing values
  v1 v2 v3
1  5 12  4
2 17 23 32
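Dropping every incomplete row can discard a lot of data, so it may help to count the affected rows first, or to check only the columns that matter. The sketch below rebuilds the same dataframe; the choice of v1 and v3 as the "important" columns and the names nIncomplete and partial are assumptions made for illustration.

```r
# Same dataframe as in the example above
v1 <- c(5, 17, NA, 12, 32, NA)
v2 <- c(12, 23, 19, NA, NA, 0)
v3 <- c(4, 32, 11, NA, 21, NA)
df <- data.frame(v1, v2, v3)

# Count the rows that have at least one missing value
nIncomplete <- sum(!complete.cases(df))
print(nIncomplete)  # 4

# Keep rows where v1 and v3 are both present, ignoring v2
partial <- df[complete.cases(df[, c("v1", "v3")]), ]
print(partial)
```

Here row 5 is kept even though v2 is missing, because only v1 and v3 are checked.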

Summary

In summary, the complete.cases() function identifies the cases that have no missing values, and indexing with its result removes missing data the hard way: in a matrix or dataframe, it drops not only the missing values but also the entire rows that contain them.
