# The complete.cases() function in R

A data engineer spends most of their time on data processing, and dealing with missing data is part of that work. There are two main ways to handle missing data: eliminating it or creating new values based on existing ones. This article will show you how to eliminate missing data by using the complete.cases() function in R.

## What is the complete.cases() function in R?

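The complete.cases() function does not remove anything by itself. It returns a logical vector that marks which cases (elements of a vector, or rows of a matrix or dataframe) contain no missing values. Using that logical vector as an index keeps only the complete cases. When it is applied to a matrix or dataframe, every row that contains at least one missing value is dropped.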

Syntax:

new_object = object[complete.cases(object)] # for a vector
new_object = object[complete.cases(object), ] # for a matrix or dataframe

Parameters:

• object: Vector, matrix, or dataframe that has missing values
• new_object: New vector, matrix, or dataframe that has no missing data
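
To make the return value concrete, here is a minimal sketch that prints the result of complete.cases() before using it as an index. The names `x` and `keep` are only illustrative and do not appear elsewhere in this article.

```r
# Illustrative vector; the names x and keep are assumptions for this sketch
x <- c(10, NA, 30)

# complete.cases() returns one logical value per element
keep <- complete.cases(x)
print(keep)
# [1]  TRUE FALSE  TRUE

# Indexing with that logical vector keeps only the complete elements
print(x[keep])
# [1] 10 30
```

The removal itself happens in the indexing step, not inside complete.cases().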

## How to use the complete.cases() function?

Now, we will show you a few examples of using the complete.cases() function with vectors, matrices, and dataframes, respectively.

### Eliminate missing data in a vector

We start with a vector, the building block of matrices and dataframes. We will declare a numeric vector with a few NA values, then apply the function to see the result.

Code:

# Create a raw vector with a few NA values
rawVect <- c(1, 4, 3, 5, NA, 6, 9, NA, 0)

# Eliminate NA values
cleanVect <- rawVect[complete.cases(rawVect)]

cat("The vector after handling missing values is:\n")
print(cleanVect)

Result:

The vector after handling missing values is:
[1] 1 4 3 5 6 9 0
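
For a plain vector, the same result can usually be obtained with `!is.na()`. The sketch below compares the two approaches on the vector from the example; the names `viaCompleteCases` and `viaIsNa` are made up for illustration.

```r
rawVect <- c(1, 4, 3, 5, NA, 6, 9, NA, 0)

# Two ways to keep only the non-missing elements of a vector
viaCompleteCases <- rawVect[complete.cases(rawVect)]
viaIsNa <- rawVect[!is.na(rawVect)]

# Both approaches return the same vector here
print(identical(viaCompleteCases, viaIsNa))
# [1] TRUE
```

The advantage of complete.cases() shows up with matrices and dataframes, where it checks a whole row at once.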

### Eliminate missing data in a matrix

When applied to a matrix, complete.cases() checks each row and marks it as complete only if none of its values is missing. Indexing the matrix with that result keeps only the complete rows. In the example below, only one row is complete, so R simplifies the result to a vector containing the values of that row.

Code:

# Create a vector with a few NA values
vect <- c(1, 2, NA, NA, 5, 6, 7, 8, NA)

# Create a 3x3 matrix from the vector
m <- matrix(vect, nrow = 3, ncol = 3)

cat("The original matrix is:\n")
print(m)

# Keep only complete rows; the comma makes the index apply to rows
newVect <- m[complete.cases(m), ]

cat("The result after eliminating missing values in the matrix is:\n")
print(newVect)

Result:

The original matrix is:
     [,1] [,2] [,3]
[1,]    1   NA    7
[2,]    2    5    8
[3,]   NA    6   NA
The result after eliminating missing values in the matrix is:
[1] 2 5 8

The first and third rows are removed because they have missing values.
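
If you want the result to stay a matrix even when a single row survives, one option is to index the rows explicitly and pass `drop = FALSE`. This is a minimal sketch using the same matrix; the name `cleanM` is only illustrative.

```r
vect <- c(1, 2, NA, NA, 5, 6, 7, 8, NA)
m <- matrix(vect, nrow = 3, ncol = 3)

# The comma selects rows; drop = FALSE stops R from simplifying
# a one-row result to a plain vector
cleanM <- m[complete.cases(m), , drop = FALSE]

print(is.matrix(cleanM))
# [1] TRUE
print(dim(cleanM))
# [1] 1 3
```

Keeping the matrix structure is helpful when later code expects to call functions such as nrow() or ncol() on the result.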

### Eliminate missing data in a dataframe

Similar to matrices, when applying the function to a dataframe, all rows that have missing values will also be removed.

Code:

# Declare vectors having missing values
v1 <- c(5, 17, NA, 12, 32, NA)
v2 <- c(12, 23, 19, NA, NA, 0)
v3 <- c(4, 32, 11, NA, 21, NA)

# Create a dataframe from vectors
df <- data.frame(v1, v2, v3)

# -> df
#   v1 v2 v3
# 1  5 12  4
# 2 17 23 32
# 3 NA 19 11
# 4 12 NA NA
# 5 32 NA 21
# 6 NA  0 NA

# Drop rows having missing values
df <- df[complete.cases(df), ]

cat("The dataframe after eliminating missing values\n")
print(df)

Result:

The dataframe after eliminating missing values
  v1 v2 v3
1  5 12  4
2 17 23 32
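
Sometimes only some columns matter. As a sketch, assuming we only care about missing values in v1 and v3, we can pass just those columns to complete.cases() and still keep every column in the result. The names `keep` and `partial` are made up for this example.

```r
v1 <- c(5, 17, NA, 12, 32, NA)
v2 <- c(12, 23, 19, NA, NA, 0)
v3 <- c(4, 32, 11, NA, 21, NA)
df <- data.frame(v1, v2, v3)

# Check only v1 and v3; missing values in v2 are ignored
keep <- complete.cases(df[, c("v1", "v3")])

# Keep all columns of the rows that passed the check
partial <- df[keep, ]
print(partial)
#   v1 v2 v3
# 1  5 12  4
# 2 17 23 32
# 5 32 NA 21
```

Row 5 is kept this time because its only missing value is in v2, which was not checked.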

## Summary

In summary, indexing with complete.cases() removes missing values in a blunt way. For a vector, it drops the missing elements. For a matrix or dataframe, it drops every row that contains a missing value, which also discards the non-missing values in those rows.
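
Because whole rows are discarded, it can be worth checking how many rows would be lost before dropping them. The following is a minimal sketch using the dataframe from the example above; the names `incomplete` and `nDropped` are only illustrative.

```r
df <- data.frame(
  v1 = c(5, 17, NA, 12, 32, NA),
  v2 = c(12, 23, 19, NA, NA, 0),
  v3 = c(4, 32, 11, NA, 21, NA)
)

# Negate the result to mark rows with at least one missing value
incomplete <- !complete.cases(df)

# TRUE counts as 1, so sum() gives the number of incomplete rows
nDropped <- sum(incomplete)
cat("Rows that would be dropped:", nDropped, "of", nrow(df), "\n")
# Rows that would be dropped: 4 of 6
```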
