How To Use The colMeans() Function In R

During data analysis, you will often need to calculate the mean of one or more variables in a data frame. R provides the colMeans() function for exactly this purpose. This article explains what colMeans() does and how to use it.

What is colMeans in R?

The colMeans() function in R calculates the average value of columns in a data frame or matrix.

The colMeans() function returns a numeric vector containing the average value of each column.

Syntax:

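colMeans(x, na.rm = FALSE, dims = 1)

Parameters:

  • x: a numeric data frame, or a matrix or array with two or more dimensions. To average only some columns of a data frame, pass a subset such as dataframe[c(…)]. A plain vector is not accepted.
  • na.rm: whether to discard NA values before averaging. The default is FALSE.
  • dims: for arrays with more than two dimensions, the number of leading dimensions to average over. The default is 1, which suits matrices and data frames.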
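
The effect of na.rm is easiest to see on a small input. The following is a minimal sketch; the matrix m and its values are made up for illustration and do not come from the examples in this article.

```r
# Hypothetical 3 x 2 matrix with one missing value in the second column
m <- matrix(c(1, 2, 3, 4, NA, 6), nrow = 3)

# Default (na.rm = FALSE): a column that contains NA gets a mean of NA
default_means <- colMeans(m)

# na.rm = TRUE: NA values are dropped before the mean is computed
clean_means <- colMeans(m, na.rm = TRUE)

print(default_means)
print(clean_means)
```

With the default setting, the second column's mean is NA. With na.rm = TRUE it is 5, the average of the two remaining values, 4 and 6.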

How to use the colMeans() function in R?

Here we will give some examples to show how to use the colMeans() function in practice.

Mean of all columns

If we pass the whole matrix or data frame as the first argument, without selecting specific columns, colMeans() returns the means of all columns in the input object.

In the following example, we have a numeric matrix with 4 rows and 4 columns containing the integers from 1 to 16, filled column by column.

We will use the colMeans() function to calculate the mean of all the columns of this matrix.

Example:

# Create a numeric matrix
num_matrix <- matrix(1:16, nrow = 4)

# Calculate the means of all columns
cat("Means of all columns\n")
means <- colMeans(num_matrix)
means

Output:

Means of all columns
[1]  2.5  6.5 10.5 14.5
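
To make clear what colMeans() computes, the sketch below compares it with applying mean() to each column through apply(). The R documentation describes colMeans() as equivalent to this apply() call but much faster. The variable names fast_means and apply_means are chosen here for illustration.

```r
# Same matrix as in the example above
num_matrix <- matrix(1:16, nrow = 4)

# Column means computed directly
fast_means <- colMeans(num_matrix)

# Column means computed by applying mean() over margin 2 (columns)
apply_means <- apply(num_matrix, 2, mean)

# Check that both approaches agree
print(all.equal(fast_means, apply_means))
```

Both calls return the same four values. colMeans() is usually the better choice because it is shorter to write and faster on large inputs.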

Mean of specific columns

If we call colMeans() on a data frame that contains non-numeric columns, R raises an error ('x' must be numeric). To avoid this, we must select only the numeric columns.

We have a data frame that holds several students’ names and test scores. One Biology score is missing (NA).

In the following example, we will use the colMeans() function to calculate the mean score of each subject across all students.

Example:

# Create a data frame
scores <- data.frame(
    Name = c(
        "Ali",
        "Beatriz",
        "Charles",
        "Diya",
        "Eric",
        "Fatima",
        "Gabriel",
        "Hanna"
    ),
    Math = c(54, 72, 68, 44, 26, 92, 88, 56),
    Biology = c(42, 70, NA, 34, 60, 84, 94, 42),
    English = c(44, 74, 82, 56, 62, 84, 68, 76),
    Physics = c(24, 36, 44, 38, 52, 28, 98, 46)
)

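# Calculate the mean score of each subject across all students
# (the result is not named "mean", to avoid masking the base mean() function)
subject_means <- colMeans(scores[c("Math", "Biology", "English", "Physics")], na.rm = TRUE)

# Selecting the columns by position returns the same result
# subject_means <- colMeans(scores[c(2, 3, 4, 5)], na.rm = TRUE)
subject_means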

Output:

    Math  Biology  English  Physics 
62.50000 60.85714 68.25000 45.75000

The na.rm parameter is set to TRUE to ignore NA values. Without it, the Biology mean would be NA because one Biology score is missing.
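
Listing column names by hand becomes tedious for wide data frames. One possible alternative, sketched here under the assumption that every numeric column should be averaged, is to detect the numeric columns with sapply() and is.numeric(). The students data frame and its values are hypothetical.

```r
# Hypothetical data frame mixing a character column with numeric ones
students <- data.frame(
    Name = c("Ali", "Beatriz", "Charles"),
    Math = c(54, 72, 68),
    English = c(44, 74, 82)
)

# Logical vector marking which columns are numeric
numeric_cols <- sapply(students, is.numeric)

# Keep only the numeric columns, then average them
numeric_means <- colMeans(students[numeric_cols], na.rm = TRUE)
print(numeric_means)
```

Here the Name column is skipped automatically, so the call does not fail on character data.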

Summary

This article has shown how to use colMeans() in R. Make sure the input columns are numeric, or else an error will occur. If the input contains NA values, set the na.rm parameter to TRUE. Thanks for reading.
