# How To Use Average Function In R On Different Data Sources

Read on if you are looking for the average function in R. It bears a more mathematical name, mean(), and can be used on various data structures.

## Average Function In R

The syntax of mean() is simple. All you need to provide is the object x containing the values whose mean you want to calculate. mean() is a generic function, so how it computes and returns the arithmetic mean depends on the class of that object.

The most basic example is when you have two numeric values and want to get their average. This is how it can be done in R:

mean(c(4, 7))

[1] 5.5

In this case, the mean() function accepts the vector created by c() and produces the average of all numeric values stored in that vector. You can add as many elements as you want to this calculation through the c() function:

mean(c(4, 7, 8, 14, 23, 13, 4, 18, 3))

[1] 10.44444
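To make the arithmetic explicit, here is a minimal sketch that reproduces the result above by hand. The variable names x and manual_mean are only illustrative and are not part of the original example.

```r
# Hypothetical variable name, used only for this illustration
x <- c(4, 7, 8, 14, 23, 13, 4, 18, 3)

# The arithmetic mean is the sum of the values divided by their count
manual_mean <- sum(x) / length(x)

manual_mean
mean(x)

# Compare with a tolerance rather than ==, since both are floating-point results
all.equal(manual_mean, mean(x))
```

Both expressions describe the same calculation; mean() simply saves you from writing it out.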

By default, mean() will return NA if one of the values you provide is NA:

mean(c(15, 4, NA, 20, NA))

[1] NA

If you still want to get the average of all other values, you can use the na.rm argument. It is FALSE by default, but when you set it to TRUE, the mean() function will strip all NA values before making the computation.

So, this is how you can get the average value of the above vector minus the NA values:

mean(c(15, 4, NA, 20, NA), na.rm = TRUE)

[1] 13
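If you are curious what na.rm = TRUE amounts to, the sketch below filters the missing values by hand with is.na() and compares the two results. The names scores and observed are assumptions made up for this illustration.

```r
# Hypothetical variable name, used only for this illustration
scores <- c(15, 4, NA, 20, NA)

# is.na() flags missing entries; !is.na() keeps only the observed ones
observed <- scores[!is.na(scores)]

mean(observed)               # mean of 15, 4 and 20
mean(scores, na.rm = TRUE)   # mean() drops the NA values itself

# How many values were left out of the calculation
sum(is.na(scores))
```

Checking the number of dropped values is a useful habit, because an average based on only a few remaining observations may not be representative.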

## Average Of Lists

Suppose we have two R sequences: 0:10 and 11:20. If you combine them with the c() function and pass the result to mean(), you will get the average of all elements in the two sequences.

mean(c(0:10, 11:20))

[1] 10

This command is equivalent to using mean() on the sequence 0:20:

mean(c(0:20))

[1] 10

However, this isn’t the case when you put those sequences in a list:

mean(list(c(0:10), c(11:20)))

[1] NA
Warning message:
In mean.default(list(c(0:10), c(11:20))) :
argument is not numeric or logical: returning NA

R doesn’t allow mean() to take a list as its argument. If you attempt to do so, it will print a warning message and return NA.
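The wording of the warning hints at what mean() accepts. As a rough sketch (the name values_list is hypothetical), you can check the types yourself:

```r
# Hypothetical variable name, used only for this illustration
values_list <- list(c(0:10), c(11:20))

is.numeric(values_list)        # FALSE: the list itself is not numeric
is.numeric(values_list[[1]])   # TRUE: each element inside it is

# Logical vectors are accepted; TRUE counts as 1 and FALSE as 0
mean(c(TRUE, FALSE, TRUE, TRUE))

# suppressWarnings() hides the warning, but the result is still NA
suppressWarnings(mean(values_list))
```

The list is only a container; the numbers live in its elements, which is why the elements have to be reached first.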

There are some workarounds, depending on what values you want to get. If you want to get the average of each sequence, you can use the sapply() function. What it does is apply the function you provide to every element in a list.

l <- list(c(0:10), c(11:20))
sapply(l, mean)

[1] 5.0 15.5

In the above example, sapply() doesn’t take all elements in l at once. Instead, it runs through this list and invokes the mean() function on each element separately. As a result, it returns a numeric vector containing one average per list element instead of a single numeric value.
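To see what kind of object comes back, the sketch below compares sapply() with its relatives lapply() and vapply(). The list named_l and its element names first and second are assumptions added for readability.

```r
# Hypothetical list with element names, used only for this illustration
named_l <- list(first = 0:10, second = 11:20)

per_element <- sapply(named_l, mean)   # simplified to a named numeric vector
as_list     <- lapply(named_l, mean)   # always a list, one entry per element

per_element
is.numeric(per_element)
is.list(as_list)

# vapply() is a stricter variant: it fails unless every result is one numeric value
vapply(named_l, mean, numeric(1))
```

When the list has names, sapply() carries them over to the result, which makes the output easier to read.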

On the other hand, if you want to determine the average of all values stored across the list, use the unlist() function. It will flatten the list l and give you a vector containing all of its elements:

mean(unlist(l))

[1] 10
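Note that averaging the per-element averages is not the same as averaging the flattened values when the elements differ in length. A minimal sketch, using the illustrative names mean_of_means and pooled_mean:

```r
l <- list(c(0:10), c(11:20))

# Averages 5.0 and 15.5, giving both elements equal weight
mean_of_means <- mean(sapply(l, mean))

# Averages all 21 values, so the longer element carries more weight
pooled_mean <- mean(unlist(l))

lengths(l)       # 11 and 10 values
mean_of_means    # 10.25
pooled_mean      # 10
```

Which of the two you want depends on whether each list element or each individual value should count equally.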

## Average Of Data Frames

In the same manner, you can’t apply mean() directly to a data frame either. It will give you the NA value alongside a warning message:

df <- head(mtcars)
mean(df)

[1] NA
Warning message:
In mean.default(df) : argument is not numeric or logical: returning NA

However, you can call it on a single column to get its average:

mean(df$hp)

[1] 117.1667

To do this for multiple columns in a data frame, use the mapply() function, which calls mean() on each selected column and returns one average per column:

mapply(mean, df[, 2:6])
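Base R offers other ways to get the same per-column averages. The sketch below is one possible comparison, not the only approach; the names cols, by_mapply, by_sapply, by_colMeans, and numeric_cols are made up for this illustration.

```r
df <- head(mtcars)
cols <- df[, 2:6]   # columns cyl, disp, hp, drat, wt

by_mapply   <- mapply(mean, cols)   # the approach used in this article
by_sapply   <- sapply(cols, mean)   # same idea, with sapply()
by_colMeans <- colMeans(cols)       # built-in helper for column means

# All three return a named numeric vector with the same values
all.equal(by_mapply, by_sapply)
all.equal(by_sapply, by_colMeans)

# When a data frame mixes types, keep only the numeric columns first
numeric_cols <- df[sapply(df, is.numeric)]
colMeans(numeric_cols)
```

colMeans() is usually the shortest option when every selected column is numeric, while sapply() and mapply() also work with functions other than mean().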

## Summary

The average function in R is mean(), which can help you calculate the average of a set of values. By combining it with other functions, you can also call it on other data structures like lists and data frames.
