What Is The aov() Function In R And How To Use It?

In this article, we will look at the aov() function in R: what it is and how to use it to fit an analysis of variance model. Let’s go into detail now.

What is the aov() function in R?

The aov() function fits an analysis of variance model by calling lm() for each stratum.

The following is the basic syntax of the aov() function:

aov(formula, data)

Parameters:

  • formula: a formula specifying the model.
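  • data: a data frame containing the variables named in the formula. The default is NULL, in which case the variables are looked up in the usual way, starting from the environment of the formula.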
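To make the role of the data argument concrete, here is a minimal sketch. The vectors yield and group are hypothetical toy values invented for this illustration; they are not part of the article's data set.

```r
# Hypothetical toy data (an assumption for illustration only)
yield <- c(4.1, 3.9, 4.4, 5.2, 5.0, 5.5)
group <- gl(2, 3, labels = c("a", "b"))

# Variables supplied through a data frame
fit_df <- aov(yield ~ group, data = data.frame(yield, group))

# data left at its default (NULL): yield and group are found in the workspace
fit_env <- aov(yield ~ group)

# Both calls should fit the same model
all.equal(coef(fit_df), coef(fit_env))
```

Passing a data frame is usually the safer habit, because the model then does not depend on whatever objects happen to exist in the workspace.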

How to use the aov() function in R?

We will use the aov() function to fit an analysis of variance model, then use the summary() function to view the results as an analysis of variance table.

We use data from an orange tree growth experiment that records each tree’s age (in days) and the corresponding stem diameter.

Fit the analysis of variance model

The aov() function takes a ‘formula’ as its first argument, which is a symbolic description of the model. For example, y ~ x is a formula that says y depends on x. The second argument is ‘data’, the data frame in which x and y are found.
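A formula is an ordinary R object, so it can be built and inspected before any model is fitted. The short sketch below uses the same variable names as the example in this article; the object name f is just an illustrative choice.

```r
# Build the formula on its own; no data is needed at this point
f <- diameter ~ age + tree

class(f)                        # "formula"
all.vars(f)                     # "diameter" "age" "tree"
attr(terms(f), "term.labels")   # "age" "tree"
```

The left-hand side (diameter) is the response, and each term on the right-hand side (age, tree) gets its own row in the analysis of variance table.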

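We fit a model with two terms, the numeric covariate age and the factor tree, to see whether stem diameter differs between two orange trees once age is accounted for. (Testing whether the growth rates themselves differ would require an age:tree interaction term, which this model does not include.)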

Example:

# Create the 'orange' data frame
orange <- data.frame(
    tree = gl(2, 7),
    age = c(120, 480, 660, 1000, 1230, 1370, 1580, 120, 480, 660, 1000, 1230, 1370, 1580),
    diameter = c(30, 58, 87, 115, 120, 142, 145, 33, 54, 89, 120, 124, 147, 151)
)

print(orange)

# Fit the analysis of variance model
aov(diameter ~ age + tree, data = orange)

Output:

   tree  age diameter
1     1  120       30
2     1  480       58
3     1  660       87
4     1 1000      115
5     1 1230      120
6     1 1370      142
7     1 1580      145
8     2  120       33
9     2  480       54
10    2  660       89
11    2 1000      120
12    2 1230      124
13    2 1370      147
14    2 1580      151
Call:
   aov(formula = diameter ~ age + tree, data = orange)

Terms:
                      age      tree Residuals
Sum of Squares  22932.406    31.500   779.023
Deg. of Freedom         1         1        11

Residual standard error: 8.415476
Estimated effects may be unbalanced
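Because aov() works through lm(), the fitted object can also be passed to the usual linear model helpers. The following sketch rebuilds the same data frame and shows a few of them; the variable name model is an illustrative choice.

```r
# Same data as in the example above
orange <- data.frame(
    tree = gl(2, 7),
    age = rep(c(120, 480, 660, 1000, 1230, 1370, 1580), times = 2),
    diameter = c(30, 58, 87, 115, 120, 142, 145, 33, 54, 89, 120, 124, 147, 151)
)

model <- aov(diameter ~ age + tree, data = orange)

class(model)         # "aov" "lm": an aov object is also an lm object
coef(model)          # intercept, slope for age, and offset for tree 2
df.residual(model)   # 11, matching the Residuals column above
```

The tree2 coefficient is the estimated difference in diameter between the second and the first tree at the same age.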

View a summary of the model

Next, we use the summary() function to display the analysis of variance table for the model: the degrees of freedom, sum of squares, mean square, F value, and p-value for each term.

Example:

# Create the 'orange' data frame
orange <- data.frame(
    tree = gl(2, 7),
    age = c(120, 480, 660, 1000, 1230, 1370, 1580, 120, 480, 660, 1000, 1230, 1370, 1580),
    diameter = c(30, 58, 87, 115, 120, 142, 145, 33, 54, 89, 120, 124, 147, 151)
)

# Fit the analysis of variance model (the outer parentheses also print it)
(orangeAovModel <- aov(diameter ~ age + tree, data = orange))

# See a summary of the model
summary(orangeAovModel)

Output:

Call:
   aov(formula = diameter ~ age + tree, data = orange)

Terms:
                      age      tree Residuals
Sum of Squares  22932.406    31.500   779.023
Deg. of Freedom         1         1        11

Residual standard error: 8.415476
Estimated effects may be unbalanced
           Df Sum Sq Mean Sq F value   Pr(>F)    
age          1  22932   22932 323.811 1.65e-09 ***
tree         1     31      31   0.445    0.519    
Residuals   11    779      71                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
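In the table above, age has a very small p-value, while tree has a p-value of about 0.52, so these data give no evidence that the two trees differ in diameter at a given age. If the numbers are needed in code rather than on screen, one possible approach is to pull the table out of the summary object, as sketched below. The names model and tab are illustrative choices.

```r
# Same data and model as in the example above
orange <- data.frame(
    tree = gl(2, 7),
    age = rep(c(120, 480, 660, 1000, 1230, 1370, 1580), times = 2),
    diameter = c(30, 58, 87, 115, 120, 142, 145, 33, 54, 89, 120, 124, 147, 151)
)
model <- aov(diameter ~ age + tree, data = orange)

# For a model with a single stratum, the summary is a list with one table
tab <- summary(model)[[1]]

# Strip any padding from the row names so they can be used for indexing
rownames(tab) <- trimws(rownames(tab))

tab["age", "Pr(>F)"]    # p-value for age
tab["tree", "Pr(>F)"]   # p-value for tree
tab["tree", "F value"]  # F statistic for tree
```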

Summary

This article introduced the aov() function in R and showed how to use it to fit an analysis of variance model and view the results with summary(). We hope the information in this article has been helpful to you. Thank you for reading.
