# What Is The summary() Function In R?

In this article, we will learn about the summary() function in R and how to use it with different types of input data. Let’s go into detail now.

## What is the summary() function in R?

The summary() function is a generic function that produces a summary of an object. For data such as vectors and data frames it reports descriptive statistics, and for fitted model objects it reports the results of the model fit.

Syntax:

summary(data)

Parameter:

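• data: the object to summarize, for example a vector, a data frame, or a fitted linear regression model. R's own documentation names this argument object.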

The return value of the summary() function in R will depend on the data type being processed.
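The dependence on the input type comes from summary() being a generic function: R picks a method based on the class of the object it receives. The following is a minimal sketch of that behavior, using a few small made-up inputs whose names are chosen only for illustration:

```r
# Small illustrative inputs (not taken from the examples in this article)
numbers <- c(2, 4, 6, 8)
flags   <- c(TRUE, FALSE, TRUE)
words   <- c("a", "b", "c")

# Numeric input: minimum, quartiles, mean, and maximum
numSummary <- summary(numbers)
print(numSummary)

# Logical input: counts of FALSE and TRUE values
print(summary(flags))

# Character input: only length, class, and mode
print(summary(words))

# The result carries its own class, which controls how it is printed
print(class(numSummary))
```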

## How to use this function in R?

We will give four specific examples of how to use the summary() function in R.

### The input data is a vector

First, we create a numeric vector named ‘fibo’ containing the first ten elements of the Fibonacci sequence:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34

Then we use the summary() function to get the summary of the ‘fibo’ vector.

Code:

# Create a vector containing the first ten elements of the Fibonacci sequence
fibo <- c(0, 1, 1, 2, 3, 5, 8, 13, 21, 34)
print(fibo)

# Get the summary of the 'fibo' vector
summaryFibo <- summary(fibo)
print(summaryFibo)

Output:

 [1]  0  1  1  2  3  5  8 13 21 34
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00    1.25    4.00    8.80   11.75   34.00

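For a numeric vector, the summary() function returns six statistics:

| Statistic | Meaning |
| --- | --- |
| Min. | The minimum value |
| 1st Qu. | The first quartile (25th percentile) |
| Median | The median value |
| Mean | The mean value |
| 3rd Qu. | The third quartile (75th percentile) |
| Max. | The maximum value |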
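The six values can also be read individually from the returned object by their labels. The sketch below assumes the same Fibonacci vector as above; the names fiboMedian, fiboMean, and fiboQuartiles are introduced here only for illustration. It also shows that the quartiles agree with quantile() under its default settings, and that missing values are reported in an extra entry.

```r
fibo <- c(0, 1, 1, 2, 3, 5, 8, 13, 21, 34)
summaryFibo <- summary(fibo)

# Pull out single statistics by their labels
fiboMedian <- summaryFibo[["Median"]]
fiboMean   <- summaryFibo[["Mean"]]
print(fiboMedian)
print(fiboMean)

# The quartiles match quantile() with its default settings
fiboQuartiles <- quantile(fibo, probs = c(0.25, 0.75))
print(fiboQuartiles)

# Missing values are counted in an extra "NA's" entry
summaryWithNA <- summary(c(fibo, NA))
print(summaryWithNA)
```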

### The input data is a data frame

We have a data frame named ‘score’ that contains the names and test scores of some students in our class.

We will use the summary() function to get a summary of all the columns belonging to ‘score’.

Example:

# Create the 'score' data frame
score <- data.frame(
  Name = c("Alice", "Layla", "Parker", "Loren", "Granger"),
  Math = c(45, 78, 30, 69, 64),
  English = c(58, 55, 62, 87, 54),
  Physics = c(88, 43, 63, 51, 28)
)

# Get the summary of the 'score' data frame
summaryScore <- summary(score)
print(summaryScore)

Output:

     Name                Math         English        Physics
 Length:5           Min.   :30.0   Min.   :54.0   Min.   :28.0
 Class :character   1st Qu.:45.0   1st Qu.:55.0   1st Qu.:43.0
 Mode  :character   Median :64.0   Median :58.0   Median :51.0
                    Mean   :57.2   Mean   :63.2   Mean   :54.6
                    3rd Qu.:69.0   3rd Qu.:62.0   3rd Qu.:63.0
                    Max.   :78.0   Max.   :87.0   Max.   :88.0
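Name is a character column, so summary() reports only its length, class, and mode, while the numeric columns get the six statistics. As a hedged illustration of how the column type changes the result, the sketch below adds a hypothetical Group column (not part of the original data) stored as a factor; summary() then counts the rows in each level.

```r
# 'Group' is a made-up column added only for this illustration
scoreWithGroup <- data.frame(
  Name  = c("Alice", "Layla", "Parker", "Loren", "Granger"),
  Math  = c(45, 78, 30, 69, 64),
  Group = factor(c("A", "B", "A", "B", "A"))
)

# Factor columns are summarized as counts per level
print(summary(scoreWithGroup))

# The same counts for the factor on its own
groupCounts <- summary(scoreWithGroup$Group)
print(groupCounts)
```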

### The input data is some columns of the data frame

With the above data frame, assuming you only want to get the summary of the Math, English, and Physics columns, you can do the following:

# Create the 'score' data frame
score <- data.frame(
  Name = c("Alice", "Layla", "Parker", "Loren", "Granger"),
  Math = c(45, 78, 30, 69, 64),
  English = c(58, 55, 62, 87, 54),
  Physics = c(88, 43, 63, 51, 28)
)

# Get the summary of some columns on the 'score' data frame
summaryScore <- summary(score[c('Math', 'English', 'Physics')])
print(summaryScore)

Output:

      Math         English        Physics
Min.   :30.0   Min.   :54.0   Min.   :28.0
1st Qu.:45.0   1st Qu.:55.0   1st Qu.:43.0
Median :64.0   Median :58.0   Median :51.0
Mean   :57.2   Mean   :63.2   Mean   :54.6
3rd Qu.:69.0   3rd Qu.:62.0   3rd Qu.:63.0
Max.   :78.0   Max.   :87.0   Max.   :88.0
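When the data frame has many columns, naming each one by hand becomes tedious. One possible alternative, sketched here under the assumption that you want every numeric column, is to select columns with is.numeric(). The names isNumericCol and statsMatrix are illustrative only.

```r
score <- data.frame(
  Name = c("Alice", "Layla", "Parker", "Loren", "Granger"),
  Math = c(45, 78, 30, 69, 64),
  English = c(58, 55, 62, 87, 54),
  Physics = c(88, 43, 63, 51, 28)
)

# Find the numeric columns instead of naming them by hand
isNumericCol <- sapply(score, is.numeric)
print(summary(score[isNumericCol]))

# sapply() collects the six statistics into a numeric matrix,
# which is easier to index than the printed table
statsMatrix <- sapply(score[isNumericCol], summary)
print(statsMatrix["Mean", "Math"])
```

The matrix form is convenient when the statistics feed into further calculations, because the table returned by summary() on a data frame stores formatted text rather than numbers.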

### The input data is a linear regression model

Example:

# Create a data frame
dataframe <- data.frame(x = c(1, 2, 3), y = c(9, 8, 7))

# Fit a linear model
result <- lm(y ~ x, dataframe)

# Summary of the model's performance and coefficients
summaryResult <- summary(result)
print(summaryResult)

Output:

Call:
lm(formula = y ~ x, data = dataframe)

Residuals:
         1          2          3
 2.719e-16 -5.439e-16  2.719e-16

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)
(Intercept)  1.000e+01  1.018e-15  9.828e+15   <2e-16 ***
x           -1.000e+00  4.710e-16 -2.123e+15    3e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.661e-16 on 1 degrees of freedom
Multiple R-squared:      1,	Adjusted R-squared:      1
F-statistic: 4.507e+30 on 1 and 1 DF,  p-value: 2.999e-16

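In this example, we fit a linear regression on the data frame using the lm() function and pass the fitted model to summary().

The summary() function shows detailed information on the model’s performance and coefficients: the residuals, the coefficient estimates with their standard errors, t values, and p-values, the residual standard error, R-squared, and the F-statistic. Because y is exactly 10 - x in this data, the fit is perfect, so the residuals are only floating-point rounding noise and the standard errors and p-values shown here are not meaningful.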
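The printed report is convenient to read, but the individual numbers can also be extracted from the summary object. The sketch below is one way to do this; it assumes the built-in cars data set (which ships with R and is not part of this article's examples) so that the fit is not perfect, and the variable names are illustrative.

```r
# 'cars' is a data set that ships with R; it is used here only for illustration
fit <- lm(dist ~ speed, data = cars)
fitSummary <- summary(fit)

# Coefficient table as a numeric matrix
coefTable <- coef(fitSummary)
print(coefTable)

# Individual values can be read by row and column name
slopePValue <- coefTable["speed", "Pr(>|t|)"]
print(slopePValue)

# Goodness-of-fit measures are stored as list elements
print(fitSummary$r.squared)
print(fitSummary$adj.r.squared)
```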

## Summary

We have learned about the summary() function in R and how to use it with different input data types. The return value of this function in R will depend on the data type being processed. We hope the information in this article will be helpful to you. Thank you for reading.
