# Scree Plot In R: How To Plot A Scree Plot

Today, we will discuss the scree plot in the R programming language. In this article, we will use the ggplot2 package in R to plot the scree plot.

## What is the scree plot in R?

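A scree plot shows the proportion of variance explained by each principal component (PC), ordered from the first component to the last, and is commonly used to decide how many components to keep. We will apply PCA to the USArrests dataset and plot the scree plot. First, we will draw it as a line plot, with the variance explained by each PC shown as a point and the points connected by a line. Then we will draw it as a bar plot, with one bar per principal component showing the variance it explains.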

## How to create a scree plot in R

Here, we will load the USArrests dataset, which ships with base R, so no extra package is needed for this step. See the code example below.

# Load data
data("USArrests")
dat <- USArrests

# View the first rows of the data
head(dat)

### Compute Principal Component Analysis using prcomp() function

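Principal Component Analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of possibly correlated variables into a set of linearly uncorrelated variables called principal components. Setting scale = TRUE (which R matches to the scale. argument of prcomp()) standardizes each variable to unit variance before the analysis; prcomp() also centers the variables by default.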

# Load data
data("USArrests")
dat <- USArrests

# Compute PCA
PCA <- prcomp(dat, scale = TRUE)

# View the PCA result
PCA
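
As a rough check of the description above, the sketch below compares the raw variances of the columns (which is why scaling matters here) and confirms that the component scores stored in PCA$x are uncorrelated. The names raw_var, score_cor and PCA_manual are illustrative and not part of the original example.

```r
data("USArrests")
dat <- USArrests

# Raw variances differ by orders of magnitude, so an unscaled PCA
# would be dominated by the column with the largest variance
raw_var <- sapply(dat, var)
print(round(raw_var, 1))

PCA <- prcomp(dat, scale = TRUE)

# Correlation matrix of the component scores:
# the off-diagonal entries should be numerically zero
score_cor <- cor(PCA$x)
print(round(score_cor, 3))

# scale = TRUE should give the same result as running PCA
# on columns standardized beforehand with scale()
PCA_manual <- prcomp(scale(dat))
print(all.equal(PCA$sdev, PCA_manual$sdev))
```

If the variables were already measured on comparable scales, leaving scaling off could be a reasonable choice; for USArrests the columns use different units, so standardizing is the safer default.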

### Compute variance explained by each Principal Component

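Now, we calculate the proportion of the total variance explained by each PC: the variance of each component (its standard deviation squared) divided by the sum of the variances of all components.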

# Load data
data("USArrests")
dat <- USArrests

# Compute PCA
PCA <- prcomp(dat, scale = TRUE)

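# Compute the proportion of variance explained by each PC
# (PCA$sdev is used directly instead of attach(), and the result is not
# named var so that it does not shadow the base function var())
var_explained <- PCA$sdev^2 / sum(PCA$sdev^2)
var_explained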
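
A small extension that may help when reading the scree plot: the sketch below adds the cumulative proportion of variance and cross-checks the hand-computed values against the table returned by summary(). The 90% threshold and the names cum_var, imp and n_keep are assumptions chosen for illustration only.

```r
data("USArrests")
PCA <- prcomp(USArrests, scale = TRUE)

# Proportion of variance explained by each PC
var_explained <- PCA$sdev^2 / sum(PCA$sdev^2)

# Running total: how much variance the first k components capture together
cum_var <- cumsum(var_explained)
print(round(cum_var, 3))

# summary() reports the same proportions (rounded) in its importance table
imp <- summary(PCA)$importance
print(imp)

# Smallest number of components reaching an arbitrary 90% threshold
n_keep <- which(cum_var >= 0.90)[1]
print(n_keep)
```

The threshold is a judgment call; a scree plot is usually read by looking for the point where the curve flattens rather than by a fixed cutoff.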

### Plot Scree plot with Line plot

Here, we will plot the scree plot in R using the ggplot2 package, so we first need to install and load it. If ggplot2 is not installed yet, run install.packages("ggplot2") once before loading it.

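# Load package
library(ggplot2)

# Load data
data("USArrests")
dat <- USArrests

# Compute PCA
PCA <- prcomp(dat, scale = TRUE)

# Compute the proportion of variance explained by each PC
var_explained <- PCA$sdev^2 / sum(PCA$sdev^2)

# Collect the values in a data frame for ggplot()
scree <- data.frame(
  PC = seq_along(var_explained),
  var_explained = var_explained
)

# Scree plot with line plot
# (ggplot() is used because qplot() is deprecated in recent ggplot2 versions)
ggplot(scree, aes(x = PC, y = var_explained)) +
  geom_point(size = 4) +
  geom_line() +
  ylim(0, 1) +
  ggtitle("Scree Plot") +
  xlab("Principal Component") +
  ylab("Variance Explained")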

Output: a line plot of the variance explained by each principal component.

### Plot Scree plot with barplot

We can plot a scree plot with a barplot as follows:

# Load package
library(ggplot2)

# Load data
data("USArrests")
dat <- USArrests

# Compute PCA
PCA <- prcomp(dat, scale = TRUE)

# Compute the proportion of variance explained by each PC
var_explained <- PCA$sdev^2 / sum(PCA$sdev^2)

# Collect the values in a data frame for ggplot()
scree <- data.frame(
  PC = seq_along(var_explained),
  var_explained = var_explained
)

# Scree plot with barplot
ggplot(scree, aes(x = PC, y = var_explained)) +
  geom_col() +
  ylim(0, 0.8) +
  ggtitle("Scree Plot") +
  xlab("Principal Component") +
  ylab("Variance Explained")

Output: a bar plot of the variance explained by each principal component.
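
For comparison, base R also offers screeplot() in the stats package, which draws either style without ggplot2. The sketch below is one possible way to use it, not part of the original walkthrough; note that screeplot() shows the component variances (sdev^2) on the y-axis rather than the proportions used in the ggplot2 examples.

```r
data("USArrests")
PCA <- prcomp(USArrests, scale = TRUE)

# Draw both styles side by side and restore the plotting layout afterwards
op <- par(mfrow = c(1, 2))
screeplot(PCA, type = "lines", main = "Scree Plot (lines)")
screeplot(PCA, type = "barplot", main = "Scree Plot (bars)")
par(op)

# With scaled data, the component variances sum to the number of variables,
# so dividing by that sum gives the proportions plotted earlier
eigenvalues <- PCA$sdev^2
print(sum(eigenvalues))
```

The ggplot2 versions are easier to customize, while screeplot() is convenient for a quick look at a fitted prcomp object.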

## Summary

This article demonstrated how to plot a scree plot in R, first as a line plot and then as a bar plot. Please comment if you have any questions.

Have a great day!
