In this article, we discuss the scree plot in the R programming language and draw it with the ggplot2 package.

**What is the scree plot in R?**

A scree plot shows the share of variance explained by each principal component (PC), in order, and helps you decide how many components to keep. We will apply PCA to the USArrests dataset and draw its scree plot in two ways. First, we will draw it as a line plot, where each principal component's explained variance is a point and the points are connected by a line. Then we will draw the same information as a bar plot, with one bar per principal component.

**How to create a scree plot in R**

**Load dataset**

```r
# Load data
data("USArrests")
dat <- USArrests

# View the first rows
head(dat)
```

**Compute Principal Component Analysis using prcomp() function**

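Principal Component Analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of correlated variables into a set of uncorrelated variables, the principal components. Setting scale = TRUE in prcomp() standardizes each variable to unit variance before the analysis, so that variables measured on larger scales do not dominate the result.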

```r
# Load data
data("USArrests")
dat <- USArrests

# Compute PCA
PCA <- prcomp(dat, scale = TRUE)

# View the PCA result
PCA
```
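To make the two claims above concrete, the following minimal sketch checks them on the same data. The names pca and eig are illustrative and not part of the original example. It checks that the component scores are uncorrelated and that, with scaling, the component variances match the eigenvalues of the correlation matrix.

```r
# Sketch: check two properties of the scaled PCA (base R only)
data("USArrests")
pca <- prcomp(USArrests, scale = TRUE)

# 1. The principal component scores are mutually uncorrelated:
#    off-diagonal entries are zero up to rounding error
round(cor(pca$x), 10)

# 2. With scaling, the PC variances equal the eigenvalues of the
#    correlation matrix of the original variables
eig <- eigen(cor(USArrests))$values
all.equal(pca$sdev^2, eig)
```

The second check is why scaling matters: without it, PCA works on the covariance matrix, and a variable with large raw variance can dominate the first component.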

**Compute variance explained by each Principal Component**

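Now we calculate the proportion of the total variance explained by each PC. The variance of a component is its standard deviation squared, so we divide each squared standard deviation by the sum of all of them.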

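```r
# Load data
data("USArrests")
dat <- USArrests

# Compute PCA
PCA <- prcomp(dat, scale = TRUE)

# Proportion of variance explained by each PC
# (PCA$sdev is used directly instead of attach(), which can mask other objects)
var_explained <- PCA$sdev^2 / sum(PCA$sdev^2)
var_explained
```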
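As a cross-check, the same proportions can be read from summary(), which also reports the cumulative proportion. This is a small sketch; the names pca, prop_var, cum_var and imp are illustrative.

```r
# Sketch: compare the manual calculation with summary(), base R only
data("USArrests")
pca <- prcomp(USArrests, scale = TRUE)

# Manual calculation: proportion and cumulative proportion of variance
prop_var <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(prop_var)

# summary() stores the same quantities (rounded) in its importance matrix
imp <- summary(pca)$importance
imp["Proportion of Variance", ]
imp["Cumulative Proportion", ]
```

The cumulative proportion is often read alongside the scree plot when deciding how many components to keep.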

**Plot Scree plot with Line plot**

Next, we draw the scree plot with the ggplot2 package. If the package is not installed yet, install it with install.packages("ggplot2"), then load it with library(ggplot2).

```r
# Load package
library(ggplot2)

# Load data (USArrests, as in the previous steps)
data("USArrests")
dat <- USArrests

# Compute PCA
PCA <- prcomp(dat, scale = TRUE)

# Proportion of variance explained by each PC
var_explained <- PCA$sdev^2 / sum(PCA$sdev^2)

# Scree plot with line plot
# (qplot() is deprecated since ggplot2 3.4.0 but still works)
qplot(c(1:4), var_explained) +
  geom_point(size = 4) +
  geom_line() +
  ylim(0, 1) +
  ggtitle("Scree Plot") +
  xlab("Principal Component") +
  ylab("Variance Explained")
```

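Because qplot() is deprecated in recent ggplot2 releases, the same line plot can also be built with ggplot() and a small data frame. This is a sketch of one possible equivalent on USArrests, not the article's original code. The names scree_df, pc, prop_var and p_line are illustrative.

```r
# Sketch: line scree plot built with ggplot() instead of qplot()
library(ggplot2)

data("USArrests")
pca <- prcomp(USArrests, scale = TRUE)

# One row per principal component
scree_df <- data.frame(
  pc = seq_along(pca$sdev),
  prop_var = pca$sdev^2 / sum(pca$sdev^2)
)

p_line <- ggplot(scree_df, aes(x = pc, y = prop_var)) +
  geom_line() +
  geom_point(size = 4) +
  ylim(0, 1) +
  labs(
    title = "Scree Plot",
    x = "Principal Component",
    y = "Variance Explained"
  )
p_line
```

Using seq_along(pca$sdev) rather than a hard-coded c(1:4) keeps the plot correct if the number of variables changes.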

**Plot Scree plot with barplot**

We can plot a scree plot with a barplot as follows:

```r
# Load package
library(ggplot2)

# Load data (USArrests, as in the previous steps)
data("USArrests")
dat <- USArrests

# Compute PCA
PCA <- prcomp(dat, scale = TRUE)

# Proportion of variance explained by each PC
var_explained <- PCA$sdev^2 / sum(PCA$sdev^2)

# Scree plot with barplot
qplot(c(1:4), var_explained) +
  ylim(0, 0.8) +
  geom_col() +
  ggtitle("Scree Plot") +
  xlab("Principal Component") +
  ylab("Variance Explained")
```

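A bar scree plot is often easier to read with the percentage printed above each bar. The sketch below is one way to do that with ggplot() on USArrests. The names scree_df, label and p_bar are illustrative, and the labeling is an addition, not part of the original example.

```r
# Sketch: bar scree plot with percentage labels, built with ggplot()
library(ggplot2)

data("USArrests")
pca <- prcomp(USArrests, scale = TRUE)

pc_names <- paste0("PC", seq_along(pca$sdev))
scree_df <- data.frame(
  # Explicit levels keep the bars in component order
  pc = factor(pc_names, levels = pc_names),
  prop_var = pca$sdev^2 / sum(pca$sdev^2)
)

# Text label such as "12.3%" for each bar
scree_df$label <- sprintf("%.1f%%", 100 * scree_df$prop_var)

p_bar <- ggplot(scree_df, aes(x = pc, y = prop_var)) +
  geom_col() +
  geom_text(aes(label = label), vjust = -0.5) +
  ylim(0, 1) +
  labs(
    title = "Scree Plot",
    x = "Principal Component",
    y = "Variance Explained"
  )
p_bar
```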

**Summary**

This article demonstrated how to draw a scree plot in R as a line plot and as a bar plot. Please leave a comment if you have any questions.

