Scree Plot In R: How To Plot A Scree Plot

Today, we will discuss the scree plot in the R programming language and draw one with the ggplot2 package.

What is the scree plot in R?

A scree plot shows the proportion of variance explained by each principal component (PC), ordered from the first component to the last, and is commonly used to decide how many components to keep. We will apply PCA to the USArrests dataset and plot the scree plot. First, we will draw it as a line plot, with the variance explained by each PC shown as a point and the points connected by a line. Then we will draw the same information as a bar plot.

How to create a scree plot in R

Load dataset

Here, we will load the USArrests dataset, which ships with R in the datasets package, to plot a scree plot. See the code example below.

# Load data
data("USArrests")
dat <- USArrests

# View the first rows of the data
head(dat)
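Before running PCA, it can help to check how differently the columns are spread. The sketch below is not part of the original walkthrough; it uses only base R functions, and the name col_var is chosen here for illustration.

```r
# Load data
data("USArrests")
dat <- USArrests

# Variance of each column (col_var is an illustrative name)
col_var <- sapply(dat, var)
col_var

# Name of the column with the largest variance
names(which.max(col_var))
```

If the variances differ by orders of magnitude, an unscaled PCA would likely be dominated by the column with the largest variance, which is the usual motivation for standardizing the variables first.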

Compute Principal Component Analysis using prcomp() function

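Principal Component Analysis (PCA) is a statistical procedure that converts a set of correlated variables into a set of uncorrelated variables, the principal components, using an orthogonal transformation. The scale. argument of prcomp() (which R also accepts abbreviated as scale) standardizes each variable to unit variance when set to TRUE; the variables are centered by default.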

# Load data
data("USArrests")
dat <- USArrests

# Compute PCA on the standardized variables
PCA <- prcomp(dat, scale. = TRUE)

# View the PCA result (standard deviations and loadings)
PCA
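The claim that the components are uncorrelated can be checked directly. This is a minimal sketch, assuming the standard fields of a prcomp object (x for the scores, rotation for the loadings); the names pca, score_cor and loading_cross are illustrative.

```r
pca <- prcomp(USArrests, scale. = TRUE)

# Correlations between the PC scores: off-diagonal entries should be ~0
score_cor <- cor(pca$x)
round(score_cor, 6)

# The loading matrix is orthogonal, so t(R) %*% R should be ~identity
loading_cross <- crossprod(pca$rotation)
round(loading_cross, 6)
```

Both matrices should print as the 4 x 4 identity up to rounding.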

Compute variance explained by each Principal Component

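Now, we use the code below to calculate the proportion of the total variance explained by each PC. The variance of a component is the square of its standard deviation (sdev), so dividing each squared value by their sum gives the proportion.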

# Load data
data("USArrests")
dat <- USArrests

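# Compute PCA
PCA <- prcomp(dat, scale. = TRUE)

# Proportion of variance explained by each PC
# (PCA$sdev is used directly instead of attach(), and the result is
# not named var so that stats::var() is not masked)
var_explained <- PCA$sdev^2 / sum(PCA$sdev^2)
var_explained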
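As a cross-check, the same proportions are reported by summary() on the prcomp object. This is a sketch under the assumption that the importance matrix has a row named "Proportion of Variance", as it does in current R versions, where the reported values are rounded to five decimals. The names pca, var_explained, cum_var and reported are illustrative.

```r
pca <- prcomp(USArrests, scale. = TRUE)

# Proportion of variance explained, computed by hand
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Cumulative proportion, useful when deciding how many PCs to keep
cum_var <- cumsum(var_explained)
cum_var

# Proportions as reported by summary()
reported <- summary(pca)$importance["Proportion of Variance", ]

# Largest difference between the two; should be at rounding level
max(abs(var_explained - reported))
```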

Plot Scree plot with Line plot

Here, we will plot a scree plot in R using the ggplot2 package. If the package is not installed yet, install it once with install.packages("ggplot2"), then load it with library(ggplot2).

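# Load package
library(ggplot2)

# Compute PCA on the USArrests data
PCA <- prcomp(USArrests, scale. = TRUE)

# Proportion of variance explained by each PC
var_explained <- PCA$sdev^2 / sum(PCA$sdev^2)

# Collect the values in a data frame for ggplot()
scree <- data.frame(
    PC = seq_along(var_explained),
    var_explained = var_explained
)

# Scree plot with line plot
# (ggplot() replaces qplot(), which is deprecated in recent ggplot2 versions)
ggplot(scree, aes(x = PC, y = var_explained)) +
    geom_point(size = 4) +
    geom_line() +
    ylim(0, 1) +
    ggtitle("Scree Plot") +
    xlab("Principal Component") +
    ylab("Variance Explained")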

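Output: a line plot with one point per principal component on the x-axis and the variance explained on the y-axis.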

Plot Scree plot with barplot

We can plot a scree plot with a barplot as follows:

# Load package
library(ggplot2)

# Compute PCA on the USArrests data
PCA <- prcomp(USArrests, scale. = TRUE)

# Proportion of variance explained by each PC
var_explained <- PCA$sdev^2 / sum(PCA$sdev^2)

# Collect the values in a data frame for ggplot()
scree <- data.frame(
    PC = seq_along(var_explained),
    var_explained = var_explained
)

# Scree plot with barplot
ggplot(scree, aes(x = PC, y = var_explained)) +
    geom_col() +
    ylim(0, 0.8) +
    ggtitle("Scree Plot") +
    xlab("Principal Component") +
    ylab("Variance Explained")

Output: a bar plot with one bar per principal component, where the bar height is the variance explained.
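For a quick look without ggplot2, base R's stats package also provides screeplot(). This is an alternative sketch, not the approach used in this article; note that screeplot() draws the component variances (sdev^2) rather than the proportions of variance. The names pca and variances are illustrative.

```r
pca <- prcomp(USArrests, scale. = TRUE)

# Variances of the components; these are what screeplot() draws
variances <- pca$sdev^2
variances

# type = "lines" gives a line plot, type = "barplot" gives bars
screeplot(pca, type = "lines", main = "Scree Plot")
screeplot(pca, type = "barplot", main = "Scree Plot")
```

Because the variables are standardized, the variances here should sum to the number of variables, so the shape of the plot matches the proportion-based plots above.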

Summary

This article demonstrated how to plot a scree plot in R as a line plot and as a bar plot. Please comment if you have any questions.

Have a great day!
