How To Use The Probability Density Function In R

R supports many probability distributions out of the box. This tutorial walks through a few of them and shows how to work with each one, including how to evaluate its probability density function.

Probability Density Function In R

For each probability distribution, R provides 4 associated functions:

  • the density function, whose name always starts with ‘d’ (such as dnorm)
  • the cumulative distribution function, whose name always starts with ‘p’ (such as pnorm)
  • the function that generates random values, whose name always starts with ‘r’ (such as rnorm)
  • the inverse cumulative distribution function, whose name always starts with ‘q’ (such as qnorm)
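
As a quick illustration of how the four prefixes fit together, the sketch below calls each one for the standard normal distribution. The variable names and the seed are chosen for this example only.

```r
# Density: height of the standard normal curve at x = 0
d0 <- dnorm(0)

# Cumulative probability: P(X <= 1.96)
p_196 <- pnorm(1.96)

# Quantile: the value x such that P(X <= x) = 0.975
q_975 <- qnorm(0.975)

# Random draws: five values from the standard normal distribution
set.seed(1)  # arbitrary seed, only to make the draws repeatable
draws <- rnorm(5)

# The q function undoes the p function: this returns 1.96 again
round_trip <- qnorm(pnorm(1.96))
```

The last line shows why the ‘q’ function is called the inverse of the ‘p’ function: one maps a point to a probability, the other maps that probability back to the point.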

Normal Distribution

In R, the dnorm, pnorm, rnorm, and qnorm functions represent the normal distribution. Their syntax:

dnorm(x, mean, sd, log)

rnorm(n, mean, sd)

pnorm(q, mean, sd, lower.tail, log.p)

qnorm(p, mean, sd, lower.tail, log.p)

In which:

  • x and q are the vectors of quantiles.
  • n is the number of observations to generate.
  • p is the vector of probabilities.
  • mean is the vector of means.
  • sd is the vector of standard deviations.
  • By default, lower.tail is TRUE, meaning probabilities are P[X ≤ x]. If you set it to FALSE, probabilities are P[X > x].
  • Set log or log.p to TRUE to work with log(p) instead of p: dnorm and pnorm then return logarithms, and qnorm expects its probabilities on the log scale.

If you omit mean or sd, R uses the defaults mean = 0 and sd = 1. For example, this is how you can look up P(X < 45) for a normal distribution with mean 30 and standard deviation 10:

pnorm(45, mean = 30, sd = 10)
[1] 0.9331928
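
The defaults and the lower.tail, log, and log.p arguments can be checked against the same example. This is a small sketch; the variable names are illustrative.

```r
# P(X <= 45) for a normal distribution with mean 30 and sd 10
p_lower <- pnorm(45, mean = 30, sd = 10)

# Upper tail: P(X > 45); the two tails sum to 1
p_upper <- pnorm(45, mean = 30, sd = 10, lower.tail = FALSE)

# With mean and sd omitted, R uses mean = 0 and sd = 1,
# so standardizing by hand gives the same probability
p_std <- pnorm((45 - 30) / 10)

# log.p = TRUE returns log(p) instead of p
log_p <- pnorm(45, mean = 30, sd = 10, log.p = TRUE)

# log = TRUE does the same for the density
log_d <- dnorm(45, mean = 30, sd = 10, log = TRUE)
```

Working on the log scale is mainly useful for very small probabilities or densities, where the plain value would lose precision.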

Let’s say we want to demonstrate the standard normal distribution between -4 and 4. First, we create two sequences: z holds points in this range, and q holds probabilities between 0.01 and 0.99. Then we use three data frames to hold the outputs of dnorm, pnorm, and qnorm.

z <- seq(-4, 4, 0.1)
q <- seq(0.01, 0.99, 0.01)
PDF <- data.frame(
  Z = z,
  Density = dnorm(z),
  Distribution = pnorm(z))
CDF <- data.frame(
  Z = z,
  Distribution = pnorm(z))
QF <- data.frame(
  Q = q,
  Quantile = qnorm(q))

Note: seq(from, to, by) returns evenly spaced values, so seq(-4, 4, 0.1) yields 81 points from -4 to 4.

You can have a look at their first values with the head() function:

head(PDF)

head(CDF)

head(QF)
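
To see how the density and distribution columns relate, one option is to integrate the density numerically and compare the result with pnorm(). The sketch below uses integrate() from base R; the cut-off point 1 and the tolerance are arbitrary choices for illustration.

```r
# Area under the standard normal density from -Inf up to 1
area <- integrate(dnorm, lower = -Inf, upper = 1)

# The same probability from the cumulative distribution function
cdf_value <- pnorm(1)

# Compare the two, allowing for the numerical error of integrate()
agree <- isTRUE(all.equal(area$value, cdf_value, tolerance = 1e-4))
```

In other words, each value in the Distribution column is the area under the density curve to the left of the matching Z.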

To plot these functions, you can use the ggplot() function from the ggplot2 package.

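library(ggplot2)
# Density curve of the standard normal distribution
ggplot(data.frame(z = c(-4, 4)), aes(x = z)) +
  stat_function(fun = dnorm)
# Cumulative distribution curve
ggplot(data.frame(z = c(-4, 4)), aes(x = z)) +
  stat_function(fun = pnorm)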
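
The QF data frame built earlier holds the quantile function, which the two plots do not cover. One way to draw it, assuming ggplot2 is installed, is to plot the stored columns directly with geom_line(). The block rebuilds QF so that it runs on its own; p_qf is a name used only here.

```r
library(ggplot2)

# Probabilities from 0.01 to 0.99 and the matching standard normal quantiles
q <- seq(0.01, 0.99, 0.01)
QF <- data.frame(Q = q, Quantile = qnorm(q))

# Quantile function: probability on the x-axis, quantile on the y-axis
p_qf <- ggplot(QF, aes(x = Q, y = Quantile)) +
  geom_line()
print(p_qf)
```

Plotting from a data frame instead of stat_function() keeps the curve tied to the exact points that were computed.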

Other Distributions

In addition to the normal distribution, you can also make your analysis with other distributions, such as:

  • Binomial: dbinom(), pbinom(), qbinom(), rbinom()
  • Gamma: dgamma(), pgamma(), qgamma(), rgamma()
  • Poisson: dpois(), ppois(), qpois(), rpois()
  • Weibull: dweibull(), pweibull(), qweibull(), rweibull()
  • Uniform: dunif(), punif(), qunif(), runif()
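
The same four-prefix pattern applies to each of these families. The sketch below tries one call per prefix on binomial and uniform examples; the numbers (10 coin flips, an interval from 0 to 10) are made up for illustration.

```r
# Binomial: probability of exactly 3 heads in 10 fair coin flips,
# which equals choose(10, 3) / 2^10
p_three <- dbinom(3, size = 10, prob = 0.5)

# Binomial: probability of at most 3 heads
p_at_most_three <- pbinom(3, size = 10, prob = 0.5)

# Uniform on [0, 10]: the median is the midpoint of the interval
u_median <- qunif(0.5, min = 0, max = 10)

# Uniform on [0, 10]: four random draws, all inside the interval
set.seed(42)  # arbitrary seed for repeatability
u_draws <- runif(4, min = 0, max = 10)
```

Only the distribution-specific parameters change (size and prob, or min and max); the role of the first argument stays the same as for the normal functions.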

For example, here is a similar program for the Poisson distribution:

events <- 0:50
density <- dpois(x = events, lambda = 4)
prob <- ppois(q = events, lambda = 4, lower.tail = TRUE)
df <- data.frame(events, density, prob)
# Keep events numeric in both layers so the bars and the line align on the x-axis
ggplot(df, aes(x = events)) +
  geom_col(aes(y = density)) +
  geom_line(aes(y = prob))
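
Because the Poisson distribution is discrete, ppois() is the running total of dpois(). A quick check, reusing lambda = 4 from the example above (the variable names are illustrative):

```r
events <- 0:50
lambda <- 4

# Probability mass at each count and the cumulative probability
mass <- dpois(events, lambda = lambda)
cumulative <- ppois(events, lambda = lambda)

# Largest difference between the running sum of masses and ppois()
max_gap <- max(abs(cumsum(mass) - cumulative))

# Counts 0 to 50 cover essentially all of the probability for lambda = 4
total <- sum(mass)
```

Here max_gap should be zero up to floating-point rounding, and total should be very close to 1.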

Summary

You can almost always find the probability density function in R for the distribution you are using. Remember that there are four functions for each of them, and you should consult the documentation to know how those functions work.
