This tutorial provides a simple explanation of the difference between a PDF (probability density function) and a CDF (cumulative distribution function) in statistics.
Random Variables
Before we can define a PDF or a CDF, we first need to understand random variables.
A random variable, usually denoted as X, is a variable whose values are numerical outcomes of some random process. There are two types of random variables: discrete and continuous.
Discrete Random Variables
A discrete random variable is one which can take on only a countable number of distinct values, like 0, 1, 2, 3, 4, 5, …, 100, 1 million, etc. Some examples of discrete random variables include:
- The number of times a coin lands on tails after being flipped 20 times.
- The number of times a die lands on the number 4 after being rolled 100 times.
Continuous Random Variables
A continuous random variable is one which can take on any value within a range, so its possible values are infinitely many and cannot be counted one by one. Some examples of continuous random variables include:
- Height of a person
- Weight of an animal
- Time required to run a mile
For example, the height of a person could be 60.2 inches, 65.2344 inches, 70.431222 inches, etc. There are infinitely many possible values for height.
Rule of Thumb: If you can count the number of outcomes, then you are working with a discrete random variable (e.g. counting the number of times a coin lands on heads). But if you can measure the outcome, you are working with a continuous random variable (e.g. measuring height, weight, time, etc.).
Probability Density Functions
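A probability density function (pdf) describes how probability is distributed over the values a random variable can take. For a discrete random variable, the function gives the probability that the variable takes on a certain value, and it is more precisely called a probability mass function (pmf).
For example, suppose we roll a die one time. If we let x denote the number that the die lands on, then the probability function for the outcome can be described as follows: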
P(x < 1) : 0
P(x = 1) : 1/6
P(x = 2) : 1/6
P(x = 3) : 1/6
P(x = 4) : 1/6
P(x = 5) : 1/6
P(x = 6) : 1/6
P(x > 6) : 0
Note that this is an example of a discrete random variable, since x can only take on integer values.
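The table above can be reproduced with a few lines of Python. This is a minimal sketch that assumes a fair six-sided die; the function name `die_probability` is made up for illustration.

```python
from fractions import Fraction

# Possible faces of a fair six-sided die (assumption: every face is equally likely).
FACES = range(1, 7)

def die_probability(x):
    """Return P(X = x) for one roll of a fair six-sided die."""
    return Fraction(1, 6) if x in FACES else Fraction(0)

# Values outside 1..6 get probability 0, matching the first and last table rows.
for x in range(0, 8):
    print(f"P(X = {x}) = {die_probability(x)}")

# The probabilities over all possible faces add up to 1.
total = sum(die_probability(x) for x in FACES)
print("Total probability:", total)
```

Using `Fraction` keeps the values exact, so the total comes out as exactly 1 rather than a rounded decimal.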
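For a continuous random variable, we cannot read a probability directly off the pdf, since the probability that x takes on any exact value is zero. Instead, the pdf gives a density, and the probability that x falls within an interval is the area under the pdf over that interval.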
For example, suppose we want to know the probability that a burger from a particular restaurant weighs a quarter-pound (0.25 lbs). Since weight is a continuous variable, it can take on an infinite number of values.
A given burger might actually weigh 0.250001 pounds, or 0.24 pounds, or 0.2488 pounds. The probability that a given burger weighs exactly 0.25 pounds is essentially zero.
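To make the burger example concrete, the sketch below assumes burger weights follow a normal distribution with a mean of 0.25 lbs and a standard deviation of 0.01 lbs. Those numbers and the helper names are hypothetical and chosen only for illustration. It shows that the probability of landing in an interval around 0.25 lbs shrinks toward zero as the interval narrows.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Hypothetical burger-weight model (illustrative values, not from the tutorial).
MEAN_LB = 0.25
SD_LB = 0.01

def prob_between(a, b):
    """P(a <= X <= b): the area under the density between a and b."""
    return normal_cdf(b, MEAN_LB, SD_LB) - normal_cdf(a, MEAN_LB, SD_LB)

# Narrowing the interval around 0.25 lbs drives the probability toward zero.
for half_width in (0.01, 0.001, 0.0001, 0.0):
    low, high = 0.25 - half_width, 0.25 + half_width
    print(f"P({low:.4f} <= X <= {high:.4f}) = {prob_between(low, high):.6f}")
```

The last iteration uses an interval of width zero, which is the "exactly 0.25 pounds" case.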
Cumulative Distribution Functions
A cumulative distribution function (cdf) tells us the probability that a random variable takes on a value less than or equal to x.
For example, suppose we roll a die one time. If we let x denote the number that the die lands on, then the cumulative distribution function for the outcome can be described as follows:
P(x ≤ 0) : 0
P(x ≤ 1) : 1/6
P(x ≤ 2) : 2/6
P(x ≤ 3) : 3/6
P(x ≤ 4) : 4/6
P(x ≤ 5) : 5/6
P(x ≤ 6) : 6/6
P(x ≤ 7) : 6/6
Notice that the probability that x is less than or equal to 6 is 6/6, which is equal to 1. This is because the die will land on either 1, 2, 3, 4, 5, or 6 with 100% probability. For any value above 6, the cumulative probability stays at 1.
This example uses a discrete random variable, but a cumulative distribution function can also be used for a continuous random variable.
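For the die example, the cumulative probabilities are simply running totals of the individual probabilities. The following is a minimal sketch, again assuming a fair six-sided die. Note that `Fraction` prints values in lowest terms, so 2/6 appears as 1/3 and 6/6 appears as 1.

```python
from fractions import Fraction
from itertools import accumulate

faces = [1, 2, 3, 4, 5, 6]
probabilities = [Fraction(1, 6)] * 6   # P(X = x) for each face of a fair die

# A running total of the individual probabilities gives P(X <= x).
cumulative = list(accumulate(probabilities))

for face, p in zip(faces, cumulative):
    print(f"P(X <= {face}) = {p}")
```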
Cumulative distribution functions have the following properties:
- The probability that a random variable takes on a value less than the smallest possible value is zero. For example, the probability that a die lands on a value less than 1 is zero.
- The probability that a random variable takes on a value less than or equal to the largest possible value is one. For example, the probability that a die lands on a value of 1, 2, 3, 4, 5, or 6 is one. It must land on one of those numbers.
- The cdf is always non-decreasing. That is, the probability that a die lands on a number less than or equal to 1 is 1/6, the probability that it lands on a number less than or equal to 2 is 2/6, the probability that it lands on a number less than or equal to 3 is 3/6, etc. The cumulative probabilities never go down as x increases.
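These three properties can be checked numerically. The sketch below defines the die's cdf as a step function over any real x, assuming a fair six-sided die; `die_cdf` is an illustrative name rather than a library function.

```python
import math
from fractions import Fraction

def die_cdf(x):
    """P(X <= x) for one roll of a fair six-sided die, for any real x."""
    # Count how many faces are at or below x, clamped to the range 0..6.
    faces_at_or_below = min(max(math.floor(x), 0), 6)
    return Fraction(faces_at_or_below, 6)

# Property 1: below the smallest possible value, the cdf is 0.
print("cdf(0.99) =", die_cdf(0.99))

# Property 2: at or above the largest possible value, the cdf is 1.
print("cdf(6) =", die_cdf(6), "and cdf(100) =", die_cdf(100))

# Property 3: the cdf never decreases as x increases.
grid = [i / 4 for i in range(-8, 41)]   # -2.0, -1.75, ..., 10.0
values = [die_cdf(x) for x in grid]
print("non-decreasing:", all(a <= b for a, b in zip(values, values[1:])))
```

Between whole numbers the function stays flat, which is why the cdf of a discrete random variable looks like a staircase.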
Related: You can use an ogive graph to visualize a cumulative distribution function.
The Relationship Between a CDF and a PDF
In technical terms, for a continuous random variable, a probability density function (pdf) is the derivative of a cumulative distribution function (cdf).
Furthermore, the area under the curve of a pdf between negative infinity and x is equal to the value of the cdf at x.
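Both statements can be checked numerically. The sketch below uses the standard normal distribution as an assumed example; the helper names, the step size, and the lower integration limit of -10 (standing in for negative infinity) are illustrative choices.

```python
import math

def pdf(x):
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cdf_slope(x, h=1e-5):
    """Central-difference estimate of the derivative of the cdf at x."""
    return (cdf(x + h) - cdf(x - h)) / (2.0 * h)

def pdf_area(x, lower=-10.0, steps=20_000):
    """Midpoint-rule estimate of the area under the pdf from `lower` to x.

    `lower` stands in for negative infinity; the density below -10 is negligible.
    """
    width = (x - lower) / steps
    return width * sum(pdf(lower + (i + 0.5) * width) for i in range(steps))

x = 1.0
print(f"pdf at x            = {pdf(x):.6f}")
print(f"slope of cdf at x   = {cdf_slope(x):.6f}")
print(f"cdf at x            = {cdf(x):.6f}")
print(f"area under pdf to x = {pdf_area(x):.6f}")
```

The first two printed values should agree (the pdf is the slope of the cdf), and so should the last two (the cdf is the accumulated area under the pdf), up to small numerical error.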
For an in-depth explanation of the relationship between a pdf and a cdf, along with the proof for why the pdf is the derivative of the cdf, refer to a statistical textbook.