Probability Distribution Functions (PMF, PDF, CDF)
Introduction
This tutorial provides a comprehensive overview of Probability Distribution Functions, including Probability Mass Function (PMF), Probability Density Function (PDF), and Cumulative Distribution Function (CDF). Understanding these concepts is essential for statistical analysis and data interpretation in various fields, including economics, engineering, and social sciences.
Step 1: Understanding Terminology
Before diving into the specific functions, it’s important to grasp some key terms:
- Discrete Variable: A variable that can take on a countable number of distinct values, such as the result of a die roll.
- Continuous Variable: A variable that can take on any value within a given range, so its possible values cannot be counted, such as a height or a duration.
Step 2: Exploring Probability Mass Function (PMF)
The PMF is used for discrete random variables. It provides the probability that a discrete random variable is exactly equal to some value.
Key Points:
- The PMF must satisfy two conditions:
  - The sum of probabilities for all possible outcomes must equal 1.
  - Each probability must be between 0 and 1 inclusive.
Example:
If you have a six-sided die, the PMF can be represented as:
- P(X = 1) = 1/6
- P(X = 2) = 1/6
- P(X = 3) = 1/6
- P(X = 4) = 1/6
- P(X = 5) = 1/6
- P(X = 6) = 1/6
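The die example and the two PMF conditions can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the names `die_pmf` and `is_valid_pmf` are hypothetical, and exact fractions are used so that the sum-to-1 check is not affected by floating-point rounding.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face maps to its probability.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def is_valid_pmf(pmf):
    """Check the two PMF conditions: every value in [0, 1] and a total of 1."""
    in_range = all(0 <= p <= 1 for p in pmf.values())
    sums_to_one = sum(pmf.values()) == 1
    return in_range and sums_to_one

print(die_pmf[3])             # 1/6
print(is_valid_pmf(die_pmf))  # True
```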
Step 3: Understanding Cumulative Distribution Function (CDF) - Discrete
The CDF of a discrete random variable gives the probability that the variable takes on a value less than or equal to a specific value.
How to Compute CDF:
- Identify all possible values of the discrete random variable.
- Calculate the PMF for each value.
- For each value x, add up the PMF values of all outcomes less than or equal to x.
Example:
For a die:
- CDF(1) = P(X ≤ 1) = 1/6
- CDF(2) = P(X ≤ 2) = 1/6 + 1/6 = 2/6
- CDF(3) = P(X ≤ 3) = 3/6
- And so on, up to CDF(6) = P(X ≤ 6) = 6/6 = 1
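The three steps above amount to a running sum over the PMF. As a rough sketch, assuming a fair die and using the hypothetical names `faces`, `pmf_values`, and `die_cdf`:

```python
from fractions import Fraction
from itertools import accumulate

# Step 1: possible values of the die.
faces = list(range(1, 7))
# Step 2: PMF value for each face.
pmf_values = [Fraction(1, 6)] * 6
# Step 3: running sum of the PMF values gives the CDF at each face.
cdf_values = list(accumulate(pmf_values))
die_cdf = dict(zip(faces, cdf_values))

for face, prob in die_cdf.items():
    print(f"CDF({face}) = {prob}")
```

Python's `Fraction` reduces results automatically, so CDF(2) is shown as 1/3 rather than 2/6, and the last value, CDF(6), is 1.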
Step 4: Exploring Probability Density Function (PDF)
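The PDF is used for continuous random variables. It describes the relative likelihood of the variable falling near a particular value. The value of the PDF at a point is a density, not a probability: a continuous random variable takes any single exact value with probability 0, so probabilities come only from integrating the PDF over an interval.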
Key Points:
- The area under the PDF curve over an interval represents the probability of the variable falling within that interval.
- The total area under the curve equals 1.
Example:
For a normal distribution, the PDF can be expressed with the formula:
f(x) = (1 / (σ√(2π))) * e^(-0.5 * ((x - μ)/σ)²)
where:
- μ = mean
- σ = standard deviation
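To make the formula and the two key points concrete, here is a small sketch using only the standard library. The function names `normal_pdf` and `area_under_pdf` are hypothetical, and the area is approximated with a simple midpoint rule, which is just one of several reasonable choices.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_under_pdf(a, b, mu=0.0, sigma=1.0, steps=10_000):
    """Approximate the area under the PDF on [a, b] with the midpoint rule."""
    width = (b - a) / steps
    total = sum(normal_pdf(a + (i + 0.5) * width, mu, sigma) for i in range(steps))
    return total * width

print(normal_pdf(0.0))            # about 0.3989 (a density, not a probability)
print(area_under_pdf(-1.0, 1.0))  # about 0.6827, i.e. P(-1 <= X <= 1)
print(area_under_pdf(-8.0, 8.0))  # very close to 1, the total area
```

The interval from -8 to 8 stands in for the whole real line here, since the standard normal density is negligible outside it.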
Step 5: Understanding Cumulative Distribution Function (CDF) - Continuous
The CDF for continuous variables provides the probability that the variable will take a value less than or equal to x.
How to Compute CDF:
- Integrate the PDF from negative infinity to x.
- The result gives the probability that the random variable is less than or equal to x.
Example:
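For a normal distribution with mean μ and standard deviation σ, the CDF is the integral of the PDF f(t) given in Step 4. This integral has no closed form in elementary functions, so it is usually evaluated numerically: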
F(x) = ∫(from -∞ to x) f(t) dt
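One common way to evaluate this integral in practice is through the error function, using the identity F(x) = 0.5 * (1 + erf((x - μ) / (σ√2))). The sketch below relies on `math.erf` from Python's standard library; the function name `normal_cdf` is a hypothetical choice for illustration.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution, computed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

print(normal_cdf(0.0))                     # 0.5, half the area lies below the mean
print(normal_cdf(1.96))                    # about 0.975
print(normal_cdf(1.0) - normal_cdf(-1.0))  # about 0.6827, i.e. P(-1 <= X <= 1)
```

The last line shows how a CDF turns interval probabilities into a subtraction: P(a ≤ X ≤ b) = F(b) - F(a).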
Conclusion
In this tutorial, we covered the fundamental concepts of PMF, PDF, and CDF, including their definitions, computations, and examples. Understanding these functions is crucial for analyzing data and making informed decisions based on probability. For further practice, consider applying these concepts to real-world data sets or exploring advanced topics like joint distributions and their applications in statistical modeling.