Basic Terminology
- Proposition: A statement or claim that is definitively either true or false.
- Probability Distribution: A function that describes the probability of a random variable taking a certain value or falling within a certain range.
- Event: An outcome or set of outcomes of a probabilistic trial.
- Random Variable: A variable whose value is determined probabilistically by the outcome of a trial.
Basic Laws of Probability
Joint Probability
The probability that multiple events occur simultaneously.
$$ p(x, y) $$ $p(x, y)$ represents the probability that $x$ and $y$ occur together. The order does not matter ($p(x, y) = p(y, x)$).
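A minimal sketch in Python: a joint table over two binary variables, with values invented purely for illustration.

```python
# Hypothetical joint distribution of two binary variables x and y,
# stored as a table indexed by (x, y). The values are made up.
p_joint = {
    (0, 0): 0.1, (0, 1): 0.3,
    (1, 0): 0.2, (1, 1): 0.4,
}

print(p_joint[(0, 1)])        # p(x=0, y=1) = 0.3
print(sum(p_joint.values()))  # all joint probabilities sum to 1.0
```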
Marginal Probability
When there are multiple random variables, the marginal probability focuses on a specific variable only. It is obtained by marginalizing (summing or integrating) over the unwanted variables from the joint probability.
$$ p(x) = \sum_y p(x, y) \quad (\text{for discrete variables}) $$ $$ p(x) = \int p(x, y) dy \quad (\text{for continuous variables}) $$
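Continuing with the made-up table above, marginalizing is just summing out the unwanted variable:

```python
# Marginalize the joint table over y to obtain p(x).
p_joint = {
    (0, 0): 0.1, (0, 1): 0.3,
    (1, 0): 0.2, (1, 1): 0.4,
}

p_x = {}
for (x, y), p in p_joint.items():
    p_x[x] = p_x.get(x, 0.0) + p  # p(x) = sum over y of p(x, y)

print(p_x)  # {0: 0.4, 1: 0.6}
```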
Conditional Probability
The probability of one event occurring given that another event has occurred.
$$ p(y|x) = \frac{p(x, y)}{p(x)} $$ This represents “the probability of $y$ occurring given that $x$ has occurred.”
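Using the same made-up table, a conditional probability is the joint entry divided by the marginal:

```python
# p(y | x) = p(x, y) / p(x), with the joint table and marginals from above.
p_joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}
p_x = {0: 0.4, 1: 0.6}

p_y1_given_x0 = p_joint[(0, 1)] / p_x[0]  # p(y=1 | x=0)
print(p_y1_given_x0)  # 0.75
```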
Independence of Random Variables
Independence
When two random variables $X$ and $Y$ do not influence each other, they are said to be independent. When independent, the joint probability equals the product of the marginal probabilities.
$$ p(x, y) = p(x)p(y) $$ When they are not independent, they are said to be dependent.
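A sketch of the independence check: build a joint table that factorizes by construction and confirm that every entry equals the product of the marginals (the marginal values here are arbitrary).

```python
import itertools

# Marginals chosen arbitrarily; the joint is independent by construction.
p_x = {0: 0.4, 1: 0.6}
p_y = {0: 0.3, 1: 0.7}
p_joint = {(x, y): p_x[x] * p_y[y] for x, y in itertools.product(p_x, p_y)}

# Independence holds iff p(x, y) = p(x) p(y) for every (x, y).
independent = all(
    abs(p_joint[(x, y)] - p_x[x] * p_y[y]) < 1e-12 for x, y in p_joint
)
print(independent)  # True
```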
Conditional Independence
This refers to the case where two random variables $X$ and $Y$ are independent given another random variable $Z$.
$$ p(x, y | z) = p(x | z) p(y | z) $$ This means that “given $Z=z$, $X$ and $Y$ are independent.”
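The same kind of check works for conditional independence. Here a three-variable distribution is built so that $p(x, y, z) = p(z)\,p(x|z)\,p(y|z)$ (all numbers invented), and the factorization of $p(x, y | z)$ is verified:

```python
import itertools

# p(x, y, z) = p(z) * p(x|z) * p(y|z): conditionally independent by construction.
p_z = {0: 0.5, 1: 0.5}
p_x_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p_x_given_z[z][x]
p_y_given_z = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.4, 1: 0.6}}  # p_y_given_z[z][y]

p_xyz = {
    (x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
    for x, y, z in itertools.product((0, 1), repeat=3)
}

# Check p(x, y | z) = p(x|z) * p(y|z) for every combination.
ok = all(
    abs(p_xyz[(x, y, z)] / p_z[z] - p_x_given_z[z][x] * p_y_given_z[z][y]) < 1e-12
    for x, y, z in p_xyz
)
print(ok)  # True
```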
Distributions of Continuous Variables
For random variables that take continuous values (e.g., height, time), the probability of taking any single exact value is zero. Probabilities are therefore handled through cumulative distribution functions and probability density functions.
Cumulative Distribution Function (CDF)
A function that represents the probability that a random variable $X$ takes a value less than or equal to $x$.
$$ F(x) = P(X \le x) $$
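A quick illustration with SciPy (assuming `scipy` is available), using the standard normal distribution:

```python
from scipy.stats import norm

# F(x) = P(X <= x) for a standard normal random variable.
print(norm.cdf(0.0))   # 0.5: half the probability mass lies below the mean
print(norm.cdf(1.96))  # ~0.975

# Interval probabilities follow from differences of the CDF:
print(norm.cdf(1.0) - norm.cdf(-1.0))  # P(-1 < X <= 1) ~ 0.683
```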
Probability Density Function (PDF)
A function that describes the probability distribution of a continuous random variable. The PDF itself is not a probability, but integrating the PDF over an interval gives the probability of the random variable falling within that interval.
$$ P(a \le X \le b) = \int_a^b f(x) dx $$ The PDF $f(x)$ is the derivative of the cumulative distribution function $F(x)$, i.e., $f(x) = \frac{d}{dx}F(x)$.
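A sketch that checks this numerically for the standard normal (again assuming SciPy): integrating the PDF over an interval reproduces the corresponding CDF difference.

```python
from scipy.integrate import quad
from scipy.stats import norm

# The density norm.pdf(x) is not itself a probability; its integral is.
area, _ = quad(norm.pdf, -1.0, 1.0)    # P(-1 <= X <= 1) by integration
print(area)                            # ~0.683
print(norm.cdf(1.0) - norm.cdf(-1.0))  # matches F(1) - F(-1)
```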
Various Probability Distributions
Exponential Distribution
A distribution that models the waiting time until the next event in a process where events occur at random. It is characterized by the memoryless property: the distribution of the remaining waiting time does not depend on how much time has already elapsed.
$$ p(x | \lambda) = \lambda \exp(-\lambda x) \quad (x \ge 0) $$ Here, $\lambda > 0$ is the rate parameter.
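A minimal numerical check of memorylessness, with an arbitrarily chosen rate:

```python
import math

lam = 0.5  # rate parameter, chosen arbitrarily for the example

def survival(x):
    """P(X > x) = exp(-lam * x) for the exponential distribution."""
    return math.exp(-lam * x)

# Memorylessness: P(X > s + t | X > s) = P(X > t).
s, t = 2.0, 3.0
print(survival(s + t) / survival(s))  # conditional survival probability
print(survival(t))                    # identical value
```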
Laplace Distribution
A distribution that decreases exponentially around the mean. It has a sharper peak and heavier tails (fat tails) compared to the normal distribution.
$$ p(x | \mu, b) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right) $$ Here, $\mu$ is the location parameter and $b > 0$ is the scale parameter.
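A small sketch comparing the two densities at matched variance (Laplace with $b = 1$ has variance $2b^2 = 2$, so it is paired with a normal of variance 2; the evaluation points are arbitrary):

```python
import math

def laplace_pdf(x, mu=0.0, b=1.0):
    # (1 / 2b) * exp(-|x - mu| / b)
    return math.exp(-abs(x - mu) / b) / (2.0 * b)

def normal_pdf(x, mu=0.0, var=2.0):
    # Normal density with variance matched to Laplace(0, 1).
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Sharper peak at the center, heavier tails far from it:
for x in (0.0, 5.0):
    print(x, laplace_pdf(x), normal_pdf(x))
# x = 0: Laplace 0.5     vs normal ~0.282  (sharper peak)
# x = 5: Laplace ~3.4e-3 vs normal ~5.4e-4 (heavier tail)
```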
Expectation
When a random variable $X$ follows a probability distribution $p(x)$, the expectation of a function $f(X)$ represents the “average value” of that function.
$$ \mathbb{E}[f(X)] = \sum_x f(x)p(x) \quad (\text{for discrete variables}) $$ $$ \mathbb{E}[f(X)] = \int f(x)p(x)dx \quad (\text{for continuous variables}) $$
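Both cases in one short sketch (the discrete distribution is made up; the continuous case uses SciPy's standard normal):

```python
from scipy.integrate import quad
from scipy.stats import norm

def f(x):
    return x ** 2

# Discrete: E[f(X)] = sum over x of f(x) p(x).
p = {0: 0.2, 1: 0.5, 2: 0.3}
print(sum(f(x) * px for x, px in p.items()))  # 0*0.2 + 1*0.5 + 4*0.3 = 1.7

# Continuous: E[X^2] under a standard normal, by numerical integration.
val, _ = quad(lambda x: f(x) * norm.pdf(x), -10.0, 10.0)
print(val)  # ~1.0 (the variance of the standard normal)
```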