Law of total variance
The law of total variance is a fundamental result in probability theory that expresses the variance of a random variable Y in terms of its conditional variances and conditional means given another random variable X. Informally, it states that the overall variability of Y can be split into an “unexplained” component (the average of within-group variances) and an “explained” component (the variance of group means).

Formally, if X and Y are random variables on the same probability space, and Y has finite variance, then:

Var(Y) = E[Var(Y | X)] + Var(E[Y | X]).

This identity is also known as the variance decomposition formula, the conditional variance formula, the law of iterated variances, or colloquially as Eve’s law,[1] in parallel to the “Adam’s law” naming for the law of total expectation.

In actuarial science (particularly in credibility theory), the two terms E[Var(Y | X)] and Var(E[Y | X]) are called the expected value of the process variance (EVPV) and the variance of the hypothetical means (VHM) respectively.[2]

Explanation

Let Y be a random variable and X another random variable on the same probability space. The law of total variance can be understood by noting:

  1. Var(Y | X) measures how much Y varies around its conditional mean E[Y | X].
  2. Taking the expectation of this conditional variance across all values of X gives E[Var(Y | X)], often termed the “unexplained” or within-group part.
  3. The variance of the conditional mean, Var(E[Y | X]), measures how much these conditional means differ (i.e. the “explained” or between-group part).

Adding these components yields the total variance Var(Y), mirroring how analysis of variance partitions variation.
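This additive split can be checked numerically. Below is a minimal sketch; the three-group dataset is synthetic, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 3, size=100_000)             # grouping variable X with 3 levels
y = 2.0 * x + rng.normal(0.0, 1.0, size=x.size)  # Y depends on X plus noise

groups  = np.unique(x)
weights = np.array([(x == g).mean() for g in groups])    # P(X = g)
means   = np.array([y[x == g].mean() for g in groups])   # E[Y | X = g]
varis   = np.array([y[x == g].var()  for g in groups])   # Var(Y | X = g)

unexplained = (weights * varis).sum()                     # E[Var(Y | X)]
explained   = (weights * (means - y.mean()) ** 2).sum()   # Var(E[Y | X])
total       = y.var()
# total equals unexplained + explained, up to floating point
```

With population-style (ddof=0) variances, the decomposition holds exactly on the sample, not just in expectation.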

Examples

Example 1 (Exam Scores)

Suppose five students take an exam scored 0–100. Let Y = student’s score and X indicate whether the student is international or domestic:

Student  Y (Score)  X (Group)
1 20 International
2 30 International
3 100 International
4 40 Domestic
5 60 Domestic
  • Mean and variance for international: E[Y | X=Intl] = 50, Var(Y | X=Intl) ≈ 1266.7.
  • Mean and variance for domestic: E[Y | X=Dom] = 50, Var(Y | X=Dom) = 100.

Both groups share the same mean (50), so the explained variance Var(E[Y | X]) is 0, and the total variance equals the average of the within-group variances weighted by group size: (3/5) · 1266.7 + (2/5) · 100 = 800.
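The arithmetic of this example can be reproduced directly; a short sketch using the five scores above:

```python
scores = [20, 30, 100, 40, 60]
groups = ["Intl", "Intl", "Intl", "Dom", "Dom"]

def pvar(xs):                        # population variance
    m = sum(xs) / len(xs)
    return sum((v - m) ** 2 for v in xs) / len(xs)

n = len(scores)
by = {}
for s, g in zip(scores, groups):
    by.setdefault(g, []).append(s)

total   = pvar(scores)                                      # 800.0
within  = sum(len(v) / n * pvar(v) for v in by.values())    # E[Var(Y | X)]
overall = sum(scores) / n
between = sum(len(v) / n * (sum(v) / len(v) - overall) ** 2
              for v in by.values())                         # Var(E[Y | X]) = 0
```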

Example 2 (Mixture of Two Gaussians)

Let X be a coin flip taking the value Heads with probability h and Tails with probability 1 − h. Given Heads, Y ~ Normal(μh, σh²); given Tails, Y ~ Normal(μt, σt²). Then E[Var(Y | X)] = h σh² + (1 − h) σt² and Var(E[Y | X]) = h(1 − h)(μh − μt)², so Var(Y) = h σh² + (1 − h) σt² + h(1 − h)(μh − μt)².
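A quick Monte Carlo check of this mixture formula; the parameter values h, μh, σh, μt, σt below are hypothetical:

```python
import numpy as np

# hypothetical mixture parameters
h, mu_h, sig_h, mu_t, sig_t = 0.3, 2.0, 1.0, -1.0, 0.5

# closed form from the law of total variance
var_closed = h * sig_h**2 + (1 - h) * sig_t**2 + h * (1 - h) * (mu_h - mu_t) ** 2

rng = np.random.default_rng(1)
n = 1_000_000
heads = rng.random(n) < h
y = np.where(heads,
             rng.normal(mu_h, sig_h, n),
             rng.normal(mu_t, sig_t, n))
# y.var() should be close to var_closed
```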

Example 3 (Dice and Coins)

Consider a two-stage experiment:

  1. Roll a fair die (values 1–6) to choose one of six biased coins.
  2. Flip that chosen coin; let Y = 1 if Heads, 0 if Tails.

Then E[Y | X=i] = pi and Var(Y | X=i) = pi(1 − pi). The overall variance of Y becomes Var(Y) = E[pX(1 − pX)] + Var(pX), with pX uniform on {p1, …, p6}.
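With hypothetical biases p1, …, p6, exact arithmetic confirms the identity (rational arithmetic avoids any rounding):

```python
from fractions import Fraction as F

p = [F(i, 10) for i in range(1, 7)]   # hypothetical biases p_1, ..., p_6
w = F(1, 6)                           # each coin chosen with probability 1/6

evpv   = sum(w * pi * (1 - pi) for pi in p)        # E[p_X (1 - p_X)]
mean_p = sum(w * pi for pi in p)
vhm    = sum(w * (pi - mean_p) ** 2 for pi in p)   # Var(p_X)

# unconditionally, Y is Bernoulli with success probability E[p_X]
var_y = mean_p * (1 - mean_p)
assert evpv + vhm == var_y            # exact equality in rational arithmetic
```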

Proof

Discrete/Finite Proof

Let (Xi, Yi), i = 1, …, n, be observed pairs. Write Ȳ for the overall mean and ȲXi = E[Y | X = Xi] for the mean of the group containing observation i. Then

Var(Y) = (1/n) Σi (Yi − Ȳ)² = (1/n) Σi [(Yi − ȲXi) + (ȲXi − Ȳ)]².

Expanding the square, the cross term vanishes in the summation because the deviations Yi − ȲXi sum to zero within each group, leaving

Var(Y) = E[Var(Y | X)] + Var(E[Y | X]).

General Case

Using Var(Y) = E[Y²] − E[Y]² and the law of total expectation:

E[Y²] = E[E(Y² | X)] = E[Var(Y | X) + E[Y | X]²].

Subtracting E[Y]² = (E[E(Y | X)])² and noting that E[E[Y | X]²] − (E[E[Y | X]])² = Var(E[Y | X]) yields

Var(Y) = E[Var(Y | X)] + Var(E[Y | X]).
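The general identity can be verified directly on a small joint distribution; the pmf below is made up for illustration:

```python
# hypothetical joint pmf over (x, y)
pmf = {(0, 1): 0.2, (0, 3): 0.1, (1, 2): 0.4, (1, 5): 0.3}

px  = {0: 0.3, 1: 0.7}                            # marginal of X
Ey  = sum(p * y for (x, y), p in pmf.items())     # E[Y]
Ey2 = sum(p * y * y for (x, y), p in pmf.items()) # E[Y^2]
var_y = Ey2 - Ey ** 2

def cond(f, x0):
    """Conditional expectation E[f(Y) | X = x0]."""
    return sum(p * f(y) for (x, y), p in pmf.items() if x == x0) / px[x0]

within  = sum(px[x0] * (cond(lambda y: y * y, x0) - cond(lambda y: y, x0) ** 2)
              for x0 in px)                       # E[Var(Y | X)]
between = (sum(px[x0] * cond(lambda y: y, x0) ** 2 for x0 in px)
           - Ey ** 2)                             # Var(E[Y | X])
# var_y equals within + between
```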

Applications

Analysis of Variance (ANOVA)

In a one-way analysis of variance, the total sum of squares (proportional to Var(Y)) is split into a “between-group” sum of squares (Var(E[Y | X])) plus a “within-group” sum of squares (E[Var(Y | X)]). The F-test examines whether the explained component is sufficiently large to indicate that X has a significant effect on Y.[3]

Regression and R²

In linear regression and related models, if Ŷ = E[Y | X], the fraction of variance explained is

R² = Var(Ŷ)/Var(Y) = Var(E[Y | X])/Var(Y) = 1 − E[Var(Y | X)]/Var(Y).

In the simple linear case (one predictor), R² also equals the square of the Pearson correlation coefficient between X and Y.
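A sketch of this equality for simple linear regression; the data are synthetic, with an arbitrary slope and unit noise variance:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50_000)
y = 1.5 * x + rng.normal(size=x.size)     # true slope 1.5, unit noise variance

slope, intercept = np.polyfit(x, y, 1)    # ordinary least squares fit
y_hat = slope * x + intercept             # fitted values, an estimate of E[Y | X]

r2  = y_hat.var() / y.var()               # Var(Y_hat) / Var(Y)
rho = np.corrcoef(x, y)[0, 1]             # Pearson correlation
# r2 equals rho**2 for simple linear regression, up to floating point
```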

Machine Learning and Bayesian Inference

In many Bayesian and ensemble methods, one decomposes prediction uncertainty via the law of total variance. For a Bayesian neural network with random parameters θ:

Var(Y) = E[Var(Y | θ)] + Var(E[Y | θ]),

where the first term is often referred to as “aleatoric” (within-model) and the second as “epistemic” (between-model) uncertainty.[4]
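A toy version of this decomposition for a finite ensemble of predictive models; the member means and noise variances below are invented:

```python
import numpy as np

# hypothetical ensemble: member i predicts mean mu[i] with noise variance s2[i]
mu = np.array([1.0, 1.2, 0.8, 1.1])      # E[Y | theta_i]
s2 = np.array([0.30, 0.25, 0.35, 0.28])  # Var(Y | theta_i)

aleatoric = s2.mean()                    # E[Var(Y | theta)], within-model
epistemic = mu.var()                     # Var(E[Y | theta]), between-model
total     = aleatoric + epistemic        # Var(Y) under uniform weight on members
```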

Actuarial Science

Credibility theory uses the same partitioning: the expected value of process variance (EVPV), E[Var(Y | X)], and the variance of hypothetical means (VHM), Var(E[Y | X]). The ratio of explained to total variance determines how much “credibility” to give to individual risk classifications.[2]
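In the standard Bühlmann credibility model, the credibility factor is Z = n/(n + k) with k = EVPV/VHM; a sketch with hypothetical EVPV and VHM values:

```python
# Buhlmann credibility sketch: Z = n / (n + k), where k = EVPV / VHM
evpv = 800.0    # hypothetical expected value of the process variance
vhm  = 200.0    # hypothetical variance of the hypothetical means
n    = 12       # hypothetical years of experience for one risk

k = evpv / vhm          # ratio of unexplained to explained variance
z = n / (n + k)         # credibility given to the individual experience
```

Larger VHM relative to EVPV (more variance explained by the risk class) drives Z toward 1.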

Information Theory

For jointly Gaussian (X, Y), the fraction Var(E[Y | X])/Var(Y) relates directly to the mutual information I(Y; X).[5] In non-Gaussian settings, a high explained-variance ratio still indicates that X carries significant information about Y.
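In the jointly Gaussian case, E[Y | X] is linear in X, so the explained-variance ratio equals ρ² and the mutual information is −½ ln(1 − ρ²) nats. A sketch with hypothetical parameter values:

```python
import math
import numpy as np

rho, sig_x, sig_y = 0.6, 1.0, 2.0        # hypothetical correlation and scales
rng = np.random.default_rng(4)
n = 500_000
x = rng.normal(0.0, sig_x, n)
y = rho * (sig_y / sig_x) * x + rng.normal(0.0, sig_y * math.sqrt(1 - rho**2), n)

# E[Y | X] = rho * (sig_y / sig_x) * X, so Var(E[Y | X]) / Var(Y) = rho**2
explained_ratio = (rho * (sig_y / sig_x) * x).var() / y.var()

# mutual information of a bivariate Gaussian, in nats
mi_nats = -0.5 * math.log(1 - rho**2)
```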

Generalizations

The law of total variance generalizes to multiple or nested conditionings. For example, with two conditioning variables X1 and X2:

Var(Y) = E[Var(Y | X1, X2)] + E[Var(E[Y | X1, X2] | X1)] + Var(E[Y | X1]).

More generally, the law of total cumulance extends this approach to higher moments.
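The three-term decomposition can be checked empirically; the data below are synthetic, with two discrete conditioning variables:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400_000
x1 = rng.integers(0, 2, n)
x2 = rng.integers(0, 3, n)
y = 2.0 * x1 + 0.5 * x2 + rng.normal(0.0, 1.0, n)

def cond_mean(v, key):
    """Replace each entry of v by the mean of its key-group."""
    out = np.empty_like(v)
    for k in np.unique(key):
        out[key == k] = v[key == k].mean()
    return out

def cond_var(v, key):
    """Replace each entry of v by the variance of its key-group."""
    out = np.empty_like(v)
    for k in np.unique(key):
        out[key == k] = v[key == k].var()
    return out

pair = x1 * 3 + x2                # encodes the pair (X1, X2) as one key
m12 = cond_mean(y, pair)          # E[Y | X1, X2]
t1 = cond_var(y, pair).mean()     # E[Var(Y | X1, X2)]
t2 = cond_var(m12, x1).mean()     # E[Var(E[Y | X1, X2] | X1)]
t3 = cond_mean(y, x1).var()       # Var(E[Y | X1])
# y.var() equals t1 + t2 + t3, up to floating point
```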

References


  1. Joe Blitzstein and Jessica Hwang, Introduction to Probability, Final Review Notes.
  2. Template:Cite book
  3. Analysis of variance — R.A. Fisher’s 1920s development.
  4. See for instance AWS ML quantifying uncertainty guidance.
  5. C. G. Bowsher & P. S. Swain (2012). "Identifying sources of variation and the flow of information in biochemical networks," PNAS 109 (20): E1320–E1328.