In probability theory, several laws of large
numbers say that the average of a sequence of random variables with a common distribution converges (in the senses given below) to their common expectation, in the limit as the size of the sequence goes to infinity.
Various formulations of the law of large numbers, and their associated conditions, specify convergence in different ways.
In a statistical context, laws of large numbers imply that the average of a random sample from a large population is likely to be close to the mean of the whole population.
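As a rough illustration of the statistical reading above, the following sketch draws simple random samples of increasing size from a made-up population and reports how far each sample mean lands from the population mean. The population (values generated once from an exponential distribution), the function name `sample_mean_errors`, and the seeds are assumptions chosen for the example, not part of the original text.

```python
import random
import statistics


def sample_mean_errors(population, sample_sizes, seed=0):
    """Return {n: |sample mean - population mean|} for simple random samples."""
    rng = random.Random(seed)
    pop_mean = statistics.fmean(population)
    errors = {}
    for n in sample_sizes:
        # Simple random sample of size n, drawn without replacement.
        sample = rng.sample(population, n)
        errors[n] = abs(statistics.fmean(sample) - pop_mean)
    return errors


# Hypothetical population: 200,000 values generated once and then held fixed.
_population_rng = random.Random(42)
population = [_population_rng.expovariate(1.0) for _ in range(200_000)]

if __name__ == "__main__":
    for n, err in sample_mean_errors(population, [10, 1_000, 100_000]).items():
        print(f"n = {n:>7}: |sample mean - population mean| = {err:.4f}")
```

The error for any single sample is random, so it need not shrink monotonically with n, but large errors become increasingly unlikely as the sample grows.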
When the random variables have a finite variance, the central limit theorem extends our understanding of the convergence of their average by
describing the distribution of the standardised difference between the sum of the random variables and the expectation of this sum.
Regardless of the underlying distribution of the random variables, this standardised difference converges in distribution to a standard Normal random
variable.
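The standardised difference described above can be simulated directly. The sketch below assumes Exponential(1) summands (mean 1, standard deviation 1), a deliberately skewed choice, and checks what fraction of standardised sums falls within ±1.96, which should be near 0.95 if the standard Normal approximation holds. The function names and parameter values are illustrative assumptions.

```python
import math
import random


def standardised_sums(n, trials, seed=0):
    """Simulate (S_n - n*mu) / (sigma * sqrt(n)) for sums of n Exponential(1) draws."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(1)
    results = []
    for _ in range(trials):
        s_n = sum(rng.expovariate(1.0) for _ in range(n))
        results.append((s_n - n * mu) / (sigma * math.sqrt(n)))
    return results


def fraction_within(values, z=1.96):
    """Fraction of values whose absolute value is at most z."""
    return sum(abs(v) <= z for v in values) / len(values)


if __name__ == "__main__":
    z_values = standardised_sums(n=200, trials=4_000)
    print(f"fraction within +/-1.96: {fraction_within(z_values):.3f}")
```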
The weak law
The weak law of large numbers states that if $X_1, X_2, X_3, \ldots$ is an infinite sequence of random variables, all of which have the same expected value μ and the same finite variance $\sigma^2$, and they are uncorrelated (i.e., the correlation between any two of them is zero), then the sample average
$$\overline{X}_n = \frac{X_1 + \cdots + X_n}{n}$$
converges in probability to μ. Somewhat less tersely: for any positive number ε, no matter how small, we have
$$\lim_{n \to \infty} P\left( \left| \overline{X}_n - \mu \right| < \varepsilon \right) = 1.$$
Chebyshev's inequality is used to prove this
result.
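The Chebyshev argument can be sketched in a few lines. Write $\overline{X}_n = (X_1 + \cdots + X_n)/n$ for the sample average and use only the stated assumptions (common mean μ, common finite variance $\sigma^2$, zero correlation between distinct terms):

```latex
\begin{align*}
E\left[\overline{X}_n\right] &= \frac{1}{n} \sum_{i=1}^{n} E[X_i] = \mu, \\
\operatorname{Var}\left(\overline{X}_n\right)
  &= \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}(X_i)
   = \frac{\sigma^2}{n}
   \quad \text{(the covariance terms vanish because the $X_i$ are uncorrelated)}, \\
P\left( \left| \overline{X}_n - \mu \right| \geq \varepsilon \right)
  &\leq \frac{\operatorname{Var}\left(\overline{X}_n\right)}{\varepsilon^2}
   = \frac{\sigma^2}{n \varepsilon^2}
   \longrightarrow 0 \quad \text{as } n \to \infty .
\end{align*}
```

The last line is Chebyshev's inequality applied to $\overline{X}_n$; taking complements gives convergence in probability to μ.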
A consequence of the weak law of large numbers is the asymptotic equipartition property.
The strong law
The strong law of large numbers states that if $X_1, X_2, X_3, \ldots$ is an infinite sequence of random variables that are independent and identically distributed with common expected value μ, and if $E(|X_1|) < \infty$, then
$$P\left( \lim_{n \to \infty} \frac{X_1 + \cdots + X_n}{n} = \mu \right) = 1,$$
i.e., the sample average converges almost surely to μ.
If we replace the finite expectation condition with a finite second moment condition,
$E(X_1^2) < \infty$, then we obtain both almost sure convergence and convergence in mean square. In either case,
these conditions also imply the consequent of the weak law of large numbers, since almost sure convergence implies convergence in
probability (as, indeed, does convergence in mean square).
This law justifies the intuitive interpretation of the expected value of a random variable as the "long-term average when
sampling repeatedly".
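The "long-term average" reading can be illustrated by following the running average along one simulated sample path. The sketch below assumes fair six-sided die rolls (expected value 3.5). The function name and seed are illustrative, and a finite simulation can only suggest, not demonstrate, almost sure convergence.

```python
import random


def running_averages(n, seed=0):
    """Running averages of n fair die rolls along a single sample path."""
    rng = random.Random(seed)
    total = 0
    averages = []
    for k in range(1, n + 1):
        total += rng.randint(1, 6)  # one roll, uniform on {1, ..., 6}
        averages.append(total / k)  # average of the first k rolls
    return averages


if __name__ == "__main__":
    path = running_averages(200_000)
    for k in (10, 1_000, 100_000, 200_000):
        print(f"after {k:>7} rolls: running average = {path[k - 1]:.4f}")
```

Unlike the weak law, which concerns the probability of a deviation at each fixed n, the strong law is a statement about the whole path: with probability one, the sequence of running averages eventually stays close to 3.5.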
A weaker law and proof
Proofs of the above weak and strong laws of large numbers are rather involved. The consequent of the slightly weaker form
below is implied by the weak law above (since convergence in distribution is implied by convergence in probability), but has a
simpler proof.
Theorem. Let $X_1, X_2, X_3, \ldots$ be a sequence of
random variables, independent and identically distributed with common finite mean μ, and define the partial sum
$S_n := X_1 + X_2 + \cdots + X_n$.
Then $S_n / n$ converges in distribution to μ.
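Proof. (See [1], p. 174) By Taylor's theorem for complex functions, the
characteristic function of any random variable
$X$ with finite mean μ can be written as
$$\varphi_X(t) = 1 + it\mu + o(t), \quad t \to 0.$$
Then, since the characteristic function of the sum of independent random variables is the product of their characteristic functions, the
characteristic function of $S_n / n$ is
$$\varphi_{S_n/n}(t) = \left[ \varphi_X\!\left( \frac{t}{n} \right) \right]^n = \left[ 1 + \frac{i\mu t}{n} + o\!\left( \frac{t}{n} \right) \right]^n \to e^{it\mu} \quad \text{as } n \to \infty.$$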
The limit $e^{it\mu}$ is the characteristic function of the constant random variable
μ, and hence by the Lévy continuity theorem, $S_n / n$ converges in
distribution to μ. Note that the proof of the central limit theorem, which tells us more about the convergence of the average to
μ (when the variance $\sigma^2$ is finite), follows a very similar approach.
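The limit used in the proof can be checked numerically for a distribution whose characteristic function is known in closed form. The sketch below assumes Exponential summands with rate λ, for which the characteristic function is 1 / (1 − it/λ) and the mean is 1/λ. It compares the characteristic function of the average of n such variables with that of the constant 1/λ. The function names are illustrative assumptions.

```python
import cmath


def cf_exponential(t, rate=1.0):
    """Characteristic function of an Exponential(rate) variable at t."""
    return 1 / (1 - 1j * t / rate)


def cf_sample_mean(t, n, rate=1.0):
    """Characteristic function of S_n / n for n i.i.d. Exponential(rate) variables."""
    # Independence turns the sum into a product: [phi_X(t / n)] ** n.
    return cf_exponential(t / n, rate) ** n


def gap_to_limit(t, n, rate=1.0):
    """Distance between the characteristic function of S_n / n and exp(i * t * mu)."""
    mu = 1 / rate
    return abs(cf_sample_mean(t, n, rate) - cmath.exp(1j * t * mu))


if __name__ == "__main__":
    for n in (10, 100, 1_000, 10_000):
        print(f"n = {n:>6}: gap at t = 2 is {gap_to_limit(2.0, n):.6f}")
```

The gap shrinks as n grows, consistent with pointwise convergence of the characteristic functions, which is what the Lévy continuity theorem requires.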
References
- G.R. Grimmett and D.R. Stirzaker (1992). Probability and Random Processes 2nd Edition. Clarendon Press, Oxford.