This article is about mathematics. See also variance
(land use).
In mathematics, the variance of a
real-valued random
variable is its second central moment, and also its
second cumulant (cumulants differ from central moments only at and above degree 4).
If μ = E(X) is the expected value of the random variable
X, then the variance is
- σ² = E((X − μ)²),
i.e., it is the expected value of the square of the deviation of X from its own mean, also called the mean squared
deviation. Many distributions, such as the Cauchy
distribution, do not have a variance because the relevant integral diverges.
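As an illustration of the definition, the following minimal Python sketch computes E((X − μ)²) for a discrete random variable. The fair six-sided die and the helper name `variance` are assumptions chosen for the example; exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction


def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mu = sum(p * x for x, p in dist.items())                # mu = E(X)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())  # E((X - mu)^2)


# Assumed example: a fair six-sided die, each face with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
die_variance = variance(die)
print(die_variance)  # 35/12
```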
If the variance is defined, we can conclude two things:
- The variance is never negative, because squares are positive or zero. When a method of calculating the variance yields
a negative number, an error has occurred, often because of a poor choice of algorithm.
- The unit of variance is the square of the unit of observation. Thus, the variance of a set of heights measured in centimeters
is given in square centimeters. Because this is inconvenient, statisticians often take the square root of the
variance, called the standard deviation, and quote that value as a
summary of dispersion.
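Both points can be illustrated with a short Python sketch. It assumes, as one example of a poorly chosen algorithm, the one-pass shortcut E(X²) − E(X)², which is algebraically equal to the variance but loses precision in floating point when the mean is large relative to the spread. The data values and function names are hypothetical.

```python
import math


def naive_variance(xs):
    # One-pass shortcut E(X^2) - E(X)^2; subject to cancellation error.
    n = len(xs)
    mean = sum(xs) / n
    return sum(x * x for x in xs) / n - mean * mean


def two_pass_variance(xs):
    # First pass computes the mean, second pass averages squared deviations.
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


# Assumed data: a large mean with a small spread; the exact variance is 22.5.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
naive = naive_variance(data)
stable = two_pass_variance(data)
std_dev = math.sqrt(stable)  # back in the original unit of observation
print(naive, stable, std_dev)
```

On typical double-precision hardware the shortcut result is far from 22.5 and may even be negative, whereas the two-pass result is a sum of squares and so cannot be negative.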
One reason for using the variance in preference to other measures of dispersion is that the variance of a sum of independent random variables is the sum of their variances. (A weaker condition than independence, called
"uncorrelatedness", also suffices.)
For random samples xi where i = 1, 2, ..., N, the variance σ² is
- σ² = (1/N) Σ (xi − μ)², summed over i = 1, ..., N, where μ is the mean of the xi.
If X is a vector-valued random variable, with
values in Rn, and thought of as a column vector, then the natural generalization of variance is
E((X-μ)(X-μ)'), where μ=E(X) and X' is the transpose of X, and so is a
row vector. This variance is a nonnegative-definite square matrix, commonly referred to as the covariance matrix.
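A minimal sketch of the vector case, assuming a small set of equally likely points in R² (hypothetical data) and plain Python lists rather than a matrix library. Entry (i, j) of the result is the mean of the product of the i-th and j-th centered coordinates, i.e. the (i, j) entry of E((X-μ)(X-μ)').

```python
def covariance_matrix(samples):
    """Covariance matrix E((X - mu)(X - mu)') of equally likely vectors."""
    n = len(samples)
    dim = len(samples[0])
    mu = [sum(x[i] for x in samples) / n for i in range(dim)]
    return [
        [sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in samples) / n
         for j in range(dim)]
        for i in range(dim)
    ]


# Assumed example data: four equally likely points in R^2.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 5.0), (4.0, 9.0)]
cov = covariance_matrix(points)

# For a symmetric 2x2 matrix, nonnegative-definiteness is equivalent to
# nonnegative diagonal entries together with a nonnegative determinant.
det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
print(cov, det)
```

The diagonal entries are the ordinary variances of the individual coordinates.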
If X is a complex-valued random variable, then its variance is E((X-μ)(X-μ)*),
where X* is the complex conjugate of X. This variance is a nonnegative real number.
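The complex case can be sketched in the same way. The four equally likely values are an assumed example; since (X-μ)(X-μ)* equals the squared modulus of X-μ, the function returns only the real part.

```python
def complex_variance(values):
    """E((X - mu)(X - mu)*) for equally likely complex values."""
    n = len(values)
    mu = sum(values) / n
    total = sum((z - mu) * (z - mu).conjugate() for z in values)
    return (total / n).real  # the imaginary part is zero


# Assumed example: the four corners of a square centered at the origin.
zs = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
var_z = complex_variance(zs)
print(var_z)
```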
When the set of data is an entire population, we call this
the population variance. If the set is a sample, we
call it the sample variance. When estimating the population variance from a finite sample of n values, the following formula gives an
unbiased estimate:
- s² = (1/(n − 1)) Σ (xi − x̄)², summed over i = 1, ..., n, where x̄ is the sample mean.
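The difference between the two quantities can be sketched in Python: the population variance divides the sum of squared deviations by n, while the unbiased sample estimate divides by n − 1. The data and the function names are assumptions for illustration; the results are compared with `pvariance` and `variance` from Python's standard `statistics` module.

```python
import math
import statistics


def population_variance(xs):
    # Divides by n: appropriate when xs is the whole population.
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def sample_variance(xs):
    # Divides by n - 1; requires at least two observations.
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


# Assumed example data with mean 5 and sum of squared deviations 32.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
pop = population_variance(data)  # 32 / 8
samp = sample_variance(data)     # 32 / 7

print(pop, statistics.pvariance(data))
print(samp, statistics.variance(data))
```

The sample estimate is always at least as large as the population formula applied to the same data, and the gap shrinks as n grows.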
See algorithms for calculating
variance.
See also: standard deviation, arithmetic mean, skewness,
kurtosis, statistical dispersion