A random variable can be thought of as the numeric result of operating a non-deterministic mechanism or
performing a non-deterministic experiment to generate a random result. For example,
rolling a die and recording the outcome yields a random variable with range { 1, 2, 3, 4, 5, 6 }. Picking a random person and
measuring their height yields another random variable.
Mathematically, a random variable is defined as a measurable
function from a probability space to some measurable space. This measurable space is the space of possible values of
the variable. It is usually taken to be the real numbers with the
Borel σ-algebra, and we will always assume this in this
encyclopedia unless otherwise specified.
Distribution functions
If a random variable X:Ω->R defined on the probability space (Ω, P) is given,
we can ask questions like "How likely is it that the value of X is bigger than 2?". This is the same as the probability
of the event {s in Ω : X(s) > 2} which is often written as P(X > 2)
for short.
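As a small illustration of the question above, the following sketch computes P(X > 2) exactly for the die example. The names `omega`, `prob`, `X` and `probability` are chosen for this illustration only, and the die is assumed to be fair.

```python
from fractions import Fraction

# Assumed finite probability space: the six faces of a fair die,
# each carrying probability 1/6.
omega = [1, 2, 3, 4, 5, 6]
prob = {s: Fraction(1, 6) for s in omega}


def X(s):
    """The random variable: here simply the face that is shown."""
    return s


def probability(event):
    """Probability of the event {s in omega : event(s) is true}."""
    return sum(prob[s] for s in omega if event(s))


# P(X > 2) is the probability of the event {s in omega : X(s) > 2}.
p_gt_2 = probability(lambda s: X(s) > 2)
print(p_gt_2)  # 2/3
```

Exact fractions are used instead of floating-point numbers so that the result can be compared directly with the hand computation 4/6 = 2/3.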
Recording all these probabilities of output ranges of a real-valued random variable X yields the probability distribution of X. The probability
distribution "forgets" about the particular probability space used to define X and only records the probabilities of
various values of X. Such a probability distribution can always be captured by its cumulative distribution function
- F_X(x) = P(X ≤ x)
and sometimes also using a probability
density function. In measure-theoretic terms, we use the random
variable X to "push forward" the measure P on Ω to a measure dF_X on R. The
underlying probability space Ω is a technical device used to guarantee the existence of random variables, and sometimes to
construct them. In practice, one often disposes of the space Ω altogether and just puts a measure on R that
assigns measure 1 to the whole real line, i.e., one works with probability distributions instead of random variables.
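The push-forward can be made concrete on a finite probability space. The sketch below is an illustration under assumed choices: Ω is the set of outcomes of two fair dice and X is their sum. It transfers the measure P on Ω to a distribution on the values of X and then evaluates the cumulative distribution function without referring to Ω again.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed probability space: ordered pairs of faces of two fair dice.
omega = list(product(range(1, 7), repeat=2))
P = {s: Fraction(1, 36) for s in omega}


def X(s):
    """The random variable: the sum of the two faces."""
    return s[0] + s[1]


# Push P forward along X: each value of X collects the probability
# of all outcomes s that are mapped to it.
dist = defaultdict(Fraction)
for s in omega:
    dist[X(s)] += P[s]


def F_X(x):
    """Cumulative distribution function, computed from dist alone."""
    return sum(p for value, p in dist.items() if value <= x)


print(dist[7])   # 1/6
print(F_X(7))    # 7/12
print(F_X(12))   # 1
```

Once `dist` has been built, `omega` and `P` are no longer needed, which mirrors the remark that one often works with the distribution alone.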
Functions of random variables
If we have a random variable X on Ω and a measurable function f:R->R, then
Y=f(X) will also be a random variable on Ω, since the composition of measurable functions is
measurable. The same procedure that allowed one to go from a probability space (Ω,P) to
(R, dF_X) can be used to obtain the probability distribution of Y. The cumulative
distribution function of Y is
- F_Y(y) = P(f(X) ≤ y).
Example
Let X be a real-valued random variable and let Y = X^2. Then,
- F_Y(y) = P(X^2 ≤ y).
If y < 0, then P(X^2 ≤ y) = 0, so
- F_Y(y) = 0 if y < 0.
If y ≥ 0, then P(X^2 ≤ y) = P(|X| ≤ √y) = P(-√y ≤ X ≤ √y), so, provided P(X = -√y) = 0 (for instance, when X has a continuous distribution),
- F_Y(y) = F_X(√y) - F_X(-√y) if y ≥ 0.
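The formula for the distribution function of the square can be checked numerically. The sketch below assumes, purely for illustration, that X is standard normal, so that its distribution function can be written with `math.erf`. It compares the formula with a simulated estimate. The sample size and the seed are arbitrary choices.

```python
import math
import random


def F_X(x):
    """CDF of a standard normal X (an assumption made for this example)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def F_Y(y):
    """CDF of Y = X^2: zero for y < 0, F_X(sqrt(y)) - F_X(-sqrt(y)) otherwise."""
    if y < 0:
        return 0.0
    r = math.sqrt(y)
    return F_X(r) - F_X(-r)


# Simulate Y = X^2 directly and estimate its CDF from the samples.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) ** 2 for _ in range(100_000)]


def empirical_F_Y(y):
    """Fraction of simulated values of Y that are at most y."""
    return sum(1 for v in samples if v <= y) / len(samples)


for y in (-1.0, 0.5, 1.0, 4.0):
    print(y, round(F_Y(y), 4), round(empirical_F_Y(y), 4))
```

The two printed columns should agree up to sampling error, which for this sample size is of the order of a few thousandths.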
Moments
The probability distribution of a random variable is often characterised by a small number of parameters, which also have a
practical interpretation. For example, it is often enough to know what its "average value" is. This is captured by the
mathematical concept of expected value of a random variable, denoted
E[X]. Note that in general, E[f(X)] is not the same as f(E[X]).
Once the "average value" is known, one could then ask how far from this average value the values of X typically are, a
question that is answered by the variance and standard deviation of a random variable.
Mathematically, this is known as the (generalised) problem of moments: for a given class of random variables X, find a collection
{f_i} of functions such that the expectation values E[f_i(X)] fully characterise
the distribution of the random variable X.
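For the fair die, these quantities can be computed exactly. The following sketch (the helper `expect` is a name introduced here, not a standard function) shows the expected value, the variance, and a case where E[f(X)] differs from f(E[X]), taking f(x) = x^2.

```python
from fractions import Fraction

# Fair die: each value has probability 1/6 (assumed for this example).
values = [1, 2, 3, 4, 5, 6]
p = Fraction(1, 6)


def expect(f):
    """E[f(X)] for the fair die, as an exact fraction."""
    return sum(f(x) * p for x in values)


mean = expect(lambda x: x)                     # E[X]
mean_of_square = expect(lambda x: x ** 2)      # E[X^2]
square_of_mean = mean ** 2                     # (E[X])^2
variance = expect(lambda x: (x - mean) ** 2)   # E[(X - E[X])^2]

print(mean)            # 7/2
print(mean_of_square)  # 91/6
print(square_of_mean)  # 49/4
print(variance)        # 35/12
```

The gap between E[X^2] = 91/6 and (E[X])^2 = 49/4 is exactly the variance, 35/12.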
Equivalence of random variables
There are several different senses in which random variables can be considered to be equivalent. Two random variables can be
equal, equal almost surely, equal in mean, or equal in distribution.
In increasing order of strength, the precise definitions of these notions of equivalence are:
Equality in distribution
Two random variables X and Y are equal in distribution if they have the same distribution functions:
- P(X ≤ x) = P(Y ≤ x) for all x.
To be equal in distribution, random variables need not be defined on the same probability space, but without loss of
generality they can be made into independent random variables on the same probability space. The notion of equivalence in
distribution is associated to the following notion of distance between probability distributions,
- d(X,Y) = sup_x |P(X ≤ x) - P(Y ≤ x)|,
which is the basis of the Kolmogorov-Smirnov
test.
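For distributions concentrated on finitely many values, the supremum of the absolute difference of the two distribution functions is attained at one of those values, so it can be computed by a finite search. The sketch below relies on that observation. The function names and the three example distributions are chosen for this illustration.

```python
from fractions import Fraction


def cdf(pmf, x):
    """Distribution function at x for a finite distribution {value: probability}."""
    return sum(p for v, p in pmf.items() if v <= x)


def kolmogorov_distance(pmf_a, pmf_b):
    """sup over x of |F_a(x) - F_b(x)|; for step functions it suffices
    to check the points where one of the two functions jumps."""
    points = set(pmf_a) | set(pmf_b)
    return max(abs(cdf(pmf_a, x) - cdf(pmf_b, x)) for x in points)


die = {k: Fraction(1, 6) for k in range(1, 7)}        # X: fair six-sided die
flipped_die = {7 - k: p for k, p in die.items()}      # 7 - X
four_sided = {k: Fraction(1, 4) for k in range(1, 5)} # fair four-sided die

print(kolmogorov_distance(die, flipped_die))  # 0
print(kolmogorov_distance(die, four_sided))   # 1/3
```

The first distance is zero because X and 7 - X are equal in distribution, even though they never take the same value on the same roll.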
Equality in mean
Two random variables X and Y are equal in p-th mean if the p-th moment of
|X - Y| is zero, that is,
- E[|X - Y|^p] = 0.
Equality in p-th mean implies equality in q-th mean for all q < p. As in the previous case,
there is a related distance between the random variables, namely
- d_p(X,Y) = E[|X - Y|^p].
Almost sure equality
Two random variables X and Y are equal almost surely if, and only if, the probability that they are
different is zero:
- P(X ≠ Y) = 0.
For all practical purposes in probability theory, this notion of equivalence is as strong as actual equality. It is
associated to the following distance:
- d_∞(X,Y) = sup_s |X(s) - Y(s)|,
where 'sup' in this case represents the essential supremum in the sense of measure
theory.
Equality
Finally, two random variables X and Y are equal if they are equal as functions on their probability
space, that is,
- X(s) = Y(s) for all s in Ω.
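The differences between these notions can be seen on a small, artificial probability space. In the sketch below, the three-point space and the variables X, Y and Z are hypothetical and chosen only for illustration. The point "c" has probability zero. X and Y differ only there, while Z swaps the values of X on the other two points.

```python
from fractions import Fraction

# Hypothetical probability space with a point of probability zero.
P = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}

# Three random variables, given as tables of values on the space.
X = {"a": 0, "b": 1, "c": 0}
Y = {"a": 0, "b": 1, "c": 5}   # differs from X only on the null point "c"
Z = {"a": 1, "b": 0, "c": 0}   # same distribution as X, different function


def distribution(V):
    """Probability of each value of V, ignoring values of probability zero."""
    dist = {}
    for s, p in P.items():
        if p > 0:
            dist[V[s]] = dist.get(V[s], Fraction(0)) + p
    return dist


def prob_differ(V, W):
    """P(V != W)."""
    return sum(p for s, p in P.items() if V[s] != W[s])


def mean_abs_diff(V, W, p):
    """E[|V - W|^p]."""
    return sum(q * abs(V[s] - W[s]) ** p for s, q in P.items())


print(distribution(X) == distribution(Z))  # True: equal in distribution
print(prob_differ(X, Z))                   # 1: they differ almost surely
print(prob_differ(X, Y))                   # 0: equal almost surely
print(mean_abs_diff(X, Y, 2))              # 0: equal in 2nd mean
print(X == Y)                              # False: not equal as functions
```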
Convergence
Much of mathematical statistics consists in proving convergence results for certain sequences of random variables; see for instance the law of large numbers and the central limit theorem.
There are various senses in which a sequence (Xn) of random variables can converge to a random
variable X. These are explained in the article on convergence of random variables.
Examples
The following are examples of random integers i, 1 <= i <= 100:
17 12 17 89 64 4 62 6 82 80 61 100 19 7 35 4 23 43 49 69 4 81 64 52 33 59 56 56 46 25 2 44 34 73 58 48 94 18 65 47 73 16 69 26
26 65 35 65 64 2 59 36 52 77 52 14 79 42 71 82 60 28 72 96 77 72 78 58 71 44 99 41 41 80 53 67 7 66 49 86 94 85 47 27 1 6 86 50
32 26 60 79 94 53 72 98 78 46 73 50 49 3 77 57 56 23 20 70 1 58 42 72 16 84 96 44 42 76 19 71 57 17 34 66 68 63 100 37 38 68 52
52 42 86 15 53 76 59 43 94 67 21 74 73 85 16 12 45 57 7 4 22 23 74 15 63 80 65 76 88 39 39 100 96 85 64 16 55 62 50 71 27 48 95
96 30 65 33 71 50 39 1 70 99 55 74 2 74 98 48 99 90 28 66 41 17 80 35 8 30 85 41 68 18 46 86 91 40 20 43 71 95 48 40 79 88 77 49
81 52 15 8 11 51 26 99 8 28 37 47 37 17 30 27 39 33 65 8 31 73 48 96 41 78 9 89 72 16 61 48 73 90 39 34 7 41 1 87 48 83 41 64 61
47 71 2 35 66 74 29 74 7 61 22 46 46 4 59 23 79 33 7 31 41 54 63 91 81 58 66 83 24 37 84 16 55 9 52 92 69 44 27 57 38 70 37 33 23
24 18 74 20 87 73 28 85 34 31 76 25 6 38 15 73 16 79 83 94 21 52 34 19 66 5 97 33 100 63 36 100 4 63 84 8 21 21 92 60 72 22 25 80
23 8 10 10 63 44 14 86 47 17 45 4 18 21 44 27 88 10 92 90 27 54 73 68 13 15 68 31 4 83 46 97 97 32 12 66 66 87 100 75 99 75 73 16
86 90 66 51 59 80 87 40 35 21 76 65 74 73 26 41 17 67 88 54 42 62 98 78 19 29 60 79 19 76 13 95 68 76 86 47 91 23 25 50 57 27 97
30 16 82 5 7 31 72 64 18 32 100 54 18 51 66 38 74 91 75 41 81 21 32 96 78 90 9 82 21 84 80 65 72 52 17 81 50 1 90 14 45 11 76 91
31 20 93 30 30 66 10 20 37 89 3 71 35 96 82 11 4
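A list of this kind can be produced with a pseudorandom number generator. The sketch below uses Python's standard `random` module. The seed and the sample size are arbitrary choices, and the output is not meant to reproduce the numbers listed above.

```python
import random

# A seeded generator makes the run repeatable; the seed is arbitrary.
rng = random.Random(2024)

# randint(1, 100) returns an integer i with 1 <= i <= 100,
# both endpoints included.
sample = [rng.randint(1, 100) for _ in range(20)]
print(" ".join(str(i) for i in sample))
```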
See also: discrete random variable, continuous random variable, probability distribution, randomness, random vector, random function, generating function