|
In mathematics, a Markov chain is a stochastic process with the Markov property. In such a process, the distant past is irrelevant given knowledge of the recent past.
In the discrete-time case, the process consists of a sequence X1, X2, X3, ... of
random variables. The domain of these variables is called the
state space, with the value of Xn being the state at time n. If the conditional
distribution of Xn+1 on past states is a function of Xn alone,
-
then the process is said to have the Markov property.
Markov chains are named after A.A. Markov, who
produced the first results (1906) for these processes. A generalization to countably infinite state spaces was given by Kolmogorov (1936). Markov chains are related to
Brownian motion and the ergodic hypothesis, two topics in physics which were important in the early years of the twentieth century, but Markov appears to have pursued this out of a
mathematical motivation, namely the extension of the law of
large numbers to dependent events.
Properties of Markov chains
A Markov chain is characterized by the conditional distribution
-
which is called the transition probability of the process. This is sometimes called the "one-step" transition
probability. The probability of a transition in two, three, or more steps is derived from the one-step transition probability and
the Markov property:
-
Likewise,
-
These formulas generalize to arbitrary future times n+k by multiplying the transition probabilities and
integrating k times.
The marginal distribution
P(Xn) is the distribution over states at time n. The initial distribution is
P(X0). The evolution of the process through one time step is described by
-
This is a version of the Frobenius-Perron equation. There may exist one or more state distributions π such that
-
where Y is just a convenient name for the variable of integration. Such a distribution π is called a
stationary distribution or steady-state distribution. A stationary distribution is an eigenfunction of the conditional distribution function, associated with the
eigenvalue 1.
Whether or not there is a stationary distribution, and whether or not it is unique if it does exist, are determined by certain
properties of the process. Irreducible means that every state is accessible from every other state. Aperiodic
means that there exists at least one state for which the transition from that state to itself is possible. Positive
recurrent means that the expected return time is finite for every state. Sometimes the terms indecomposable,
acyclic, and persistent are used as synonyms for "irreducible", "aperiodic", and "recurrent", respectively.
If the Markov chain is positive recurrent, there exists a stationary distribution. If it is positive recurrent and
irreducible, there exists a unique stationary distribution, and furthermore the process constructed by taking the stationary
distribution as the initial distribution is ergodic. Then the average
of a function f over samples of the Markov chain is equal to the average with respect to the stationary
distribution,
-
In particular, this holds for f equal to the identity function. Thus the average of sample values over time is equal
to the expected value of the stationary distribution.
Furthermore, the equivalence of averages also holds if f is the indicator function on some subset A of the state space.
-
where μπ is the measure induced by π. This makes it possible to approximate the stationary
distribution by a histogram or other density estimate of a sequence of
samples.
Markov chains in discrete state spaces
If the state space is finite, the transition probability distribution can be
represented as a matrix, called the transition matrix, with the (i, j)'th element equal to
-
(In this formulation, element (i, j) is the probability of a transition from j to i. An
equivalent formulation is sometimes given with element (i, j) equal to the probability of a transition from
i to j. In that case the transition matrix is just the transpose of the one given here.)
For a discrete state space, the integrations in the k-step transition probability are summations, and can be computed
as the k'th power of the transition matrix. That is, if P is the one-step transition matrix, then
Pk is the transition matrix for the k-step transition.
Writing P for the transition matrix, a stationary distribution is a vector which satisfies the equation
-
In this case, the stationary distribution is an eigenvector of the
transition matrix, associated with the eigenvalue 1. If the transition matrix
P is positive recurrent, irreducible, and aperiodic, then Pk converges elementwise to a
matrix in which each column is the unique stationary distribution.
A transition matrix which is positive (that is, every element of the matrix is positive) is irreducible, aperiodic, and
positive recurrent. A matrix is a stochastic matrix if and only
if it is the matrix of transition probabilities of some Markov chain.
Scientific applications
Markov chains are used to model various processes in queueing
theory and statistics, and can also be used as a signal model in entropy coding techniques such as arithmetic coding. Markov chains also have many biological applications, particularly population processes, which are useful in modelling processes that are
(at least) analogous to biological populations. Markov chains have been used in bioinformatics as well. An example is the genemark algorithm for coding region/gene prediction.
Markov processes can also be used to generate superficially "real-looking" text given a sample document: they are used in
various pieces of recreational "parody generator" software (see Jeff
Harrison).
See also
References
- A.A. Markov. "Rasprostranenie zakona bol'shih chisel na velichiny, zavisyaschie drug ot druga". Izvestiya
Fiziko-matematicheskogo obschestva pri Kazanskom universitete, 2-ya seriya, tom 15, pp 135-156, 1906.
- A.A. Markov. "Extension of the limit theorems of probability theory to a sum of variables connected in a chain". reprinted in
Appendix B of: R. Howard. Dynamic Probabilistic Systems, volume 1: Markov Chains. John Wiley and Sons, 1971.
- Leo Breiman. Probability. Original edition published by Addison-Wesley, 1968; reprinted by Society for Industrial
and Applied Mathematics, 1992. ISBN
0-89871-296-3. (See Chapter 7.)
- J.L. Doob. Stochastic Processes. New York: John Wiley and Sons, 1953. ISBN 0-471-52369-0.
External links
|