In statistics and probability theory, the Poisson distribution is a discrete probability
distribution, discovered by Siméon-Denis Poisson (1781-1840) and published, together with his
probability theory, in 1837 in his work Recherches sur la probabilité des jugements en matière
criminelle et en matière civile. It describes certain random variables N that count, among other
things, the number of discrete occurrences (sometimes called "arrivals") that take place during a
time interval of given length. The probability that there are exactly k occurrences (k being a
non-negative integer, k = 0, 1, 2, ...) is:
- $P(N = k) = \dfrac{e^{-\lambda}\,\lambda^{k}}{k!}$
Where:
- e is the base of the natural logarithm (e = 2.71828...),
- k! is the factorial of k,
- λ is a positive real number, equal to the expected number of occurrences during the given
interval. For instance, if events occur on average once every 2 minutes and you are interested
in the number of events occurring in a 10-minute interval, you would model it with a Poisson
distribution with λ = 10/2 = 5.
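The formula can be evaluated directly. The following is a minimal sketch in Python, using only the standard library; the function name `poisson_pmf` is chosen here for illustration, and the computation is done in log space as one possible way to avoid overflow for large k.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(N = k) for a Poisson distribution with expected value lam > 0."""
    if k < 0:
        return 0.0
    # log of e^(-lam) * lam^k / k!, using lgamma(k + 1) = log(k!)
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

# Events every 2 minutes on average, observed for 10 minutes: lam = 10 / 2 = 5.
lam = 10 / 2
for k in range(11):
    print(k, round(poisson_pmf(k, lam), 4))
```

The printed probabilities peak around k = 4 and k = 5, consistent with an expected count of 5.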
Sometimes λ is taken to be the rate, i.e., the average number of occurrences per unit time. In
that case, if N_t is the number of occurrences before time t, then we have
- $P(N_t = k) = \dfrac{e^{-\lambda t}\,(\lambda t)^{k}}{k!}$
and the waiting time T until the first occurrence is a continuous random variable with an
exponential distribution. This distribution can be deduced from the fact that
- $P(T > t) = P(N_t = 0).$
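The deduction can be spelled out in a few lines. This is a worked sketch, assuming λ denotes the rate per unit time; the symbol f_T for the density of T is notation introduced here.

```latex
\begin{align*}
P(T > t)   &= P(N_t = 0) = \frac{e^{-\lambda t}(\lambda t)^{0}}{0!} = e^{-\lambda t}, \\
P(T \le t) &= 1 - e^{-\lambda t}, \qquad t \ge 0, \\
f_T(t)     &= \frac{d}{dt}\,P(T \le t) = \lambda e^{-\lambda t}, \\
\mathrm{E}(T) &= \int_0^\infty t\,\lambda e^{-\lambda t}\,dt = \frac{1}{\lambda}.
\end{align*}
```

For events that occur on average once every 2 minutes, λ = 0.5 per minute and the mean waiting time is 1/λ = 2 minutes.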
Occurrence
The Poisson distribution arises in connection with Poisson
processes. It applies to various phenomena of discrete nature (that is, those that may happen 0, 1, 2, 3, ... times during a
given period of time or in a given area) whenever the probability of the phenomenon happening is constant in time or space. Examples include:
- The number of unstable nuclei that decayed within a given period of
time in a piece of radioactive substance.
- The number of cars that pass through a certain point on a road during a given period of time.
- The number of spelling mistakes a secretary makes while typing a single page.
- The number of phone calls you get per day.
- The number of times your web server is accessed per minute. For instance, the number of edits per hour recorded on Wikipedia's Recent Changes page follows an approximately Poisson distribution.
- The number of roadkill found per unit length of road.
- The number of mutations in a given stretch of DNA after a certain amount of radiation.
- The number of pine trees per square mile of mixed forest.
- The number of stars in a given volume of space.
- The number of soldiers killed by horse-kicks each year in each corps in the Prussian cavalry (an example made famous by a
book by Ladislaus Josephovich Bortkiewicz (1868-1931)).
- The number of bombs falling on each square mile of London during a German air raid in the early part of the Second
World War.
How does this distribution arise? -- The limit theorem
The binomial distribution with parameters n
and λ/n, i.e., the probability distribution of the number of successes in n trials, with probability
λ/n of success on each trial, approaches the Poisson distribution with expected value λ as n
approaches infinity.
Here are the details. First, recall from calculus that
- $\lim_{n\to\infty} \left(1 - \dfrac{\lambda}{n}\right)^{n} = e^{-\lambda}.$
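Let p = λ/n, and let X be the number of successes in the n trials. Then we have
- $\lim_{n\to\infty} P(X = k) = \lim_{n\to\infty} \binom{n}{k} p^{k} (1-p)^{n-k} = \lim_{n\to\infty} \dfrac{n!}{(n-k)!\,k!} \left(\dfrac{\lambda}{n}\right)^{k} \left(1 - \dfrac{\lambda}{n}\right)^{n-k}$
- $= \lim_{n\to\infty} \underbrace{\left(\dfrac{n}{n}\right)\left(\dfrac{n-1}{n}\right)\cdots\left(\dfrac{n-k+1}{n}\right)}\ \underbrace{\left(\dfrac{\lambda^{k}}{k!}\right)}\ \underbrace{\left(1 - \dfrac{\lambda}{n}\right)^{n}}\ \underbrace{\left(1 - \dfrac{\lambda}{n}\right)^{-k}}$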
As n approaches ∞, the expression over the first of the four underbraces approaches 1; the
expression over the second underbrace remains constant, since n does not appear in it at all; the
expression over the third underbrace approaches e^(−λ); and the one over the fourth underbrace
approaches 1.
Consequently the limit is
- $\dfrac{\lambda^{k}\, e^{-\lambda}}{k!}.$
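The convergence can also be observed numerically. The sketch below, in standard-library Python, compares the binomial probability with parameters n and λ/n against the Poisson probability for growing n; the values λ = 5 and k = 3 and the function names are arbitrary choices made for illustration.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k occurrences."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam, k = 5.0, 3
target = poisson_pmf(k, lam)
for n in (10, 100, 1_000, 10_000):
    b = binomial_pmf(k, n, lam / n)
    # The absolute gap to the Poisson value shrinks as n grows.
    print(n, b, abs(b - target))
```

For these values the gap shrinks roughly in proportion to 1/n.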
Properties
The expected value of a Poisson-distributed random variable is equal
to λ, and so is its variance. The higher moments of the Poisson distribution about the origin are Touchard polynomials in λ, whose coefficients have a combinatorial meaning (they are Stirling numbers of the second kind).
The most likely value ("mode") of a Poisson-distributed random variable is equal to the largest integer ≤ λ, which
is also written as floor(λ). When λ is itself an integer, λ − 1 is equally likely, so both values are modes.
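These statements about the mean, variance and mode can be checked numerically. The following is a rough sketch in standard-library Python; the helper name `summarize` and the truncation point `k_max` are assumptions made for illustration, and the truncation is only adequate when λ is much smaller than `k_max`.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k occurrences, computed in log space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def summarize(lam: float, k_max: int = 200):
    """Return (mean, variance, mode) from the pmf truncated at k_max."""
    ks = range(k_max + 1)
    probs = [poisson_pmf(k, lam) for k in ks]
    mean = sum(k * p for k, p in zip(ks, probs))
    var = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))
    mode = max(ks, key=lambda k: probs[k])  # most probable value
    return mean, var, mode

mean, var, mode = summarize(4.7)
print(mean, var, mode, math.floor(4.7))
```

A non-integer λ is used here on purpose: for integer λ the two values λ − 1 and λ are tied, and which one a numerical search reports depends on rounding.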
If λ is large enough (say λ > 1000), the normal distribution with mean λ and standard deviation √λ is an excellent
approximation to the Poisson distribution. If λ is greater than about 10, the normal distribution is a good approximation
provided an appropriate continuity correction is applied, i.e.,
P(X ≤ x), where (lower-case) x is a non-negative integer, is replaced by P(X ≤
x + 0.5).
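The effect of the continuity correction can be illustrated with a small computation. This is a sketch in standard-library Python under arbitrary illustrative choices (λ = 20, x = 22); the function names are introduced here, and the normal distribution function is obtained from `math.erf`.

```python
import math

def poisson_cdf(x: int, lam: float) -> float:
    """Exact P(X <= x) for a Poisson variable with expected value lam."""
    return sum(
        math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        for k in range(x + 1)
    )

def normal_cdf(z: float, mean: float, sd: float) -> float:
    """Distribution function of the normal distribution with given mean and sd."""
    return 0.5 * (1.0 + math.erf((z - mean) / (sd * math.sqrt(2.0))))

lam, x = 20.0, 22
sd = math.sqrt(lam)
exact = poisson_cdf(x, lam)
plain = normal_cdf(x, lam, sd)            # no continuity correction
corrected = normal_cdf(x + 0.5, lam, sd)  # with continuity correction
print(exact, plain, corrected)
```

In this example the corrected value lies noticeably closer to the exact probability than the uncorrected one.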
If N and M are two independent
random variables, both following a Poisson distribution with parameters λ and μ, respectively, then N +
M follows a Poisson distribution with parameter λ + μ.
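The additivity can be verified numerically by convolving the two distributions. The sketch below uses standard-library Python with arbitrary illustrative parameters λ = 2 and μ = 3.5; `sum_pmf` is a name introduced here.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k occurrences."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def sum_pmf(k: int, lam: float, mu: float) -> float:
    """P(N + M = k) for independent N and M, summing over every split of k."""
    return sum(poisson_pmf(j, lam) * poisson_pmf(k - j, mu) for j in range(k + 1))

lam, mu = 2.0, 3.5
for k in range(6):
    # The two columns agree up to floating-point rounding.
    print(k, sum_pmf(k, lam, mu), poisson_pmf(k, lam + mu))
```

The agreement reflects the binomial theorem: summing λ^j μ^(k−j) / (j! (k−j)!) over j gives (λ + μ)^k / k!.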
The moment-generating function of the
Poisson distribution with expected value λ is
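- $\mathrm{E}\left(e^{tX}\right) = \sum_{k=0}^{\infty} e^{tk}\, \dfrac{\lambda^{k} e^{-\lambda}}{k!} = e^{\lambda (e^{t} - 1)}.$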
All of the cumulants of the Poisson distribution are equal to the expected value
λ.
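Why every cumulant equals λ can be seen from the logarithm of the moment-generating function. The following is a short worked derivation; the symbols K(t) for the cumulant-generating function and κ_n for the n-th cumulant are notation introduced here for illustration.

```latex
\begin{align*}
K(t) &= \ln \mathrm{E}\left(e^{tX}\right) = \lambda\left(e^{t} - 1\right)
      = \sum_{n=1}^{\infty} \lambda\,\frac{t^{n}}{n!}, \\
\kappa_n &= K^{(n)}(0) = \lambda \qquad \text{for every } n \ge 1.
\end{align*}
```

In particular, κ_1 = λ is the expected value and κ_2 = λ is the variance.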
The "law of small numbers"
The word law is sometimes used as a synonym of probability distribution, and convergence in law means convergence in distribution. Accordingly, the Poisson
distribution is sometimes called the law of small numbers because it is the probability distribution of the
number of occurrences of an event that happens rarely but has very many opportunities to happen. The Law of Small
Numbers is a book by Ladislaus Bortkiewicz about the
Poisson distribution, published in 1898. Some historians of mathematics have argued that
the Poisson distribution should have been called the Bortkiewicz distribution.