In mathematics, the binomial distribution is a discrete
probability distribution which describes the number
of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or
Bernoulli trial.
A typical example is the following: 5% of the population are HIV-positive. You pick 500 people randomly. How likely is it that
you get 30 or more HIV-positives? The number of HIV-positives you pick is a random variable X which follows a binomial distribution with n = 500 and p = 0.05.
We are interested in the probability Pr[X ≥ 30].
In general, if the random variable X follows the binomial distribution with parameters n and p, we
write X ~ B(n, p). The probability of getting exactly k successes is given by
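$$\Pr[X = k] = \binom{n}{k}\, p^k (1 - p)^{n - k}, \qquad k = 0, 1, \ldots, n,$$
where
$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$$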
is the binomial coefficient "n choose
k" (also denoted C(n, k)), whence the name of the distribution. The formula can be understood
as follows: we want k successes ($p^k$) and n − k failures ($(1 - p)^{n - k}$).
However, the k successes can occur anywhere among the n
trials, and there are C(n, k) different ways of distributing k successes in a sequence of n
trials.
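The formula can be evaluated directly for small and moderate n. The following is a minimal sketch in Python that sums the mass function to answer the opening example (n = 500, p = 0.05, 30 or more successes). The function names `binomial_pmf` and `binomial_tail` are chosen here for illustration and are not part of any library.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Pr[X = k] for X ~ B(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_tail(k_min: int, n: int, p: float) -> float:
    """Pr[X >= k_min], obtained by summing the mass function."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))

# The probabilities over k = 0..n should add up to 1 (up to rounding).
total = sum(binomial_pmf(k, 500, 0.05) for k in range(501))

# Opening example: probability of 30 or more successes in 500 trials.
tail = binomial_tail(30, 500, 0.05)

print(f"sum of probabilities = {total:.12f}")
print(f"Pr[X >= 30]          = {tail:.4f}")
```

Direct summation is adequate at this size; for very large n the individual terms can underflow or overflow in floating point, and a log-space evaluation would be a safer choice.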
If X ~ B(n, p), then the expected value
of X is
$$\mathrm{E}[X] = np$$
and the variance is
$$\mathrm{Var}(X) = np(1 - p).$$
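As a numerical check, the mean and variance can be computed from the probabilities themselves and compared with np and np(1 − p). The sketch below uses the illustrative values n = 20 and p = 0.3, which are assumptions made here and do not come from the text.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Pr[X = k] for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.3  # illustrative parameters (assumed)

pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean: sum of k * Pr[X = k]; variance: sum of (k - mean)^2 * Pr[X = k].
mean = sum(k * q for k, q in enumerate(pmf))
variance = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))

print(f"mean     = {mean:.6f}   (np       = {n * p:.6f})")
print(f"variance = {variance:.6f}   (np(1-p)  = {n * p * (1 - p):.6f})")
```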
The most likely value or mode of X is given by the largest integer less than or
equal to (n+1)p; if m = (n+1)p is itself an integer, then m − 1 and
m are both modes.
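The rule for the mode can be checked by brute force: list every k at which the mass function is largest and compare with (n + 1)p. The sketch below is illustrative only; the helper `modes` and the parameter choices are assumptions made here. Ties are detected with a small relative tolerance because the probabilities are floating-point numbers.

```python
from math import comb, floor, isclose

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Pr[X = k] for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def modes(n: int, p: float) -> list[int]:
    """All k in 0..n at which the mass function attains its maximum."""
    pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
    peak = max(pmf)
    return [k for k, q in enumerate(pmf) if isclose(q, peak, rel_tol=1e-12)]

# (n + 1)p = 3.3 is not an integer: a single mode at floor(3.3).
print(modes(10, 0.3), floor((10 + 1) * 0.3))

# (n + 1)p = 5 is an integer: both m - 1 and m are modes.
print(modes(9, 0.5), (9 + 1) * 0.5)
```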
If X ~ B(n, p) and Y ~ B(m, p) are independent binomial variables, then
X + Y is again a binomial variable; its distribution is
$$X + Y \sim B(n + m,\, p).$$
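The additivity property can be verified numerically by convolving the two mass functions and comparing the result with the mass function of a single binomial variable with n + m trials. This is a sketch under assumed parameters (n = 4, m = 6, p = 0.35); the helper `pmf_of_sum` is a name introduced here.

```python
from math import comb, isclose

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Pr[X = k] for X ~ B(n, p); zero outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def pmf_of_sum(n: int, m: int, p: float) -> list[float]:
    """Mass function of X + Y for independent X ~ B(n, p) and Y ~ B(m, p),
    computed by discrete convolution."""
    return [
        sum(binomial_pmf(i, n, p) * binomial_pmf(s - i, m, p) for i in range(s + 1))
        for s in range(n + m + 1)
    ]

n, m, p = 4, 6, 0.35  # illustrative parameters (assumed)

convolved = pmf_of_sum(n, m, p)
direct = [binomial_pmf(s, n + m, p) for s in range(n + m + 1)]

# The two lists should agree up to floating-point rounding.
agree = all(isclose(a, b, rel_tol=1e-9) for a, b in zip(convolved, direct))
print("convolution matches B(n + m, p):", agree)
```

Note that the property requires both variables to share the same success probability p; the check above does not cover sums with different probabilities.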
Two other important distributions arise as approximations of binomial distributions:
[Figure: binomial PDF and normal approximation for n = 6 and p = 0.5.]
- If both np and n(1 − p) are greater than 5 or so, then an excellent approximation (provided
a suitable continuity correction is used) to
B(n, p) is given by the normal
distribution
$$N\bigl(np,\; np(1 - p)\bigr).$$
- This approximation is a huge time-saver; historically, it was the first use of the normal distribution, introduced in
Abraham de Moivre's book The Doctrine of Chances in 1733. Nowadays, it can be
seen as a consequence of the central limit theorem since
a B(n, p) variable is a sum of n independent, identically distributed 0-1 indicator variables. Warning: this approximation gives inaccurate results unless
a continuity correction is used. Note
that the picture gives the normal and binomial probability density functions (PDF) and not the cumulative distribution functions.
- For example, suppose you randomly sample n people out of a large population and ask them whether they agree with a
certain statement. The proportion of people who agree will of course depend on the sample. If you sampled groups of n
people repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true
proportion p of agreement in the population and with standard deviation $\sigma = \sqrt{p(1 - p)/n}$.
Large sample sizes n are good because the standard deviation gets smaller, which
allows a more precise estimate of the unknown parameter p.
- If n is large and p is small, so that np is of moderate size, then the Poisson distribution with parameter λ = np is a good
approximation to B(n, p).
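Both approximations can be compared with exact binomial probabilities. The sketch below evaluates Pr[X ≤ 45] for n = 100, p = 0.5 using the normal distribution with and without a continuity correction, evaluates Pr[X ≤ 2] for n = 1000, p = 0.003 using the Poisson distribution with λ = np, and shows how the standard deviation of a sample proportion shrinks as n grows. All parameter values and function names are assumptions chosen here for illustration.

```python
from math import comb, erf, exp, factorial, sqrt

def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact Pr[X <= k] for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """Cumulative distribution function of the normal distribution."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def poisson_cdf(k: int, lam: float) -> float:
    """Pr[X <= k] for a Poisson variable with parameter lam."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

def proportion_sd(p: float, n: int) -> float:
    """Standard deviation sqrt(p(1 - p)/n) of a sample proportion."""
    return sqrt(p * (1 - p) / n)

# Normal approximation: np = n(1 - p) = 50, well above 5.
n, p = 100, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))
exact_normal_case = binomial_cdf(45, n, p)
with_correction = normal_cdf(45 + 0.5, mu, sigma)   # continuity correction
without_correction = normal_cdf(45, mu, sigma)      # no correction

# Poisson approximation: n large, p small, lambda = np = 3.
n2, p2 = 1000, 0.003
exact_poisson_case = binomial_cdf(2, n2, p2)
poisson_value = poisson_cdf(2, n2 * p2)

print(f"exact {exact_normal_case:.5f}  normal+cc {with_correction:.5f}  "
      f"normal {without_correction:.5f}")
print(f"exact {exact_poisson_case:.5f}  poisson {poisson_value:.5f}")

# Quadrupling the sample size halves the standard deviation.
print(proportion_sd(0.4, 100), proportion_sd(0.4, 400))
```

The half-unit shift in `with_correction` accounts for approximating a discrete variable by a continuous one; in this example it brings the normal value noticeably closer to the exact probability.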
The formula for Bézier curves was inspired by the binomial
distribution.