Negative binomial distribution
In probability theory, the negative binomial distribution is the probability distribution of the number of Bernoulli trials needed to get a fixed (i.e., non-random) number of successes in a Bernoulli process. If the random variable X is the number of trials needed to get r successes in a series of trials where each trial has success probability p, then X follows the negative binomial distribution with parameters r and p. Thus the family of all negative binomial distributions is parametrized by two parameters, r and p. The first parameter is a positive integer; the second is a real number between 0 and 1. If r = 1, then X has a geometric distribution.
Formulas
- Parameters: r (number of successes) is an integer with r ≥ 1; the special case r = 1 gives the geometric distribution.
- p (probability of success on each trial) is a real number with 0 < p < 1.
- Support (domain where probability mass > 0) = set of all integers ≥ r.
- Probability mass function f(x) = P(X = x) = probability that the rth success occurs on the xth trial
- = C(x − 1, r − 1) p^r (1 − p)^(x − r) (see binomial coefficient).
- Cumulative distribution function F(x) = P(X ≤ x) = probability that the rth success occurs on or before the xth trial: no simple closed form exists, but it can be computed via the regularized incomplete beta function, as with the binomial distribution.
- Expected value E[X] = r/p.
- Variance var(X) = σ^2 = r(1 − p)/p^2.
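The formulas above can be checked numerically. The following is a minimal Python sketch, not a reference implementation; the function names `nb_pmf` and `nb_cdf` and the parameter values r = 3, p = 0.5 are chosen here for illustration only. The cumulative distribution is computed by direct summation rather than through the incomplete beta function, and the infinite sums for the moments are truncated at x = 400, where the remaining probability mass is negligible for these parameters.

```python
from math import comb


def nb_pmf(x: int, r: int, p: float) -> float:
    """P(X = x): probability that the r-th success occurs on trial x."""
    if x < r:
        return 0.0  # outside the support {r, r + 1, ...}
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


def nb_cdf(x: int, r: int, p: float) -> float:
    """P(X <= x), computed by summing the mass function directly."""
    return sum(nb_pmf(j, r, p) for j in range(r, x + 1))


# Illustrative parameters (an assumption of this sketch).
r, p = 3, 0.5
xs = range(r, 400)  # truncated support

total = sum(nb_pmf(x, r, p) for x in xs)
mean = sum(x * nb_pmf(x, r, p) for x in xs)
var = sum((x - mean) ** 2 * nb_pmf(x, r, p) for x in xs)

print(total)  # close to 1
print(mean)   # close to r / p = 6
print(var)    # close to r * (1 - p) / p**2 = 6
```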
Example
(After a problem by Dr. Diane Evans, professor of mathematics at Rose-Hulman Institute of Technology)
Johnny, a sixth grader at Honey Creek Middle School in Terre Haute, Indiana, is required to sell candy bars in his neighborhood to raise money for the 6th
grade field trip. There are thirty homes in his neighborhood, and his father has told him not to return home until he has sold
five candy bars. So the boy goes door to door, selling candy bars. At each home he visits, he has a 0.4 probability of selling
one candy bar and a 0.6 probability of selling nothing.
What's the probability mass function for selling the last candy bar at the xth house?
- f(x) = C(x − 1, 4) * 0.4^5 * (1 − 0.4)^(x − 5)
What's the probability that he finishes on the tenth house?
- f(10) = 0.100
What's the probability that he finishes on or before reaching the eighth house?
Answer: To finish on or before the eighth house, he must finish at the fifth, sixth, seventh, or eighth house. Sum those
probabilities:
- f(5) = 0.0102; f(6) = 0.0307; f(7) = 0.0553; f(8) = 0.0774; sum(f(j), j=5..8) = 0.1737
What's the probability that he exhausts all houses in the neighborhood, gives up, and then goes to live on the streets?
- 1 − sum(f(j), j=5..30) = 1 − F(30) ≈ 0.0015
Moral: Negative binomial distributions don't turn our children out on the streets; bad parenting does.
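The three probabilities asked for in the example can be reproduced with a few lines of Python. This is a sketch under the example's stated assumptions (r = 5 sales required, p = 0.4 per house, 30 houses); the variable names are illustrative, and the last question is interpreted here as the probability that the fifth sale has not occurred by the thirtieth house, that is, 1 − F(30).

```python
from math import comb

R, P, HOUSES = 5, 0.4, 30  # values taken from the example


def f(x: int) -> float:
    """Probability that the fifth candy bar is sold at house x."""
    return comb(x - 1, R - 1) * P**R * (1 - P) ** (x - R)


p_tenth = f(10)                                             # finishes exactly at house 10
p_by_eighth = sum(f(j) for j in range(5, 9))                # finishes at house 5, 6, 7 or 8
p_unfinished = 1 - sum(f(j) for j in range(5, HOUSES + 1))  # runs out of houses

print(round(p_tenth, 4))       # 0.1003
print(round(p_by_eighth, 4))   # 0.1737
print(round(p_unfinished, 4))  # 0.0015
```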
Properties
If X_r is a random variable following the negative binomial distribution with parameters r and p, then X_r is a sum of r independent variables following the geometric distribution with parameter p. By the central limit theorem, X_r is therefore approximately normal for sufficiently large r.
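The sum-of-geometrics property can be illustrated by convolving the geometric mass function with itself r times and comparing the result with the negative binomial mass function. The Python sketch below does this on a truncated range; the helper names, the truncation point n = 40, and the parameters r = 4, p = 0.3 are assumptions made for the illustration. Truncation does not affect the comparison, because the convolution at a point x only involves terms with indices at most x.

```python
from math import comb


def geometric_pmf(x: int, p: float) -> float:
    """P(first success occurs on trial x), for x >= 1."""
    return p * (1 - p) ** (x - 1) if x >= 1 else 0.0


def nb_pmf(x: int, r: int, p: float) -> float:
    """P(r-th success occurs on trial x), for x >= r."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


def convolve(a: list, b: list, n: int) -> list:
    """Mass function of the sum of two independent variables on 0..n."""
    out = [0.0] * (n + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= n:
                out[i + j] += ai * bj
    return out


def sum_of_geometrics_pmf(r: int, p: float, n: int) -> list:
    """Mass function on 0..n of a sum of r independent geometric variables."""
    geo = [geometric_pmf(x, p) for x in range(n + 1)]
    total = geo
    for _ in range(r - 1):
        total = convolve(total, geo, n)
    return total


r, p, n = 4, 0.3, 40  # illustrative values
conv = sum_of_geometrics_pmf(r, p, n)
max_gap = max(abs(conv[x] - nb_pmf(x, r, p)) for x in range(n + 1))
print(max_gap)  # differs from zero only by floating-point rounding
```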
Furthermore, if Y_s is a random variable following the binomial distribution with parameters s and p, then
- Pr[X_r ≤ s] = Pr[Y_s ≥ r] = Pr["after s trials, there are at least r successes"]
In this sense, the negative binomial distribution is the "inverse" of the binomial distribution. Every question about
probabilities of negative binomial variables can be translated into an equivalent one about binomial variables.
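The translation between the two distributions can be checked directly by computing both sides of the identity. The following Python sketch does so for a few parameter triples; the function names and the chosen triples are illustrative assumptions, not part of the original text.

```python
from math import comb


def nb_cdf(s: int, r: int, p: float) -> float:
    """Pr[X_r <= s]: the r-th success occurs on or before trial s."""
    return sum(
        comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r) for x in range(r, s + 1)
    )


def binom_upper_tail(s: int, r: int, p: float) -> float:
    """Pr[Y_s >= r]: at least r successes in s trials."""
    return sum(comb(s, k) * p**k * (1 - p) ** (s - k) for k in range(r, s + 1))


# Illustrative (r, s, p) triples.
cases = [(5, 8, 0.4), (5, 30, 0.4), (2, 10, 0.25)]
gaps = [abs(nb_cdf(s, r, p) - binom_upper_tail(s, r, p)) for r, s, p in cases]
print(gaps)  # each entry differs from zero only by floating-point rounding
```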
The negative binomial distribution also arises as a continuous mixture of Poisson distributions for which the Poisson parameter λ was generated by a Gamma distribution.
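The mixture statement can be made concrete with a numerical check. The sketch below assumes a particular parametrization that the text does not spell out: λ follows a Gamma distribution with shape r and rate p/(1 − p), and the resulting mixture is compared with the negative binomial distribution expressed as the number of failures K = X − r before the rth success, whose support starts at 0. The integration limits, step count, and function names are likewise choices made for this illustration; the integral is approximated with a simple midpoint rule.

```python
from math import comb, exp, lgamma, log


def mixture_pmf(k: int, r: int, p: float, upper: float = 60.0, steps: int = 20000) -> float:
    """Approximate P(K = k) for K | lam ~ Poisson(lam), lam ~ Gamma(shape r, rate beta)."""
    beta = p / (1 - p)  # assumed Gamma rate parameter
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h  # midpoint of the i-th subinterval
        log_poisson = -lam + k * log(lam) - lgamma(k + 1)
        log_gamma = r * log(beta) + (r - 1) * log(lam) - beta * lam - lgamma(r)
        total += exp(log_poisson + log_gamma)
    return total * h


def nb_failures_pmf(k: int, r: int, p: float) -> float:
    """P(K = k): exactly k failures occur before the r-th success."""
    return comb(k + r - 1, k) * p**r * (1 - p) ** k


r, p = 3, 0.4  # illustrative values
gaps = [abs(mixture_pmf(k, r, p) - nb_failures_pmf(k, r, p)) for k in range(6)]
print(max(gaps))  # small; limited by the accuracy of the numerical integration
```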
Explanation of the name
Suppose X is a random variable with a negative binomial distribution with parameters r and p. The statement that the sum from x = r to infinity, of the probability Pr[X = x], is equal to 1, can be shown by a bit of algebra to be equivalent to the statement that (1 − p)^(−r) is what Newton's binomial theorem says it should be.
Suppose Y is a random variable with a binomial distribution with parameters n and p. The statement that the sum from y = 0 to n, of the probability Pr[Y = y], is equal to 1, says that 1 = (p + (1 − p))^n is what the strictly finitary binomial theorem of high-school algebra says it should be.
Thus the negative binomial distribution bears the same relationship to the negative-integer-exponent case of the binomial
theorem that the binomial distribution bears to the positive-integer-exponent case.
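The "bit of algebra" for the negative binomial case can be written out as follows. This is a sketch of one standard route, using the substitutions k = x − r and q = 1 − p, which are notation introduced here rather than in the text above.

```latex
\begin{align*}
\sum_{x=r}^{\infty} \binom{x-1}{r-1} p^{r} (1-p)^{x-r}
  &= p^{r} \sum_{k=0}^{\infty} \binom{k+r-1}{k} q^{k}
     && (k = x - r,\ q = 1 - p) \\
  &= p^{r} \sum_{k=0}^{\infty} \binom{-r}{k} (-q)^{k}
     && \text{since } \binom{-r}{k} = (-1)^{k} \binom{k+r-1}{k} \\
  &= p^{r} (1 - q)^{-r}
     && \text{(Newton's binomial theorem, } |q| < 1\text{)} \\
  &= p^{r} \, p^{-r} = 1 .
\end{align*}
```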