Information about Negative Binomial Distribution

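Negative binomial
Figures: probability mass function (the red line represents the mean, and the green line has an approximate length of 2σ); cumulative distribution function.
Parameters: $r > 0$ (real); $0 < p < 1$ (real)
Support: $k \in \{0, 1, 2, \ldots\}$
Probability mass function (pmf): $\frac{\Gamma(r+k)}{k!\,\Gamma(r)}\, p^r (1-p)^k$
Cumulative distribution function (cdf): $I_p(r, k+1)$, where $I$ is the regularized incomplete beta function
Mean: $\frac{r(1-p)}{p}$
Median: no simple closed form
Mode: $\lfloor (r-1)(1-p)/p \rfloor$ if $r > 1$; $0$ if $r \le 1$
Variance: $\frac{r(1-p)}{p^2}$
Skewness: $\frac{2-p}{\sqrt{r(1-p)}}$
Excess kurtosis: $\frac{6}{r} + \frac{p^2}{r(1-p)}$
Entropy: no simple closed form
Moment-generating function (mgf): $\left(\frac{p}{1-(1-p)e^{t}}\right)^{r}$ for $t < -\ln(1-p)$
Characteristic function: $\left(\frac{p}{1-(1-p)e^{it}}\right)^{r}$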
In probability and statistics, the negative binomial distribution is a discrete probability distribution. The Pascal distribution and the Pólya distribution are special cases of the negative binomial. Engineers, climatologists, and others conventionally reserve "negative binomial" in a strict sense, or "Pascal" (after Blaise Pascal), for the case of an integer-valued parameter r, and use "Pólya" (after George Pólya) for the real-valued case. The Pólya distribution models occurrences of "contagious" discrete events, such as tornado outbreaks, more accurately than the Poisson distribution does.

Specification of the negative binomial distribution

Probability mass function

The family of negative binomial distributions is a two-parameter family; several parameterizations are in common use. One very common parameterization employs two real-valued parameters p and r with 0 < p < 1 and r > 0. Under this parameterization, the probability mass function of a random variable with a NegBin(r, p) distribution takes the following form:

$$f(k; r, p) = \frac{\Gamma(r+k)}{k!\,\Gamma(r)}\, p^r (1-p)^k$$

for k = 0, 1, 2, ... (Γ is the gamma function).
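The mass function under the (r, p) parameterization can be evaluated directly with the log-gamma function. The following is a minimal sketch using only the Python standard library; the function name `negbin_pmf` and the parameter values are illustrative assumptions, not part of the original article. It checks that the probabilities sum to about 1 and that the numerically computed mean agrees with the standard result r(1 − p)/p.

```python
import math

def negbin_pmf(k: int, r: float, p: float) -> float:
    """P(X = k) for NegBin(r, p): Gamma(r+k) / (k! Gamma(r)) * p^r * (1-p)^k."""
    if k < 0:
        return 0.0
    # Work in log space so that large k or r do not overflow the gamma function.
    log_coeff = math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log(p) + k * math.log1p(-p))

# Hypothetical parameter values, chosen only for illustration.
r, p = 2.5, 0.3
total = sum(negbin_pmf(k, r, p) for k in range(2000))
mean = sum(k * negbin_pmf(k, r, p) for k in range(2000))

print(total)                   # should be very close to 1
print(mean, r * (1 - p) / p)   # the two values should agree closely
```

Truncating the sums at 2000 terms is safe here because the tail decays geometrically at rate 1 − p.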

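Under an alternative parameterization, let

$$p = \frac{\omega}{\lambda + \omega}$$

and

$$r = \omega,$$

and so the mass function becomes

$$g(k) = \frac{\lambda^k}{k!} \cdot \frac{\Gamma(\omega+k)}{\Gamma(\omega)\,(\lambda+\omega)^k} \cdot \frac{1}{\left(1 + \lambda/\omega\right)^{\omega}}$$

where λ and ω are positive real parameters. The mean is then λ and the variance is λ + λ²/ω. Under this parameterization, we have

$$\lim_{\omega \to \infty} g(k) = \frac{\lambda^k}{k!} \cdot 1 \cdot \frac{1}{e^{\lambda}},$$

which is precisely the mass function of a Poisson-distributed random variable with Poisson rate λ. In other words, the alternatively parameterized negative binomial distribution converges to the Poisson distribution as ω grows, and ω controls the deviation from the Poisson. This makes the negative binomial distribution suitable as a robust alternative to the Poisson: it approaches the Poisson for large ω but has larger variance than the Poisson for small ω.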
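The convergence to the Poisson distribution can be observed numerically. The sketch below assumes the substitution p = ω/(λ + ω), r = ω for the alternative parameterization; the helper names and the chosen values of λ and ω are hypothetical. It measures the largest pointwise gap between the two mass functions as ω increases.

```python
import math

def negbin_pmf_alt(k: int, lam: float, omega: float) -> float:
    """Negative binomial pmf with r = omega and p = omega / (lam + omega) (assumed mapping)."""
    log_coeff = math.lgamma(omega + k) - math.lgamma(k + 1) - math.lgamma(omega)
    log_p = math.log(omega / (lam + omega))   # log of the success probability
    log_q = math.log(lam / (lam + omega))     # log of the failure probability
    return math.exp(log_coeff + omega * log_p + k * log_q)

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson pmf, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def max_gap(lam: float, omega: float, terms: int = 60) -> float:
    """Largest absolute difference between the two pmfs over k = 0..terms-1."""
    return max(abs(negbin_pmf_alt(k, lam, omega) - poisson_pmf(k, lam))
               for k in range(terms))

lam = 3.0                                   # illustrative Poisson rate
omegas = (1.0, 10.0, 100.0, 10000.0)        # illustrative values of omega
gaps = [max_gap(lam, w) for w in omegas]
for w, g in zip(omegas, gaps):
    print(w, g)   # the gap shrinks as omega grows
```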

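The negative binomial distribution also arises as a continuous mixture of Poisson distributions where the mixing distribution of the Poisson rate is a gamma distribution. Formally, this means that the mass function of the negative binomial distribution can also be written as

$$f(k; r, p) = \int_0^\infty \frac{\lambda^k e^{-\lambda}}{k!} \cdot \frac{\lambda^{r-1} e^{-\lambda p/(1-p)}}{\Gamma(r)\left(\frac{1-p}{p}\right)^{r}}\, d\lambda = \frac{\Gamma(r+k)}{k!\,\Gamma(r)}\, p^r (1-p)^k,$$

where the first factor in the integrand is the Poisson mass function with rate λ and the second is the gamma density with shape r and scale (1 − p)/p.

Because of this, the negative binomial distribution is also known as the gamma-Poisson (mixture) distribution.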
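The mixture representation can be checked by numerical integration. This is a rough sketch, assuming a gamma mixing density with shape r and scale (1 − p)/p and using a plain midpoint rule; the function names, the integration cutoff, and the parameter values are assumptions made for illustration.

```python
import math

def negbin_pmf(k: int, r: float, p: float) -> float:
    """Closed-form negative binomial pmf."""
    log_coeff = math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log(p) + k * math.log1p(-p))

def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def gamma_pdf(x: float, shape: float, scale: float) -> float:
    return math.exp((shape - 1) * math.log(x) - x / scale
                    - math.lgamma(shape) - shape * math.log(scale))

def mixture_pmf(k: int, r: float, p: float,
                upper: float = 100.0, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the Poisson-gamma mixture integral on (0, upper)."""
    scale = (1 - p) / p
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h          # midpoint of the i-th subinterval
        total += poisson_pmf(k, lam) * gamma_pdf(lam, r, scale)
    return total * h

r, p = 2.5, 0.3                      # illustrative values
diffs = [abs(mixture_pmf(k, r, p) - negbin_pmf(k, r, p)) for k in range(5)]
print(max(diffs))                    # should be small
```

The cutoff at 100 is adequate for these values because the integrand decays exponentially; heavier-tailed choices of r and p would need a larger cutoff.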

Cumulative distribution function

The cumulative distribution function can be expressed in terms of the regularized incomplete beta function:

$$F(k; r, p) = \Pr(X \le k) = I_p(r, k+1).$$

Occurrence

Waiting time in a Bernoulli process

The NegBin(r, p) distribution is the probability distribution of a certain number of failures and successes in a series of independent and identically distributed Bernoulli trials. Specifically, for k+r Bernoulli trials with success probability p, the negative binomial gives the probability of k failures and r successes, with success on the last trial. In other words, the negative binomial distribution is the probability distribution of the number of failures before the rth success in a Bernoulli process, with probability p of success on each trial.

Consider the following example. Suppose we repeatedly throw a die, and consider a "1" to be a "success". The probability of success on each trial is 1/6. The number of trials needed to get three successes belongs to the infinite set { 3, 4, 5, 6, ... }. That number of trials is a (displaced) negative-binomially distributed random variable. The number of failures before the third success belongs to the infinite set { 0, 1, 2, 3, ... }. That number of failures is also a negative-binomially distributed random variable.
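The die example can be worked out exactly with rational arithmetic. The sketch below is illustrative only (the function names are not from the original); it uses the binomial-coefficient form of the mass function, which applies because r = 3 is an integer, and shows that the distribution of the number of trials is the distribution of the number of failures shifted by 3.

```python
from fractions import Fraction
from math import comb

P_SUCCESS = Fraction(1, 6)   # probability of rolling a "1"
R = 3                        # number of successes to wait for

def pmf_failures(k: int) -> Fraction:
    """P(exactly k failures occur before the third success)."""
    if k < 0:
        return Fraction(0)
    return comb(k + R - 1, k) * P_SUCCESS**R * (1 - P_SUCCESS)**k

def pmf_trials(n: int) -> Fraction:
    """P(the third success occurs on trial n): pmf_failures shifted by R."""
    return pmf_failures(n - R)

print(pmf_trials(3))   # 1/216
print(pmf_trials(4))   # 5/432
```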

A Bernoulli process is a discrete-time process, and so the numbers of trials, failures, and successes are integers. For the special case where r is an integer, the negative binomial distribution is known as the Pascal distribution. In this case the gamma function is not needed to express the probability mass function, and factorials or binomial coefficients can be used instead:

$$f(k; r, p) = \binom{k+r-1}{k}\, p^r (1-p)^k = \binom{k+r-1}{r-1}\, p^r (1-p)^k.$$

A further specialization occurs when r = 1: in this case we get the probability distribution of failures before the first success (i.e. the probability of success on the (k+1)th trial), which is a geometric distribution. To wit:

$$f(k; 1, p) = p\,(1-p)^k.$$

Overdispersed Poisson

The negative binomial distribution, especially in its alternative parameterization described above, can be used as an alternative to the Poisson distribution. It is especially useful for discrete data over an unbounded positive range whose sample variance exceeds the sample mean. If a Poisson distribution is used to model such data, the model mean and variance are equal. In that case, the observations are overdispersed with respect to the Poisson model. Since the negative binomial distribution has one more parameter than the Poisson, the second parameter can be used to adjust the variance independently of the mean...
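One simple way to use the extra parameter is a method-of-moments fit, which matches the sample mean m and sample variance v to the negative binomial mean r(1 − p)/p and variance r(1 − p)/p². Solving gives p = m/v and r = m²/(v − m). The sketch below is one possible illustration, not a method prescribed by the article; the function name and the data are hypothetical.

```python
from statistics import mean, variance

def fit_negbin_moments(counts):
    """Method-of-moments estimates (r, p) for overdispersed count data."""
    m = mean(counts)
    v = variance(counts)          # sample variance (n - 1 denominator)
    if v <= m:
        # No valid negative binomial fit: the data are not overdispersed.
        raise ValueError("sample variance must exceed sample mean")
    p_hat = m / v
    r_hat = m * m / (v - m)
    return r_hat, p_hat

# Hypothetical overdispersed counts, for illustration only.
counts = [0, 1, 1, 2, 3, 5, 8, 0, 2, 13]
r_hat, p_hat = fit_negbin_moments(counts)

# The fitted distribution reproduces the sample mean and variance.
fitted_mean = r_hat * (1 - p_hat) / p_hat
fitted_var = r_hat * (1 - p_hat) / p_hat**2
print(r_hat, p_hat, fitted_mean, fitted_var)
```

Moment estimates are easy to compute but can be unstable when the variance is only slightly larger than the mean; maximum likelihood is a common alternative.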

Related distributions

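$\mathrm{Geometric}(p) = \mathrm{NegBin}(1, p)$
$\mathrm{Poisson}(\lambda) = \lim_{r \to \infty} \mathrm{NegBin}\!\left(r, \frac{r}{\lambda + r}\right)$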

Properties

Relation to other distributions

If $X_r$ is a random variable following the negative binomial distribution with parameters r and p, where r is a positive integer, then $X_r$ is a sum of r independent variables following the geometric distribution with parameter p. As a result of the central limit theorem, $X_r$ is therefore approximately normal for sufficiently large r.

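Furthermore, if $Y_{s+r}$ is a random variable following the binomial distribution with parameters s + r and p, then

$$\Pr(X_r \le s) = I_p(r, s+1) = 1 - \Pr(Y_{s+r} \le r-1) = \Pr(Y_{s+r} \ge r),$$

that is, the probability that after s + r trials there are at least r successes. In this sense, the negative binomial distribution is the "inverse" of the binomial distribution.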
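The relation to the binomial distribution can be verified numerically: at most s failures occur before the rth success exactly when at least r of the first s + r trials are successes. The following sketch assumes integer r and uses illustrative parameter values and function names.

```python
from math import comb

def negbin_cdf(s: int, r: int, p: float) -> float:
    """P(at most s failures before the r-th success), by summing the pmf."""
    return sum(comb(k + r - 1, k) * p**r * (1 - p)**k for k in range(s + 1))

def binom_upper_tail(n: int, r: int, p: float) -> float:
    """P(at least r successes in n Bernoulli(p) trials)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(r, n + 1))

r, p = 3, 0.4                        # illustrative values
gaps = [abs(negbin_cdf(s, r, p) - binom_upper_tail(s + r, r, p))
        for s in range(25)]
print(max(gaps))                     # should be at the level of rounding error
```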

The sum of independent negative-binomially distributed random variables with the same value of the parameter p but the "r-values" r1 and r2 is negative-binomially distributed with the same p but with "r-value" r1 + r2.
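The additivity of the "r-values" can be checked by convolving two mass functions. The sketch below deliberately uses non-integer r-values (an assumption made for illustration) to show that the property is not limited to the Pascal case; because both variables are nonnegative, the first N terms of the convolution are exact up to rounding.

```python
import math

def negbin_pmf(k: int, r: float, p: float) -> float:
    log_coeff = math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log(p) + k * math.log1p(-p))

def convolve(a, b):
    """pmf of the sum of two independent nonnegative counts, for k < min(len(a), len(b))."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

N = 60
p = 0.35
r1, r2 = 0.7, 1.8                    # hypothetical r-values
pmf1 = [negbin_pmf(k, r1, p) for k in range(N)]
pmf2 = [negbin_pmf(k, r2, p) for k in range(N)]

pmf_sum = convolve(pmf1, pmf2)
pmf_direct = [negbin_pmf(k, r1 + r2, p) for k in range(N)]
max_diff = max(abs(x - y) for x, y in zip(pmf_sum, pmf_direct))
print(max_diff)                      # should be at the level of rounding error
```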

The negative binomial distribution is infinitely divisible, i.e., if X has a negative binomial distribution, then for any positive integer n, there exist independent identically distributed random variables X1, ..., Xn whose sum has the same distribution that X has. These will not be negative-binomially distributed in the sense defined above unless n is a divisor of r (more on this below).

Sampling and point estimation of p

Suppose p is unknown and an experiment is conducted where it is decided ahead of time that sampling will continue until r successes are found. A sufficient statistic for the experiment is k, the number of failures.

In estimating p, the minimum variance unbiased point estimator is $\hat{p} = \frac{r-1}{r+k-1}$. One might think the estimator is $\frac{r}{r+k}$, but this is biased (see the article by Haldane).
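The bias claim can be checked by computing each estimator's expectation under the sampling distribution of k. This sketch assumes the estimators (r − 1)/(r + k − 1) and r/(r + k), with integer r ≥ 2; the parameter values and helper names are illustrative.

```python
from math import comb

def negbin_pmf(k: int, r: int, p: float) -> float:
    return comb(k + r - 1, k) * p**r * (1 - p)**k

def expectation(estimator, r: int, p: float, terms: int = 600) -> float:
    """E[estimator(K, r)] for K ~ NegBin(r, p), truncating the negligible tail."""
    return sum(estimator(k, r) * negbin_pmf(k, r, p) for k in range(terms))

r, p = 5, 0.4                        # illustrative true values
e_unbiased = expectation(lambda k, r: (r - 1) / (r + k - 1), r, p)
e_naive = expectation(lambda k, r: r / (r + k), r, p)

print(e_unbiased)   # should match p up to rounding
print(e_naive)      # noticeably larger than p
```

The naive estimator overshoots p on average, which follows from Jensen's inequality applied to the convex function 1/x.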

Relation to the binomial theorem

Suppose X is a random variable with a negative binomial distribution with parameters r and p. The statement that the sum from x = 0 to infinity of the probability Pr[X = x] is equal to 1 can be shown by a bit of algebra to be equivalent to the statement that $(1-p)^{-r}$ is what Newton's binomial theorem says it should be.

Suppose Y is a random variable with a binomial distribution with parameters n and p. The statement that the sum from y = 0 to n of the probability Pr[Y = y] is equal to 1 says that $1 = (p + (1-p))^n$ is what the strictly finitary binomial theorem of rudimentary algebra says it should be.

Thus the negative binomial distribution bears the same relationship to the negative-integer-exponent case of the binomial theorem that the binomial distribution bears to the positive-integer-exponent case.

Assume p + q = 1. Then the binomial theorem of elementary algebra implies that

$$1 = 1^n = (p+q)^n = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x}.$$

This can be written in a way that may at first appear to some to be incorrect, and perhaps perverse even if correct:

$$(p+q)^n = \sum_{x=0}^{\infty} \binom{n}{x} p^x q^{n-x},$$

in which the upper bound of summation is infinite. If the binomial coefficient is defined by

$$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$

then it does not make sense when x > n, since factorials of negative numbers are not defined. But one may also read it as

$$\binom{n}{x} = \frac{n(n-1)\cdots(n-x+1)}{x!}.$$

In that case it is defined even when n is negative or is not an integer. But in our case of the binomial distribution it is zero when x > n. So why would we write the result in that form, with a seemingly needless sum of infinitely many zeros? The answer comes when we generalize the binomial theorem of elementary algebra to Newton's binomial theorem. Then we can say, for example, when p < q,

$$(p+q)^{8.3} = \sum_{x=0}^{\infty} \binom{8.3}{x} p^x q^{8.3-x}.$$

Now suppose r > 0 and we use a negative exponent:

$$1 = p^r \cdot p^{-r} = p^r (1-q)^{-r} = p^r \sum_{x=0}^{\infty} \binom{-r}{x} (-q)^x.$$

Then all of the terms are positive, and the term

$$p^r \binom{-r}{x} (-q)^x$$

is just the probability that the number of failures before the rth success is equal to x, provided r is an integer. (If r is a negative non-integer, so that the exponent is a positive non-integer, then some of the terms in the sum above are negative, so we do not have a probability distribution on the set of all nonnegative integers.)

Now we also allow non-integer values of r. Then we have a proper negative binomial distribution, which is a generalization of the Pascal distribution, which coincides with the Pascal distribution when r happens to be a positive integer.
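The generalized binomial coefficient makes this concrete. The sketch below computes the coefficient as the falling product a(a − 1)···(a − x + 1)/x! and confirms, for an illustrative non-integer r, that each term of Newton's series multiplied by p^r matches the gamma-function form of the mass function and that the terms sum to about 1. The names `gen_binom` and `newton_term` are hypothetical.

```python
import math

def gen_binom(a: float, x: int) -> float:
    """Generalized binomial coefficient a(a-1)...(a-x+1) / x! for real a."""
    result = 1.0
    for i in range(x):
        result *= (a - i) / (i + 1)
    return result

def newton_term(x: int, r: float, p: float) -> float:
    """p^r * C(-r, x) * (-q)^x, with q = 1 - p."""
    q = 1.0 - p
    return p**r * gen_binom(-r, x) * (-q)**x

def negbin_pmf(k: int, r: float, p: float) -> float:
    log_coeff = math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log(p) + k * math.log1p(-p))

r, p = 2.5, 0.3                      # illustrative non-integer r
terms = [newton_term(x, r, p) for x in range(400)]
max_diff = max(abs(terms[x] - negbin_pmf(x, r, p)) for x in range(50))

print(min(terms) > 0)                # every term is positive
print(sum(terms))                    # close to 1
print(max_diff)                      # agreement with the gamma-function form
```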

Recall from above that

The sum of independent negative-binomially distributed random variables with the same value of the parameter p but the "r-values" r1 and r2 is negative-binomially distributed with the same p but with "r-value" r1 + r2.


This property persists when the definition is thus generalized, and affords a quick way to see that the negative binomial distribution is infinitely divisible.

Examples

(After a problem by Dr. Diane Evans, professor of mathematics at Rose-Hulman Institute of Technology)

Pat is required to sell candy bars to raise money for the 6th grade field trip. There are thirty houses in the neighborhood, and Pat is not supposed to return home until five candy bars have been sold. So the child goes door to door, selling candy bars. At each house, there is a 0.4 probability of selling one candy bar and a 0.6 probability of selling nothing.

What's the probability mass function for selling the last candy bar at the nth house?

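Recall that the NegBin(r, p) distribution describes the probability of k failures and r successes in k+r Bernoulli(p) trials with success on the last trial. Selling five candy bars means getting five successes. The number of trials (i.e. houses) this takes is therefore k+5 = n. The random variable we are interested in is the number of houses, so we substitute k = n − 5 into a NegBin(5, 0.4) mass function and obtain the following mass function of the distribution of houses (for n ≥ 5):

$$f(n) = \binom{(n-5)+5-1}{n-5}\, 0.4^5\, 0.6^{n-5} = \binom{n-1}{n-5}\, 0.4^5\, 0.6^{n-5}.$$

What's the probability that Pat finishes on the tenth house?

$$f(10) = \binom{9}{5}\, 0.4^5\, 0.6^5 \approx 0.10033.$$

What's the probability that Pat finishes on or before reaching the eighth house?

To finish on or before the eighth house, Pat must finish at the fifth, sixth, seventh, or eighth house. Sum those probabilities:

$$f(5) + f(6) + f(7) + f(8) = 0.01024 + 0.03072 + 0.055296 + 0.0774144 \approx 0.17367.$$

What's the probability that Pat exhausts all 30 houses in the neighborhood?

This is the probability that Pat does not finish on the fifth through the thirtieth house:

$$1 - \sum_{n=5}^{30} f(n) = 1 - I_{0.4}(5, 30-5+1) \approx 1 - 0.99849 = 0.00151.$$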
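The three answers in the candy bar example can be reproduced with a few lines of code. This is a minimal sketch; the function name `pmf_houses` is an illustrative choice, and "exhausting all 30 houses" is read here as not finishing on any of houses 5 through 30.

```python
from math import comb

R, P = 5, 0.4          # five sales needed, 0.4 chance of a sale per house

def pmf_houses(n: int) -> float:
    """P(the fifth candy bar is sold at house n)."""
    if n < R:
        return 0.0
    return comb(n - 1, R - 1) * P**R * (1 - P)**(n - R)

p_tenth = pmf_houses(10)
p_by_eighth = sum(pmf_houses(n) for n in range(5, 9))
p_exhaust = 1 - sum(pmf_houses(n) for n in range(5, 31))

print(round(p_tenth, 5))       # 0.10033
print(round(p_by_eighth, 5))   # 0.17367
print(round(p_exhaust, 5))     # 0.00151
```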







This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.