Binomial Distribution

The binomial distribution is the distribution of the total number of successes in a given number of Bernoulli trials. The common notation is b(k; n, p), where k is the number of successes, n is the number of trials, and p is the probability of success in a single trial. We know that b(k; n, p) = C(n, k) p^k (1 - p)^{n - k}.
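As a quick illustration, the formula can be evaluated directly. The sketch below is only a minimal one; the function name binomial_pmf is an assumption made for this example and does not come from the text.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return b(k; n, p) = C(n, k) p^k (1 - p)^(n - k)."""
    if not 0 <= k <= n:
        return 0.0  # outside 0..n there are no ways to get k successes
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probabilities of 0..4 successes in 4 trials with p = 1/2.
probs = [binomial_pmf(k, 4, 0.5) for k in range(5)]
print(probs)  # [0.0625, 0.25, 0.375, 0.25, 0.0625]
```

The values sum to 1, as they must for a probability distribution.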

For a fixed number of trials n, the binomial distribution always behaves in the same way: as a function of k, it is monotone increasing up to a certain point m, after which (with the possible exception of a tie between two adjacent points at the peak) it is monotone decreasing.

Indeed,

	b(k; n, p) / b(k - 1; n, p)	= C(n, k) p^k (1 - p)^{n - k} / [C(n, k - 1) p^{k - 1} (1 - p)^{n - k + 1}]
(1)		= (n - k + 1) p / [k (1 - p)]
		= 1 + ((n + 1) p - k) / [k (1 - p)].

It is clear now that the right hand side in (1) is greater than 1 whenever k < (n + 1) p and less than 1 whenever k > (n + 1) p. It may happen, of course, that m = (n + 1) p is an integer, in which case the ratio in (1) equals 1 at k = m and

b(m; n, p) = b(m - 1; n, p).
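The ratio formula (1) and the tie case can be checked with exact rational arithmetic. This is a sketch under stated assumptions: the helper b and the choice n = 5, p = 1/3 (for which (n + 1) p = 2 is an integer) are made up for this example.

```python
from fractions import Fraction
from math import comb

def b(k, n, p):
    """Exact b(k; n, p) for a rational p given as a Fraction."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, Fraction(1, 3)  # assumed example values; (n + 1) p = 2

# Compare the ratio of consecutive terms with the closed form in (1).
for k in range(1, n + 1):
    ratio = b(k, n, p) / b(k - 1, n, p)
    closed_form = 1 + ((n + 1) * p - k) / (k * (1 - p))
    assert ratio == closed_form

m = (n + 1) * p
tie = b(2, n, p) == b(1, n, p)  # b(m) versus b(m - 1) with m = 2
print(m, tie)  # 2 True
```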

In any event, there is only one integer m that satisfies

(2)	(n + 1) p - 1 < m ≤ (n + 1) p.

Summing up, as a function of k, the expression b(k; n, p) is monotone increasing up to k = m and monotone decreasing thereafter, with the exception of one case where (n + 1) p is an integer. In this case, the maximum value is attained at two points, k = m = (n + 1) p and k = m - 1.
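The integer m in (2) is the integer part of (n + 1) p, so it is easy to compare with a brute-force search for the largest term. The following is a minimal sketch assuming 0 < p < 1; the names most_probable and argmax_set are hypothetical, and p is passed as a Fraction to avoid rounding trouble when (n + 1) p is an integer.

```python
from fractions import Fraction
from math import comb, floor

def b(k, n, p):
    """Exact b(k; n, p) for a rational p given as a Fraction."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def most_probable(n, p):
    """The unique integer m with (n + 1) p - 1 < m <= (n + 1) p."""
    return floor((n + 1) * p)

def argmax_set(n, p):
    """All k in 0..n at which b(k; n, p) attains its maximum."""
    values = [b(k, n, p) for k in range(n + 1)]
    top = max(values)
    return [k for k, v in enumerate(values) if v == top]

# (n + 1) p = 3.3 is not an integer: a single maximum at m = 3.
print(most_probable(10, Fraction(3, 10)), argmax_set(10, Fraction(3, 10)))  # 3 [3]
# (n + 1) p = 2 is an integer: maxima at m - 1 = 1 and m = 2.
print(most_probable(5, Fraction(1, 3)), argmax_set(5, Fraction(1, 3)))      # 2 [1, 2]
```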

The number m that satisfies (2) is known as the most probable (most likely) number of successes in n Bernoulli trials. As a matter of fact, b(m; n, p) is quite small for a large n, even for a reasonable value of p. The most probable number need not coincide with the average number of successes, although the two agree whenever np is an integer and 0 < p < 1. The average can be found in the following way. Letting q = 1 - p, we get

	∑ k b(k; n, p)	= ∑ k C(n, k) p^k q^{n - k}
		= p q^{n - 1} ∑ k C(n, k) (p/q)^{k - 1}
		= p q^{n - 1} n (1 + p/q)^{n - 1}
		= p q^{n - 1} n (q + p)^{n - 1} / q^{n - 1}
		= p n,

where we used the identity

n (1 + x)^{n - 1} = ∑ k C(n, k) x^{k - 1},

obtained by differentiating the binomial expansion (1 + x)^n = ∑ C(n, k) x^k with respect to x.

The result for the expected value np = ∑ k b(k; n, p) might have been anticipated given the interpretation of the probability as a relative frequency.
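The equality of the sum ∑ k b(k; n, p) and np can also be confirmed numerically. This is a sketch only: the helper names and the values n = 12, p = 2/7 are assumptions chosen for illustration, and exact fractions are used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from math import comb

def b(k, n, p):
    """Exact b(k; n, p) for a rational p given as a Fraction."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def mean_successes(n, p):
    """Expected number of successes, summed directly from the definition."""
    return sum(k * b(k, n, p) for k in range(n + 1))

n, p = 12, Fraction(2, 7)  # assumed example values
print(mean_successes(n, p), n * p)  # 24/7 24/7
```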

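Note that the expected value np is always different from the quantity (n + 1) p that locates the most likely value, provided of course p ≠ 0. The most likely value m itself, being the integer defined by (2), equals np exactly when np is an integer (for 0 < p < 1).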
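The integer m defined by (2) should be kept apart from the quantity (n + 1) p in this comparison. The short sketch below, with a hypothetical helper most_probable and arbitrarily chosen sample values, tabulates m next to np for a few cases with 0 < p < 1.

```python
from fractions import Fraction
from math import floor

def most_probable(n, p):
    """The unique integer m with (n + 1) p - 1 < m <= (n + 1) p."""
    return floor((n + 1) * p)

# Assumed sample cases: np is an integer in the first, not in the others.
cases = [(10, Fraction(1, 2)), (7, Fraction(1, 3)), (5, Fraction(1, 3))]
rows = [(n, p, most_probable(n, p), n * p) for n, p in cases]
for n, p, m, mean in rows:
    print(n, p, m, mean, m == mean)
# 10 1/2 5 5 True
# 7 1/3 2 7/3 False
# 5 1/3 2 5/3 False
```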
