Expectation
In a sample space of equiprobable outcomes, the probability of an event is the ratio of the number of favorable outcomes to the total size of the space. This means that the probabilities of events are defined in relation to each other. If there is a finite number of exhaustive and mutually exclusive events Ak, k = 1, 2, ..., K, with nk being the number of favorable outcomes in Ak, then
P(Ak) = nk / N,
where N = n1 + n2 + ... + nK.
Borrowing an example from the classic text by W. Feller, in a certain population, nk is the number of families with k children. N is then the total number of families and, assuming the most benign of circumstances, there are 2N adults. But how many children are there? To state the obvious, each of the nk families with k kids has k kids, so the number of kids in such families is k·nk. The total number of kids is the sum of such products over all the various family sizes:
T = 1·n1 + 2·n2 + ... + k·nk + ... + K·nK,
where K is the number of children in the largest family. On average, every family has
T / N = (1·n1 + 2·n2 + ... + K·nK) / N = 1·p1 + 2·p2 + ... + K·pK children,
where pk = nk / N is the probability for a family to have k children. This quantity is one of the most important in the theory of probabilities. We shall give a more general definition.
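The averaging above can be checked numerically. The sketch below uses made-up family counts (the dictionary `family_counts` and both helper functions are purely illustrative, not data or notation from Feller) and confirms that T / N coincides with the sum of the products k·pk.

```python
from fractions import Fraction

# Hypothetical counts: family_counts[k] = number of families with k children.
family_counts = {1: 30, 2: 40, 3: 20, 4: 10}

def average_children(counts):
    """Return T / N, the average number of children per family."""
    total_families = sum(counts.values())                   # N
    total_children = sum(k * n for k, n in counts.items())  # T
    return Fraction(total_children, total_families)

def weighted_sum(counts):
    """Return the sum of k * pk, where pk = nk / N."""
    total_families = sum(counts.values())
    return sum(k * Fraction(n, total_families) for k, n in counts.items())

print(average_children(family_counts))  # 21/10
print(weighted_sum(family_counts))      # 21/10
```

Exact fractions are used instead of floats so that the two quantities can be compared for equality without rounding concerns.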
Let X be a random variable that takes values xk with the probabilities pk. The sum
E(X) = Σ xk·pk
is known as the mathematical expectation of X (often also called the expected value or the mean).
As the above example of counting kids in families of various sizes shows, the mathematical expectation of a random variable (RV) is in a sense an average value of that random variable. For the die rolling experiment, let Y be the RV showing the top number of a die. Then
P(Y = k) = 1/6, k = 1, 2, ..., 6.
By definition,
E(Y) = 1·(1/6) + 2·(1/6) + ... + 6·(1/6) = (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5,
which is exactly the average, i.e., the arithmetic mean, of the numbers 1, 2, 3, 4, 5, 6.
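The definition translates directly into a few lines of code. The following is a minimal sketch; the function name `expectation` and the variable names are assumptions made for the illustration.

```python
from fractions import Fraction

def expectation(values, probs):
    """E(X) = sum of xk * pk for a discrete random variable."""
    if sum(probs) != 1:
        raise ValueError("probabilities must add up to 1")
    return sum(x * p for x, p in zip(values, probs))

faces = range(1, 7)
die_probs = [Fraction(1, 6)] * 6

print(expectation(faces, die_probs))  # 7/2
print(Fraction(sum(faces), 6))        # 7/2, the arithmetic mean of the faces
```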

Let's apply the notion of mathematical expectation to the example of a novice player seeking admittance to a tennis club. To be admitted, the fellow had to win two successive games in a series of three played alternately against members G (good) and T (top) of the club, in the order TGT or GTG. He beats G with probability g and T with probability t, where t < g.
We shall be looking for the expected number of wins. Using L for a loss and W for a win for the aspiring novice, we shall consider two sample spaces. Following Havil, the first space consists of the 8 possible outcomes of a sequence of three games:
LLL, LLW, LWL, LWW, WLL, WLW, WWL, WWW
Note, however, that in the sequences LLL, LLW, WLL, WLW the third game is superfluous, as the result of the first two makes it impossible for the fellow to win two successive games, whereas the third game is unnecessary in the last two sequences WWL, WWW because the first two wins already gain the fellow admittance to the club. This makes it possible and reasonable to consider a smaller sample space:
LL, LWL, LWW, WL, WW
For the sequence TGT we have the following probabilities:
Win/Loss sequence | Probability
---|---
LLL | (1 - t)(1 - g)(1 - t)
LLW | (1 - t)(1 - g)t
LWL | (1 - t)g(1 - t)
LWW | (1 - t)gt
WLL | t(1 - g)(1 - t)
WLW | t(1 - g)t
WWL | tg(1 - t)
WWW | tgt
for the first sample space and
Win/Loss sequence | Probability
---|---
LL | (1 - t)(1 - g)
LWL | (1 - t)g(1 - t)
LWW | (1 - t)gt
WL | t(1 - g)
WW | tg
for the second. In both cases, the probabilities add up to 1, as required. Choosing the easier way out, we verify this only for the latter:
(1 - t)(1 - g) + (1 - t)g(1 - t) + (1 - t)gt + t(1 - g) + tg
  = (1 - t)(1 - g) + [(1 - t)g(1 - t) + (1 - t)gt] + t(1 - g) + tg
  = (1 - t)(1 - g) + (1 - t)g + t(1 - g) + tg
  = [(1 - t)(1 - g) + t(1 - g)] + [(1 - t)g + tg]
  = (1 - g) + g
  = 1.
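Both sample spaces can also be generated mechanically. The sketch below is an illustration under assumed inputs: the helper names `full_space` and `reduced_space` and the sample values t = 1/5, g = 4/5 are not from the source. It builds the 8-outcome space for a given order of opponents, merges the outcomes that share a decisive two-game prefix, and checks that the probabilities add up to 1 in both cases.

```python
from fractions import Fraction
from itertools import product

def full_space(order, win_prob):
    """Map each three-game W/L string to its probability for the given order."""
    space = {}
    for outcome in product("LW", repeat=3):
        prob = Fraction(1)
        for opponent, result in zip(order, outcome):
            p = win_prob[opponent]
            prob *= p if result == "W" else 1 - p
        space["".join(outcome)] = prob
    return space

def reduced_space(order, win_prob):
    """Merge outcomes whose first two games already decide the matter."""
    space = {}
    for outcome, prob in full_space(order, win_prob).items():
        # Only after L followed by W does the third game matter.
        key = outcome if outcome.startswith("LW") else outcome[:2]
        space[key] = space.get(key, 0) + prob
    return space

# Illustrative values only; any 0 <= t, g <= 1 would do.
t, g = Fraction(1, 5), Fraction(4, 5)
win_prob = {"T": t, "G": g}

print(sorted(reduced_space("TGT", win_prob)))
print(sum(full_space("TGT", win_prob).values()))
print(sum(reduced_space("TGT", win_prob).values()))
```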
Now we introduce the random variable N that denotes the number of wins for the candidate. In the first case, N may be 0, 1, 2, or 3; in the second case, there are only three possible values: 0, 1, 2. The expectations E1 and E2 are
E1(N, TGT) = 0·(1 - t)(1 - g)(1 - t)
  + 1·(1 - t)(1 - g)t
  + 1·(1 - t)g(1 - t)
  + 2·(1 - t)gt
  + 1·t(1 - g)(1 - t)
  + 2·t(1 - g)t
  + 2·tg(1 - t)
  + 3·tgt
  = 2t + g
and, correspondingly,
E2(N, TGT) = 0·(1 - t)(1 - g)
  + 1·(1 - t)g(1 - t)
  + 2·(1 - t)gt
  + 1·t(1 - g)
  + 2·tg
  = t + g + tg - t²g.
Similarly,
E1(N, GTG) = t + 2g and E2(N, GTG) = t + g + tg - tg².
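The four closed forms can be checked by brute-force enumeration. In the sketch below, the function `expected_wins`, its `stop_when_decided` flag, and the sample values of t and g are assumptions introduced for the illustration. With the flag off, all three games are counted (E1); with it on, the third game is counted only after a loss followed by a win (E2).

```python
from fractions import Fraction
from itertools import product

def expected_wins(order, win_prob, stop_when_decided):
    """Expected number of wins over the games that are actually counted."""
    total = Fraction(0)
    for outcome in product("LW", repeat=3):
        prob = Fraction(1)
        for opponent, result in zip(order, outcome):
            p = win_prob[opponent]
            prob *= p if result == "W" else 1 - p
        counted = outcome
        # In the reduced space the third game counts only after L then W.
        if stop_when_decided and outcome[:2] != ("L", "W"):
            counted = outcome[:2]
        total += counted.count("W") * prob
    return total

# Illustrative values only, chosen so that t < g.
t, g = Fraction(1, 5), Fraction(4, 5)
win_prob = {"T": t, "G": g}

for order in ("TGT", "GTG"):
    e1 = expected_wins(order, win_prob, stop_when_decided=False)
    e2 = expected_wins(order, win_prob, stop_when_decided=True)
    print(order, e1, e2)
```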
Since t < g, we see that
E1(N, TGT) < E1(N, GTG),
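as expected (pun intended). For the reduced sample space, the difference is E2(N, TGT) - E2(N, GTG) = tg² - t²g = tg(g - t) > 0, so that
E2(N, TGT) > E2(N, GTG).
Counting all three games thus ameliorates the paradoxical situation that arose from the pure count of probabilities: although the probability of gaining the membership is larger when playing the top member first than when playing first just a good member, the expected number of wins over three games is greater when the confrontation with the top player is postponed. When play stops as soon as the outcome is decided, the order TGT gives the larger expected number of wins, in agreement with the probabilities.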

The expectation has several algebraic properties that make it a linear function:
E(X + Y) = E(X) + E(Y) and E(αX) = αE(X),
where X and Y are RVs and α is a real number. For a constant random variable C that only takes on the value c, the expectation is exactly that value: E(C) = c. As a consequence, the RV Y = X - E(X) has zero expectation:
E(Y) = E(X - E(X)) = E(X) - E(E(X)) = E(X) - E(X) = 0,
since E(X) is a constant, i.e., a constant RV.
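These properties are easy to confirm on a small example. The sketch below assumes two fair dice as the RVs X and Y (a choice made only for the illustration, as are the helper `expect` and the multiplier `alpha`) and checks additivity, homogeneity, and the zero expectation of X - E(X).

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 36)  # probability of each ordered pair of faces

def expect(f):
    """E(f(X, Y)) over the 36 equiprobable outcomes of two fair dice."""
    return sum(f(x, y) * p for x, y in product(faces, repeat=2))

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
alpha = 3  # arbitrary multiplier chosen for the illustration

assert expect(lambda x, y: x + y) == e_x + e_y        # E(X + Y) = E(X) + E(Y)
assert expect(lambda x, y: alpha * x) == alpha * e_x  # E(aX) = aE(X)
assert expect(lambda x, y: x - e_x) == 0              # E(X - E(X)) = 0

print(e_x, e_x + e_y)  # 7/2 7
```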
References
- R. B. Ash, Basic Probability Theory, Dover, 2008
- W. Feller, An Introduction to Probability Theory and Its Applications, Vol. 1, 2nd edition, John Wiley & Sons, 1958
- J. Havil, Nonplussed!, Princeton University Press, 2007

Copyright © 1996-2018 Alexander Bogomolny