Cut the knot: learn to enjoy mathematics

Expectation

In a sample space of equiprobable outcomes, the probability of an event is the ratio of the number of favorable outcomes to the total size of the space. This means that the probabilities of events are defined in relation to each other. If there is a finite number of exhaustive and mutually exclusive events A_k, k = 1, 2, ..., K, with n_k being the number of favorable outcomes in A_k, then

  P(A_k) = n_k / N,

where N = n_1 + n_2 + ... + n_K.

Borrowing an example from the classic text by W. Feller, in a certain population, n_k is the number of families with k children. N is then the total number of families and, assuming the most benign of circumstances, there are 2N adults. But how many children are there? To state the obvious, each of the n_k families with k kids has k kids, so the number of kids in such families is k·n_k. The total number of kids is the sum of such products over all the various family sizes:

  T = 1·n_1 + 2·n_2 + ... + k·n_k + ... + K·n_K,

where K is the number of children in the largest family. On average, every family has E = T/N kids.

 
  E = T/N
    = 1·n_1/N + 2·n_2/N + ... + K·n_K/N
    = 1·p_1 + 2·p_2 + ... + K·p_K
    = Σ k·p_k,

where p_k = n_k / N is the probability for a family to have k children. This quantity is one of the most important in the theory of probabilities. We shall give a more general definition.
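The family computation can be sketched in a few lines of Python. The counts below are made-up illustrative numbers, not data from Feller, and the helper name `mean_children` is hypothetical; the point is only that T/N and the probability-weighted sum give the same value.

```python
from fractions import Fraction

# Hypothetical counts (illustrative only): number of families with k children.
family_counts = {1: 30, 2: 40, 3: 20, 4: 10}

def mean_children(counts):
    """Return the sum of k * p_k, where p_k = n_k / N."""
    total_families = sum(counts.values())  # N
    return sum(Fraction(k * n_k, total_families) for k, n_k in counts.items())

total_children = sum(k * n_k for k, n_k in family_counts.items())  # T
average = mean_children(family_counts)

# The weighted sum agrees with T / N.
print(average == Fraction(total_children, sum(family_counts.values())))
```

Exact fractions are used instead of floats so that the comparison is not affected by rounding.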

Let X be a random variable that takes values x_k with the probabilities p_k: P(X = x_k) = p_k. The sum

  E(X) = Σ x_k·p_k

is known as the mathematical expectation of X (and often the expected value or the mean).

As the above example of counting kids in families of various sizes shows, the mathematical expectation of a random variable (RV) is in a sense an average value of that random variable. For the die rolling experiment, let Y be the RV showing the top number of a die. Then

  P(Y = k) = 1/6, k = 1, 2, ..., 6.

By definition,

 
  E(Y) = 1·1/6 + 2·1/6 + ... + 6·1/6
       = (1 + 2 + ... + 6)/6
       = 21/6
       = 3.5,

which is exactly the average, i.e., the arithmetic mean, of the numbers 1, 2, ..., 6.
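As a minimal sketch of the definition, the die example can be computed directly. The function name `expectation` and the dictionary representation of a distribution are choices made here for illustration.

```python
from fractions import Fraction

def expectation(distribution):
    """Sum of value * probability over a dict {value: probability}."""
    return sum(x * p for x, p in distribution.items())

# A fair die: each face 1..6 has probability 1/6.
die = {k: Fraction(1, 6) for k in range(1, 7)}
expected_top = expectation(die)
print(expected_top)  # 7/2
```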

Let's apply the notion of mathematical expectation to the example of a novice player seeking admittance to a tennis club. To be admitted, the fellow had to win two successive games in a series of three, played alternately against members G (good) and T (top) of the club. With probabilities g and t (t < g) of winning against G and T, the fellow had to choose between two possible orders of games: GTG or TGT. Paradoxically, the second choice appeared to be preferable, gaining the fellow the membership with the probability gt(2 - t) against the smaller gt(2 - g) for the sequence GTG.
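The two admission probabilities can be checked by brute-force enumeration. The sketch below assumes three independent games and uses illustrative values t = 1/5 and g = 4/5; the function name `admission_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def admission_probability(order, win_prob):
    """Probability of two successive wins in the games listed in `order`."""
    total = 0
    for outcome in product("WL", repeat=len(order)):
        if "WW" not in "".join(outcome):
            continue  # no two successive wins, so no admission
        p = 1
        for opponent, result in zip(order, outcome):
            p *= win_prob[opponent] if result == "W" else 1 - win_prob[opponent]
        total += p
    return total

# Illustrative values only, chosen so that t < g.
t, g = Fraction(1, 5), Fraction(4, 5)
win_prob = {"G": g, "T": t}

p_tgt = admission_probability("TGT", win_prob)
p_gtg = admission_probability("GTG", win_prob)
print(p_tgt == g * t * (2 - t), p_gtg == g * t * (2 - g), p_tgt > p_gtg)
```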

We shall be looking for the expected number of wins. Using L for a loss and W for a win for the aspiring novice, we shall consider two sample spaces. Following Havil, the first space consists of the 8 possible outcomes of a sequence of three games:

  LLL, LLW, LWL, LWW, WLL, WLW, WWL, WWW

Note, however, that in the sequences LLL, LLW, WLL, WLW the third game is superfluous, as the results of the first two make it impossible for the fellow to win two successive games, whereas the third game is unnecessary in the last two sequences WWL, WWW because the first two wins already gain the fellow admittance to the club. This makes it possible and reasonable to consider a smaller sample space:

  LL, LWL, LWW, WL, WW

For the sequence TGT we have the following probabilities:

  Win/Loss sequence    Probability
  LLL                  (1 - t)(1 - g)(1 - t)
  LLW                  (1 - t)(1 - g)t
  LWL                  (1 - t)g(1 - t)
  LWW                  (1 - t)gt
  WLL                  t(1 - g)(1 - t)
  WLW                  t(1 - g)t
  WWL                  tg(1 - t)
  WWW                  tgt

for the first sample space and

  Win/Loss sequence    Probability
  LL                   (1 - t)(1 - g)
  LWL                  (1 - t)g(1 - t)
  LWW                  (1 - t)gt
  WL                   t(1 - g)
  WW                   tg

for the second. In both cases, the probabilities add up to 1, as required. Choosing the easier way out, we verify this only for the latter:

 (1 - t)(1 - g) + (1 - t)g(1 - t) + (1 - t)gt + t(1 - g) + tg
    = (1 - t)(1 - g) + [(1 - t)g(1 - t) + (1 - t)gt] + t(1 - g) + tg
    = (1 - t)(1 - g) + (1 - t)g + t(1 - g) + tg
    = [(1 - t)(1 - g) + t(1 - g)] + [(1 - t)g + tg]
    = (1 - g) + g
    = 1.
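The same check can be done numerically. The sketch below evaluates the five probabilities of the smaller sample space for the order TGT on a grid of exact rational values of t and g; the grid and the function name `reduced_space_tgt` are choices made here for illustration.

```python
from fractions import Fraction

def reduced_space_tgt(t, g):
    """Probabilities of the five outcomes of the smaller sample space, order TGT."""
    return {
        "LL": (1 - t) * (1 - g),
        "LWL": (1 - t) * g * (1 - t),
        "LWW": (1 - t) * g * t,
        "WL": t * (1 - g),
        "WW": t * g,
    }

# Check the total on a grid of exact values 0, 1/10, ..., 1 for both t and g.
grid = [Fraction(i, 10) for i in range(11)]
all_sum_to_one = all(
    sum(reduced_space_tgt(t, g).values()) == 1 for t in grid for g in grid
)
print(all_sum_to_one)  # True
```

A grid check is not a proof, but it is a quick guard against a slip in the table.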

Now we introduce the random variable N that denotes the number of wins for the candidate. In the first case, N may be 0, 1, 2, or 3; in the second case, there are only three possible values: 0, 1, 2. The expectations E1 and E2 are

  E1(N, TGT) = 0·(1 - t)(1 - g)(1 - t)
             + 1·(1 - t)(1 - g)t
             + 1·(1 - t)g(1 - t)
             + 2·(1 - t)gt
             + 1·t(1 - g)(1 - t)
             + 2·t(1 - g)t
             + 2·tg(1 - t)
             + 3·tgt
             = 2t + g

and, correspondingly,

  E2(N, TGT) = 0·(1 - t)(1 - g)
             + 1·(1 - t)g(1 - t)
             + 2·(1 - t)gt
             + 1·t(1 - g)
             + 2·tg
             = t + g + tg - t²g.

Similarly,

  E1(N, GTG) = t + 2g and
  E2(N, GTG) = t + g + tg - tg².
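The four closed forms can be cross-checked by summing the number of wins times the probability over each sample space. This is a sketch with illustrative values t = 1/5 and g = 4/5; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def outcome_probability(order, outcome, win_prob):
    """Probability of a W/L sequence; `outcome` may be shorter than `order`."""
    p = 1
    for opponent, result in zip(order, outcome):
        p *= win_prob[opponent] if result == "W" else 1 - win_prob[opponent]
    return p

def expected_wins_full(order, win_prob):
    """E1: all three games are always played."""
    return sum(
        outcome.count("W") * outcome_probability(order, outcome, win_prob)
        for outcome in product("WL", repeat=3)
    )

def expected_wins_reduced(order, win_prob):
    """E2: the third game is skipped when the first two decide the outcome."""
    reduced_space = ["LL", "LWL", "LWW", "WL", "WW"]
    return sum(
        outcome.count("W") * outcome_probability(order, outcome, win_prob)
        for outcome in reduced_space
    )

# Illustrative values only, chosen so that t < g.
t, g = Fraction(1, 5), Fraction(4, 5)
win_prob = {"G": g, "T": t}

e1_tgt = expected_wins_full("TGT", win_prob)
e1_gtg = expected_wins_full("GTG", win_prob)
e2_tgt = expected_wins_reduced("TGT", win_prob)
e2_gtg = expected_wins_reduced("GTG", win_prob)

print(e1_tgt == 2 * t + g, e1_gtg == t + 2 * g)
print(e2_tgt == t + g + t * g - t**2 * g, e2_gtg == t + g + t * g - t * g**2)
```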

Since t < g, we see that

  E1(N, TGT) < E1(N, GTG),

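as expected (pun intended). In the smaller sample space, however, the inequality is reversed. Since E2(N, TGT) - E2(N, GTG) = tg² - t²g = tg(g - t) > 0, we have

  E2(N, TGT) > E2(N, GTG).

So it is the count over all three games that ameliorates the paradoxical situation that arose from the pure count of probabilities. Although the probability of gaining the membership is larger when playing the top member first than when playing first just a good member, the expected number of wins in three games is greater for the order GTG, in which the top player is faced only once. If play stops as soon as the outcome is decided, the order TGT gives both the larger probability of admission and the larger expected number of wins.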

The expectation has several algebraic properties that make it a linear function:

  E(X + Y) = E(X) + E(Y) and
  E(αX) = αE(X),

where X and Y are RVs and α is a real number. For a constant random variable C that only takes on the value c, the expectation is exactly that value: E(C) = c. If, for a given RV X, Y = X - E(X), then

  E(Y) = E(X - E(X)) = E(X) - E(E(X)) = E(X) - E(X) = 0,

since E(X) is a constant and can therefore be treated as a constant RV.
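These properties can be illustrated with two fair dice. The sketch below builds the joint distribution of two independent dice purely for convenience (linearity itself does not require independence); the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

die = {k: Fraction(1, 6) for k in range(1, 7)}

def expectation(distribution):
    """Sum of value * probability over a dict {value: probability}."""
    return sum(x * p for x, p in distribution.items())

def expectation_of(function, joint):
    """Expectation of function(x, y) over a joint dict {(x, y): probability}."""
    return sum(function(x, y) * p for (x, y), p in joint.items())

# Joint distribution of two dice X and Y (independence assumed for convenience).
joint = {(x, y): die[x] * die[y] for x, y in product(die, repeat=2)}

mean = expectation(die)                                   # E(X) = E(Y)
e_sum = expectation_of(lambda x, y: x + y, joint)         # E(X + Y)
e_scaled = expectation_of(lambda x, y: 3 * x, joint)      # E(3X)
e_centered = expectation_of(lambda x, y: x - mean, joint) # E(X - E(X))

print(e_sum == 2 * mean, e_scaled == 3 * mean, e_centered == 0)
```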

References

  1. W. Feller, An Introduction to Probability Theory and Its Applications, Vol. 1, 2nd edition, John Wiley & Sons, 1958
  2. J. Havil, Nonplussed!, Princeton University Press, 2007

Copyright © 1996-2008 Alexander Bogomolny
