Bayes' Theorem

Bayes' theorem (also called Bayes' Law or Bayes' Rule) is a direct application of conditional probabilities. The probability P(A|B) of "A assuming B" is given, when P(B) > 0, by the formula

P(A|B) = P(A∩B) / P(B)

which for our purpose is better written as

P(A∩B) = P(A|B)·P(B).

The left hand side P(A∩B) depends on A and B in a symmetric manner and would be the same if we started with P(B|A) instead:

P(B|A)·P(A) = P(A∩B) = P(A|B)·P(B).

This is actually what Bayes' theorem is about:

(1) P(B|A) = P(A|B)·P(B) / P(A).

Most often, however, the theorem appears in a somewhat different form

(1') P(B|A) = P(A|B)·P(B) / (P(A|B)·P(B) + P(A|B̄)·P(B̄)),

where B̄ is the event complementary to B: B ∪ B̄ = Ω, the universal event. (Of course, also B ∩ B̄ = Φ, the empty event.)

This is because

A = A ∩ (B ∪ B̄)
 = (A∩B) ∪ (A∩B̄)

and, since A∩B and A∩B̄ are mutually exclusive,

P(A) = P((A∩B) ∪ (A∩B̄))
 = P(A∩B) + P(A∩B̄)
 = P(A|B)·P(B) + P(A|B̄)·P(B̄).
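As a quick sanity check, the sketch below starts from a small joint distribution over A and B and confirms that the direct definition of P(B|A), form (1), and the form with the complement of B in the denominator all agree. The four joint probabilities are hypothetical numbers chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical joint probabilities, keyed by (A occurs, B occurs).
# These values are made up for illustration; they only need to sum to 1.
joint = {
    (True, True): Fraction(3, 20),    # P(A and B)
    (True, False): Fraction(1, 4),    # P(A and not B)
    (False, True): Fraction(1, 4),    # P(not A and B)
    (False, False): Fraction(7, 20),  # P(not A and not B)
}
assert sum(joint.values()) == 1

# Marginal probabilities.
p_a = joint[(True, True)] + joint[(True, False)]
p_b = joint[(True, True)] + joint[(False, True)]
p_not_b = 1 - p_b

# Conditional probabilities of A, from the definition P(A|B) = P(A and B) / P(B).
p_a_given_b = joint[(True, True)] / p_b
p_a_given_not_b = joint[(True, False)] / p_not_b

# P(B|A) computed three ways.
direct = joint[(True, True)] / p_a
form_1 = p_a_given_b * p_b / p_a
form_1_prime = (p_a_given_b * p_b) / (
    p_a_given_b * p_b + p_a_given_not_b * p_not_b
)

print(direct, form_1, form_1_prime)  # 3/8 3/8 3/8
```

Exact fractions are used so that the three values can be compared with `==` without rounding concerns.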

More generally, for a finite number of mutually exclusive and exhaustive events Hi (i = 1, ..., n), i.e. events that satisfy

Hk ∩ Hm = Φ, for k ≠ m and
H1 ∪ H2 ∪ ... ∪ Hn = Ω,

Bayes' theorem states that

P(Hk|A) = P(A|Hk) P(Hk) / ∑i P(A|Hi) P(Hi),

where the sum is taken over all i = 1, ..., n.
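The general formula translates almost line by line into code. The following is a minimal sketch, not a library routine: the function name `posteriors` and the three-hypothesis numbers in the usage lines are hypothetical, introduced only to illustrate the formula.

```python
from fractions import Fraction

def posteriors(priors, likelihoods):
    """Return the list of P(H_k|A) for k = 1..n.

    priors[i] is P(H_i); likelihoods[i] is P(A|H_i).
    The hypotheses are assumed mutually exclusive and exhaustive,
    so the priors must sum to 1.
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1) > 1e-9:
        raise ValueError("priors must sum to 1")
    # Each term P(A|H_i) P(H_i) equals P(A and H_i).
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    # The denominator: P(A) as the sum over all hypotheses.
    p_a = sum(joint)
    if p_a == 0:
        raise ValueError("P(A) is zero; the posterior is undefined")
    return [term / p_a for term in joint]

# Hypothetical example with three hypotheses.
priors = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
likelihoods = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 1)]
result = posteriors(priors, likelihoods)
print([str(p) for p in result])  # ['3/11', '4/11', '4/11']
```

The posteriors always sum to 1, because each is one term of the denominator divided by the whole denominator.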

We shall consider several examples.

Example 1. Monty Hall Problem. [Havil, pp. 61-63]

Let A, B, C denote the events "the car is behind door A (or #1)", "the car is behind door B (or #2)", and "the car is behind door C (or #3)". Let also MA denote the event of Monty opening door A, etc.

You are called on stage and point to door A, say. Then

P(MB|A) = 1/2, because Monty has to choose between two carless doors, B and C;
P(MB|B) = 0, because Monty never opens the door with the car behind it;
P(MB|C) = 1, because Monty opens neither your door A nor door C, which hides the car.

Since A, B, C are mutually exclusive and exhaustive, and each has prior probability 1/3,

P(MB)= P(MB|A)P(A) + P(MB|B)P(B) + P(MB|C)P(C)
 = 1/2 × 1/3 + 0 × 1/3 + 1 × 1/3
 = 1/2.

Now you are given a chance to switch to the other closed door, B or C (depending on which one Monty left closed). Say Monty opened door B. If you stick with your original selection (A),

P(A|MB) = P(MB|A)P(A) / P(MB)
 = (1/2 × 1/3) / (1/2)
 = 1/3.

However, if you switch,

P(C|MB) = P(MB|C)P(C) / P(MB)
 = (1 × 1/3) / (1/2)
 = 2/3.

You'd be remiss not to switch.
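The three probabilities computed above can also be checked empirically. The sketch below is a simple Monte Carlo simulation under the same assumptions as the example: the car is placed uniformly at random, you always pick door A, and Monty opens a carless door other than yours, choosing at random when he has two options. The function name `simulate` and its `trials` and `seed` parameters are hypothetical.

```python
import random

def simulate(trials=100_000, seed=0):
    """Estimate P(MB), P(A|MB) and P(C|MB) by simulation."""
    rng = random.Random(seed)
    opened_b = 0      # rounds in which Monty opened door B
    stick_wins = 0    # ... and the car was behind A (sticking wins)
    switch_wins = 0   # ... and the car was behind C (switching wins)
    for _ in range(trials):
        car = rng.choice("ABC")
        # You picked A, so Monty may open B or C, but never the car's door.
        options = [door for door in "BC" if door != car]
        monty = rng.choice(options)
        if monty != "B":
            continue  # condition on the event MB
        opened_b += 1
        if car == "A":
            stick_wins += 1
        else:
            switch_wins += 1  # the car can only be behind C here
    return opened_b / trials, stick_wins / opened_b, switch_wins / opened_b

p_mb, p_stick, p_switch = simulate()
print(f"P(MB) ~ {p_mb:.3f}, P(A|MB) ~ {p_stick:.3f}, P(C|MB) ~ {p_switch:.3f}")
```

With enough trials the three estimates should settle near 1/2, 1/3 and 2/3, in agreement with the calculation.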

Example 2. Sick Child and Doctor. [Falk, p. 48]

A doctor is called to see a sick child. The doctor has prior information that 90% of sick children in that neighborhood have the flu, while the other 10% are sick with measles. Let F stand for the event of a child being sick with flu and M for the event of a child being sick with measles. Assume for simplicity that F ∪ M = Ω, i.e., that there are no other maladies in that neighborhood.

A well-known symptom of measles is a rash (we denote the event of having one by R): P(R|M) = 0.95. However, children with flu occasionally also develop a rash, so that P(R|F) = 0.08.

Upon examining the child, the doctor finds a rash. What is the probability that the child has measles?


Example 3. Incidence of Breast Cancer. [vos Savant, pp. 103-104]

In a study, physicians were asked what the odds of breast cancer would be in a woman who was initially thought to have a 1% risk of cancer but who ended up with a positive mammogram result (a mammogram accurately classifies about 80% of cancerous tumors and 90% of benign tumors). Ninety-five out of a hundred physicians estimated the probability of cancer to be about 75%. Do you agree?



  1. R. B. Ash, Basic Probability Theory, Dover, 2008
  2. R. Falk, Understanding Probability and Statistics, A K Peters, 1993
  3. J. Havil, Impossible?, Princeton University Press, 2008
  4. M. vos Savant, The Power of Logical Thinking, St. Martin's Press, 1996


Copyright © 1996-2018 Alexander Bogomolny

Sick Child and Doctor (Solution)

 P(M|R)= P(R|M) P(M) / (P(R|M) P(M) + P(R|F) P(F))
  = .95 × .10 / (.95 × .10 + .08 × .90)
  ≈ 0.57.

This is nowhere near the 95% suggested by P(R|M) alone.
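One way to see why the answer is so far from 95% is to count cases instead of multiplying probabilities. The sketch below applies the given percentages to a hypothetical cohort of 10,000 sick children (the cohort size and the function name `rash_counts` are assumptions made for illustration) and asks what share of the children with a rash have measles.

```python
def rash_counts(children=10_000):
    """Return (measles cases with rash, flu cases with rash) in the cohort."""
    measles = children * 10 // 100      # 10% of sick children have measles
    flu = children - measles            # the other 90% have the flu
    measles_rash = measles * 95 // 100  # P(R|M) = 0.95
    flu_rash = flu * 8 // 100           # P(R|F) = 0.08
    return measles_rash, flu_rash

measles_rash, flu_rash = rash_counts()
share = measles_rash / (measles_rash + flu_rash)
print(measles_rash, flu_rash, round(share, 2))  # 950 720 0.57
```

Flu is nine times as common as measles, so even its low rash rate produces 720 rash cases to set against the 950 from measles.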


Incidence of Breast Cancer (Solution)

Introduce the events:

 P - the mammogram result is positive,
 B - the tumor is benign,
 M - the tumor is malignant.

Bayes' formula in this case is

 P(M|P)= P(P|M) P(M) / (P(P|M) P(M) + P(P|B) P(B))
  = .80 × .01 / (.80 × .01 + .10 × .99)
  ≈ 0.075
  = 7.5%.

A far cry from a common estimate of 75%.
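The same computation can be written as a small function, which makes it easy to see how each of the three inputs enters. This is a minimal sketch: the function name and the parameter names `true_positive_rate` and `true_negative_rate` are labels chosen here for the 80% and 90% accuracy figures in the problem statement, not terms used in the source.

```python
def posterior_malignant(prior, true_positive_rate, true_negative_rate):
    """Return P(M|P): the probability of a malignant tumor given a positive test.

    prior               -- P(M), the risk assumed before the test
    true_positive_rate  -- P(P|M), share of malignant tumors flagged positive
    true_negative_rate  -- share of benign tumors correctly classified,
                           so P(P|B) = 1 - true_negative_rate
    """
    false_positive_rate = 1 - true_negative_rate
    numerator = true_positive_rate * prior
    denominator = numerator + false_positive_rate * (1 - prior)
    return numerator / denominator

p = posterior_malignant(prior=0.01, true_positive_rate=0.80, true_negative_rate=0.90)
print(f"{p:.3f}")  # 0.075
```

The low prior dominates the result: benign tumors are 99 times as common as malignant ones, so the 10% of them that test positive far outnumber the true positives.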
