# Bayes' Theorem

Bayes' theorem (or Bayes' Law and sometimes Bayes' Rule) is a direct application of conditional probabilities. The probability P(A|B) of "A assuming B" is given by the formula

P(A|B) = P(A∩B) / P(B)

which for our purpose is better written as

P(A∩B) = P(A|B)·P(B).

The left hand side P(A∩B) depends on A and B in a symmetric manner and would be the same if we started with P(B|A) instead:

P(B|A)·P(A) = P(A∩B) = P(A|B)·P(B).

This is actually what Bayes' theorem is about:

 (1) P(B|A) = P(A|B)·P(B) / P(A).
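Formula (1) can be checked by brute force on a small sample space. The sketch below assumes two fair dice and two illustrative events chosen only for this check (they do not come from the text): A, "the dice sum to 8", and B, "the first die is even". The helper names `prob` and `cond_prob` are likewise hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def cond_prob(event, given):
    """P(event | given) = P(event ∩ given) / P(given)."""
    return prob(lambda w: event(w) and given(w)) / prob(given)

# Illustrative events (assumptions made for this check only).
A = lambda w: w[0] + w[1] == 8   # the two dice sum to 8
B = lambda w: w[0] % 2 == 0      # the first die is even

lhs = cond_prob(B, A)                      # P(B|A) computed directly
rhs = cond_prob(A, B) * prob(B) / prob(A)  # right-hand side of (1)
print(lhs, rhs)  # 3/5 3/5
```

Exact fractions are used instead of floats so that the two sides can be compared for equality without rounding concerns.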

Most often, however, the theorem appears in a somewhat different form

 (1') P(B|A) = P(A|B)·P(B) / (P(A|B)P(B) + P(A|B̄)P(B̄)),

where B̄ is the event complementary to B: B∪B̄ = Ω, the universal event. (Of course, also B∩B̄ = Φ, the empty event.)

This is because

 A = A ∩ (B ∪ B̄) = (A∩B) ∪ (A∩B̄)

and, since A∩B and A∩B̄ are mutually exclusive,

 P(A) = P((A∩B) ∪ (A∩B̄)) = P(A∩B) + P(A∩B̄) = P(A|B)P(B) + P(A|B̄)P(B̄).

More generally, for a finite number of mutually exclusive and exhaustive events Hi (i = 1, ..., n), i.e. events that satisfy

Hk ∩ Hm = Φ, for k ≠ m and
H1 ∪ H2 ∪ ... ∪ Hn = Ω,

Bayes' theorem states that

P(Hk|A) = P(A|Hk) P(Hk) / ∑i P(A|Hi) P(Hi),

where the sum is taken over all i = 1, ..., n.
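The general formula translates almost directly into code. The following is a minimal sketch, not a library routine: the function name `bayes_posterior` and the three-hypothesis numbers in the usage lines are made up for illustration. It takes the priors P(Hi) and the likelihoods P(A|Hi) and returns all posteriors P(Hk|A) at once.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Return the list of P(H_k|A) given P(H_i) and P(A|H_i), i = 1..n.

    The hypotheses are assumed mutually exclusive and exhaustive,
    so the priors must sum to 1.
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1) > 1e-9:
        raise ValueError("priors must sum to 1")
    # Numerators P(A|H_i) P(H_i); their sum is P(A), the denominator.
    joint = [l * p for l, p in zip(likelihoods, priors)]
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("P(A) is zero; posteriors are undefined")
    return [j / evidence for j in joint]

# Hypothetical numbers, chosen only to exercise the function.
priors = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
likelihoods = [Fraction(1, 10), Fraction(1, 5), Fraction(1, 2)]
posteriors = bayes_posterior(priors, likelihoods)
print(posteriors)  # [Fraction(1, 4), Fraction(1, 3), Fraction(5, 12)]
```

Note that the posteriors again sum to 1, as they must for mutually exclusive and exhaustive hypotheses.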

We shall consider several examples.

Example 1. Monty Hall Problem. [Havil, pp. 61-63]

Let A, B, C denote the events "the car is behind door A (or #1)", "the car is behind door B (or #2)", "the car is behind door C (or #3)". Let also MA denote the event of Monty opening door A, etc.

You are called on stage and point to door A, say. Then

 P(MB|A) = 1/2, because Monty has to choose between two carless doors, B and C;

 P(MB|B) = 0, because Monty never opens the door with a car behind it;

 P(MB|C) = 1, for the very same reason that P(MB|B) = 0.

Since A, B, C are mutually exclusive and exhaustive,

 P(MB) = P(MB|A)P(A) + P(MB|B)P(B) + P(MB|C)P(C) = 1/2 × 1/3 + 0 × 1/3 + 1 × 1/3 = 1/2.

Now you are given a chance to switch to the other closed door (C, if Monty has opened B). If you stick with your original selection (A),

 P(A|MB) = P(MB|A)P(A)/P(MB) = 1/2 × 1/3  /   1/2 = 1/3.

However, if you switch,

 P(C|MB) = P(MB|C)P(C)/P(MB) = 1 × 1/3  /  1/2 = 2/3.

You'd be remiss not to switch.
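The two conditional probabilities can also be estimated by simulation. The sketch below assumes the same setup as the example (you always pick door A, and Monty chooses at random when both remaining doors are carless) and keeps only the games in which Monty opens door B. The function name `simulate_monty`, the number of trials, and the seed are arbitrary choices for illustration.

```python
import random

def simulate_monty(trials=100_000, seed=0):
    """Estimate P(MB), P(A|MB) and P(C|MB) when the contestant picks door A."""
    rng = random.Random(seed)
    opened_b = car_at_a = car_at_c = 0
    for _ in range(trials):
        car = rng.choice("ABC")          # the car is equally likely behind each door
        if car == "A":
            opened = rng.choice("BC")    # Monty picks one of two carless doors
        elif car == "B":
            opened = "C"                 # Monty never reveals the car
        else:
            opened = "B"
        if opened == "B":                # condition on the event MB
            opened_b += 1
            if car == "A":
                car_at_a += 1
            else:                        # the car cannot be behind B here
                car_at_c += 1
    return opened_b / trials, car_at_a / opened_b, car_at_c / opened_b

p_mb, p_stick, p_switch = simulate_monty()
print(f"P(MB) ~ {p_mb:.3f}, P(A|MB) ~ {p_stick:.3f}, P(C|MB) ~ {p_switch:.3f}")
```

With enough trials the three estimates should land near 1/2, 1/3 and 2/3, in line with the values obtained from Bayes' theorem.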

Example 2. Sick Child and Doctor. [Falk, p. 48]

A doctor is called to see a sick child. The doctor has prior information that 90% of sick children in that neighborhood have the flu, while the other 10% are sick with measles. Let F stand for the event of a child being sick with flu and M stand for the event of a child being sick with measles. Assume for simplicity that F ∪ M = Ω, i.e., that there are no other maladies in that neighborhood.

A well-known symptom of measles is a rash; denote the event of a child having a rash by R, so that P(R|M) = 0.95. However, occasionally children with flu also develop a rash, so that P(R|F) = 0.08.

Upon examining the child, the doctor finds a rash. What is the probability that the child has measles?

Solution

Example 3. Incidence of Breast Cancer. [vos Savant, pp. 103-104]

In a study, physicians were asked what the odds of breast cancer would be in a woman who was initially thought to have a 1% risk of cancer but who ended up with a positive mammogram result (a mammogram accurately classifies about 80% of cancerous tumors and 90% of benign tumors). Ninety-five out of a hundred physicians estimated the probability of cancer to be about 75%. Do you agree?

Solution

### References

1. R. B. Ash, Basic Probability Theory, Dover, 2008
2. R. Falk, Understanding Probability and Statistics, A K Peters, 1993
3. J. Havil, Impossible?, Princeton University Press, 2008
4. M. vos Savant, The Power of Logical Thinking, St. Martin's Press, NY 1996

#### Sick Child and Doctor (Solution)

 P(M|R) = P(R|M) P(M) / (P(R|M) P(M) + P(R|F) P(F)) = .95 × .10 / (.95 × .10 + .08 × .90) ≈ 0.57.
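The arithmetic is easy to reproduce step by step. In this short sketch the variable names are illustrative, and the numbers are the ones given in the example.

```python
# Priors from the example: 10% measles, 90% flu.
p_m, p_f = 0.10, 0.90
# Likelihood of a rash under each illness.
p_r_given_m, p_r_given_f = 0.95, 0.08

# Total probability of a rash: P(R) = P(R|M) P(M) + P(R|F) P(F).
p_r = p_r_given_m * p_m + p_r_given_f * p_f

# Bayes' theorem: P(M|R) = P(R|M) P(M) / P(R).
p_m_given_r = p_r_given_m * p_m / p_r
print(round(p_m_given_r, 3))  # 0.569
```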

The result, about 57%, is nowhere close to P(R|M) = 95%.

#### Incidence of Breast Cancer (Solution)

Introduce the events:

 P - mammogram result is positive,

 B - tumor is benign,

 M - tumor is malignant.

Bayes' formula in this case is

 P(M|P) = P(P|M) P(M) / (P(P|M) P(M) + P(P|B) P(B)) = .80 × .01 / (.80 × .01 + .10 × .99) ≈ 0.075 = 7.5%.

A far cry from a common estimate of 75%. 
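One way to see why the answer is so small is to restate the computation in whole numbers. The sketch below assumes a hypothetical group of 10,000 women (the group size is arbitrary) and applies the rates from the example: 1% malignant, 80% of malignant tumors classified correctly, and 10% of benign tumors misclassified as positive.

```python
women = 10_000                        # hypothetical group size

malignant = women * 1 // 100          # 1% prior risk        -> 100
benign = women - malignant            # everyone else        -> 9900

true_positives = malignant * 80 // 100   # 80% of malignant flagged  -> 80
false_positives = benign * 10 // 100     # 10% of benign flagged     -> 990

# Among all positive mammograms, the share that is actually malignant.
share = true_positives / (true_positives + false_positives)
print(true_positives, false_positives, round(share, 3))  # 80 990 0.075
```

The false positives from the large benign group outnumber the true positives by more than ten to one, which is what pulls the posterior down to about 7.5%.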