Probabilities
... it is not rational for us to believe that the probable is true. 
Lord J. M. Keynes 
According to the definition, probability is a function on the subsets of a sample space. Let's see how it could be defined on the simplest sample space, that of a single coin toss.
The two element sample space {H, T} has four subsets:
Φ = {}, {H}, {T}, {H, T} = Ω.
To be a probability, a function P defined on these four sets must be nonnegative and must not exceed 1. In addition, on the two fundamental sets Φ and Ω it must take on the prescribed values:
P(Φ) = 0 and P(Ω) = 1.
The values P({H}) and P({T}), which we shall write more concisely as P(H) and P(T), must lie somewhere in between. P(H) is expected to be the probability of a coin landing heads up; P(T) should be the probability of its landing tails up. It is up to us to assign those probabilities. Intuitively, those numbers should express our notion of the certainty with which the coin lands one way or the other. Since, for a fair coin, there is no reason to prefer one side to the other, the most natural and common way is to make the two probabilities equal:
(1)
P(H) = P(T).
As in real life, the choices we make have consequences. Once we have decided that the two probabilities are equal, we are no longer at liberty to choose their common value. The definitions take over and dictate the result. Indeed, the two events {H} and {T} are mutually exclusive so that a probability function should satisfy the additivity requirement:
(2)
P(H) + P(T) = P({H, T}) = P(Ω) = 1.
The combination of (1) and (2) leads inevitably to the conclusion that a probability function that models a toss of a fair coin is bound to satisfy
P(H) = P(T) = 1/2.
Two events that have equal probabilities are said to be equiprobable. It's a common approach, especially in introductory probability courses, to define a probability function on a finite sample space by declaring all elementary events equiprobable and building up the function using the additivity requirement. Having a formal definition of probability function avoids the apparent circularity of the construction hinted at elsewhere.
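The construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `equiprobable` and `subsets` are made up for this sketch, and exact fractions are used so that no rounding obscures the arithmetic.

```python
from fractions import Fraction
from itertools import chain, combinations


def equiprobable(sample_space):
    """Return a function P on the subsets of a finite sample space.

    Each elementary event gets the same weight 1/n; the probability of a
    larger event is obtained by adding the weights of its elements.
    """
    outcomes = frozenset(sample_space)
    weight = Fraction(1, len(outcomes))

    def P(event):
        event = frozenset(event)
        if not event <= outcomes:
            raise ValueError("event must be a subset of the sample space")
        # Additivity: one term per elementary event contained in the event.
        return sum((weight for _ in event), Fraction(0))

    return P


def subsets(sample_space):
    """All subsets of a finite set, from the empty set to the whole set."""
    items = list(sample_space)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )


P_coin = equiprobable({"H", "T"})
for event in subsets(["H", "T"]):
    print(sorted(event), P_coin(event))
```

For the coin, the four subsets receive the values 0, 1/2, 1/2 and 1, in agreement with the requirements on Φ and Ω.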
Let's consider the experiment of rolling a die. The sample space consists of 6 possible outcomes
{1, 2, 3, 4, 5, 6}
which, with no indication that the die used is loaded, are declared to be equiprobable. From here, the additivity requirement leads necessarily to:
P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6.
Since all 6 elementary events {1}, {2}, {3}, {4}, {5}, {6} are mutually exclusive, we may readily apply the required additivity, for example:
P({1, 2}) = P({1}) + P({2}) = 1/6 + 1/6 = 1/3
and similarly
P({4, 5, 6}) = P({4}) + P({5}) + P({6}) = 1/6 + 1/6 + 1/6 = 1/2
Note that the 2-element event {1, 2} has the probability of 2/6 = 1/3, while the 3-element event {4, 5, 6} has the probability of 3/6 = 1/2.
Let X be the random variable associated with the experiment of rolling the die. The introduction of a random variable allows for naming various sets in a convenient manner, e.g.,
{1, 2} = {x: x < 3},
and, for the probability, P({1, 2}) = P(X < 3) = 1/3. Similarly,
P({4, 5, 6}) = P(X > 3) = 1/2.
Here are a few additional examples:
P({2, 4, 6}) = P(X is even) = 1/2,
P({1, 2, 4, 5}) = P(X is not divisible by 3) = 2/3,
P({2, 3, 5}) = P(X is prime) = 1/2.
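These values can be checked by direct counting. The sketch below assumes a fair six-sided die; the helpers `prob` and `is_prime` are names invented for the illustration, with each event described by a condition on the outcome x.

```python
from fractions import Fraction

DIE = range(1, 7)  # the sample space {1, 2, 3, 4, 5, 6}


def prob(predicate):
    """P(X satisfies predicate) for a fair die: favorable outcomes / 6."""
    favorable = [x for x in DIE if predicate(x)]
    return Fraction(len(favorable), len(DIE))


def is_prime(x):
    """Trial division; enough for the numbers 1 through 6."""
    return x > 1 and all(x % d for d in range(2, x))


print(prob(lambda x: x < 3))       # P(X < 3)                    -> 1/3
print(prob(lambda x: x > 3))       # P(X > 3)                    -> 1/2
print(prob(lambda x: x % 2 == 0))  # P(X is even)                -> 1/2
print(prob(lambda x: x % 3 != 0))  # P(X is not divisible by 3)  -> 2/3
print(prob(is_prime))              # P(X is prime)               -> 1/2
```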
In general, if an event A has m favorable elementary outcomes out of n equiprobable ones, the additivity requirement implies
P(A) = m/n.
For example, under normal circumstances, drawing a particular card from a deck of 52 cards is assigned a probability of 1/52. Drawing a named (A, K, Q, J) card (of which there are 16 = 4·4) has the probability of 16/52 = 4/13.
Later on, we shall have examples of sample spaces where considering the elementary events as equiprobable is unjustified. However, whenever this is possible, the evaluation of probabilities becomes a combinatorial problem that requires finding the total number n of possible outcomes and the number m of the outcomes favorable to the event at hand. It is then natural that properties of combinatorial counting have a bearing on the assignment and evaluation of probabilities.
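The counting of n and m can be delegated to a short program. The following is a sketch for a standard 52-card deck; the rank and suit labels and the helper `prob` are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "K", "Q", "J", "10", "9", "8", "7", "6", "5", "4", "3", "2"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = list(product(RANKS, SUITS))  # n = 52 equiprobable outcomes


def prob(event):
    """P(event) = m / n, where m is found by filtering the deck."""
    m = sum(1 for card in DECK if event(card))
    return Fraction(m, len(DECK))


p_single = prob(lambda card: card == ("Q", "hearts"))
p_named = prob(lambda card: card[0] in {"A", "K", "Q", "J"})
print(p_single, p_named)
```

Here m = 1 for a particular card and m = 16 for the named cards, so the two probabilities come out as 1/52 and 4/13.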
When tossing two distinct (say, first and second) coins there are four possible outcomes, {HH, HT, TH, TT}, each assigned the probability 1/4. For example,
P({H popped up at least once}) = P({HH, HT, TH}) = 3/4,
P(First coin came up heads) = P({HH, HT}) = 2/4 = 1/2,
P(Two outcomes were different) = P({HT, TH}) = 2/4 = 1/2.
We consider tossing two coins as completely independent experiments, the outcome of one having no effect on the outcome of the other. It follows then from the Sequential, or Product, Rule that the size of the sample space of the two experiments is the product of the sizes of the two sample spaces, and the same holds for the probabilities. For example,
P({HT}) = 1/4 = 1/2·1/2 = P({H})·P({T}).
More generally, consider two sample spaces S_{1} and S_{2} with n_{1} and n_{2} equiprobable outcomes, respectively, and two events E_{1} (on S_{1}) and E_{2} (on S_{2}) with m_{1} and m_{2} favorable outcomes. Then
P(E_{1}E_{2}) = (m_{1}m_{2})/(n_{1}n_{2}) = (m_{1}/n_{1})·(m_{2}/n_{2}) = P(E_{1})P(E_{2}).
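The product rule can be verified by listing the combined sample space explicitly. In the sketch below the second experiment is taken to be a die roll purely for illustration; the names `S1`, `S2`, `E1`, `E2` mirror the notation above, and `prob` is a helper invented for the example.

```python
from fractions import Fraction
from itertools import product


def prob(event, space):
    """m / n for an event given as a set of outcomes of an equiprobable space."""
    space = set(space)
    return Fraction(len(set(event) & space), len(space))


S1 = {"H", "T"}          # first experiment: a coin, n1 = 2
S2 = {1, 2, 3, 4, 5, 6}  # second experiment: a die, n2 = 6 (illustrative)
E1 = {"H"}               # m1 = 1
E2 = {2, 4, 6}           # m2 = 3

joint_space = set(product(S1, S2))  # n1 * n2 ordered pairs
joint_event = set(product(E1, E2))  # m1 * m2 favorable pairs

print(len(joint_space), len(joint_event))
print(prob(joint_event, joint_space), prob(E1, S1) * prob(E2, S2))
```

Both expressions on the last line evaluate to the same fraction, 3/12 = 1/4.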
The two coins may be indistinguishable and, when thrown together, may produce only three possible outcomes: {H, H}, {H, T}, {T, T}. These three outcomes are not equiprobable, however:
P({H, H}) = 1/4,
P({H, T}) = 1/2,
P({T, T}) = 1/4.
Why? This is because the results of the two experiments won't change if we imagine the two coins different, say if we think of them as being blue and red. But, for different coins, the number of elementary events is 4, with two of them, HT and TH, destined to coalesce into the single event {H, T}.
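The merging of HT and TH can be made concrete by starting from the four ordered outcomes and then forgetting which coin is which. This is a minimal sketch; sorting the letters of each outcome is just one convenient way to identify HT with TH.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

ordered = list(product("HT", repeat=2))  # HH, HT, TH, TT: 4 equiprobable outcomes

# Forget the order: after sorting, ("H", "T") and ("T", "H") give the same key.
counts = Counter("".join(sorted(pair)) for pair in ordered)
unordered = {key: Fraction(m, len(ordered)) for key, m in counts.items()}
print(unordered)
```

The key "HT" collects two of the four ordered outcomes, which is where its probability of 1/2 comes from.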
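When rolling two dice, the sample space consists of 36 equiprobable elementary events each with probability 1/36. The possible sums of the two dice range from 2 through 12 and the number of favorable events can be observed from the table below:
 +  |  1   2   3   4   5   6
----+------------------------
 1  |  2   3   4   5   6   7
 2  |  3   4   5   6   7   8
 3  |  4   5   6   7   8   9
 4  |  5   6   7   8   9  10
 5  |  6   7   8   9  10  11
 6  |  7   8   9  10  11  12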
Using S for the random variable equal to the sum of the two dice, the additivity requirement leads to the following probabilities:
P(S = 2) = 1/36,
P(S = 3) = 2/36 = 1/18,
P(S = 4) = 3/36 = 1/12,
P(S = 5) = 4/36 = 1/9,
P(S = 6) = 5/36,
P(S = 7) = 6/36 = 1/6,
P(S = 8) = 5/36,
P(S = 9) = 4/36 = 1/9,
P(S = 10) = 3/36 = 1/12,
P(S = 11) = 2/36 = 1/18,
P(S = 12) = 1/36.
Note that the events are mutually exclusive and exhaustive: their probabilities add up to 1.
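The whole distribution of S can be recomputed by enumerating the 36 ordered rolls. The sketch below assumes two fair dice; the variable names are chosen for the illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # 36 equiprobable ordered pairs
counts = Counter(a + b for a, b in rolls)     # favorable outcomes per sum

dist = {s: Fraction(counts[s], len(rolls)) for s in range(2, 13)}
for s, p in dist.items():
    print(f"P(S = {s}) = {counts[s]}/36 = {p}")

# The events S = 2, ..., S = 12 are mutually exclusive and exhaustive.
print(sum(dist.values()))
```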
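(As a curiosity, note that, say, both sums of 4 and 5 come up in two ways, viz., 4 = 1 + 3 = 2 + 2 and 5 = 1 + 4 = 2 + 3, and yet their probabilities are different. The reason is that 2 + 2 corresponds to a single elementary event, whereas each of 1 + 3, 1 + 4, and 2 + 3 corresponds to two.)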
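Let's return to throwing a coin. With 3 coins, the sample space consists of 8 = 2^{3} possible outcomes. For 4 dice the number grows to 6^{4} = 1296. Now suppose a single coin is tossed repeatedly until tails comes up for the first time. Tails on the very first toss has the probability P(T) = 1/2; heads followed by tails has the probability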
P(HT) = P(H)·P(T) = 1/2 · 1/2 = 1/4.
Continuing in this way, P(HHT) = 1/2·1/2·1/2 = 1/8 is the probability of getting the first tails on the third toss, and so on. Adding up the probabilities of these mutually exclusive events gives
P(T) + P(HT) + P(HHT) + ... = 1/2 + 1/4 + 1/8 + ...
= 1/2 · 1/(1 - 1/2)
= 1,
as the sum of a geometric series with the first term 1/2 and the common ratio also 1/2.
This is a curiosity because there is one event that has been left over: the event in which the outcome T never occurs. An infinite number of coin tosses is called for, each with the outcome of heads: HHHH ... Although in the abstract this event is complementary to the possibility of getting tails in a finite number of steps, it is practically impossible because it requires an infinite number of coin tosses. Deservedly it is assigned the probability of 0.
The probability that tails will show up in four tosses or fewer equals
P(T) + P(HT) + P(HHT) + P(HHHT) = 1/2 + 1/4 + 1/8 + 1/16
= 1/2·(1 - 1/2^{4})/(1 - 1/2)
= 15/16.
More generally, the probability that tails will show up in at most n tosses equals the sum
1/2 + 1/4 + 1/8 + ... + 1/2^{n} = 1/2·(1 - 1/2^{n})/(1 - 1/2) = 1 - 1/2^{n}.
The interpretation of the infinite sum 1/2 + 1/4 + 1/8 + ... is that this is the probability of the tails showing up in a finite number of steps. This probability is 1 so that one should expect to get the tails sooner or later. For this sample space, an event with probability 0 is conceivable but practically impossible. In continuous sample spaces, events with probability 0 are a regular phenomenon and far from being impossible.
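The partial sums and their closed form are easy to compare numerically. The following sketch assumes a fair coin; the function names `tails_within` and `closed_form` are invented for the illustration.

```python
from fractions import Fraction


def tails_within(n):
    """P(first tails appears within n tosses) = 1/2 + 1/4 + ... + 1/2**n."""
    return sum(Fraction(1, 2**k) for k in range(1, n + 1))


def closed_form(n):
    """The geometric-sum formula 1/2 * (1 - 1/2**n) / (1 - 1/2)."""
    half = Fraction(1, 2)
    return half * (1 - half**n) / (1 - half)


for n in (1, 2, 4, 10, 20):
    p = tails_within(n)
    assert p == closed_form(n)
    # The last column is the probability of n heads in a row, 1/2**n.
    print(n, p, float(1 - p))
```

The gap 1 - p halves with every additional toss, which is one way to see why the event of never getting tails is left with the probability 0.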
Copyright © 1996-2018 Alexander Bogomolny