What is Probability?

'Probability is the bane of the age,' said Moreland, now warming up. 'Every Tom, Dick, and Harry thinks he knows what is probable. The fact is most people have not the smallest idea what is going on round them. Their conclusions about life are based on utterly irrelevant - and usually inaccurate - premises.'

Anthony Powell
Casanova's Chinese Restaurant
2nd Movement in A Dance to the Music of Time
University of Chicago Press, 1995

The very name calculus of probabilities is a paradox. Probability opposed to certainty is what we do not know, and how can we calculate what we do not know?

H. Poincaré
Science and Hypothesis
Cosimo Classics, 2007, Chapter XI

Intuitively, the mathematical theory of probability deals with patterns that occur in random events. For the theory of probability the nature of randomness is inessential. (Note for the record that, according to the 18th century French mathematician Marquis de Laplace, randomness is a perceived phenomenon explained by human ignorance, while late 20th century mathematics came to the realization that chaos may emerge as the result of deterministic processes.)

An experiment is a process - natural or set up deliberately - that has an observable outcome. In the deliberate setting, the words experiment and trial are synonymous. An experiment has a random outcome if the result of the experiment can't be predicted with absolute certainty. An event is a collection of possible outcomes of an experiment. An event is said to occur as a result of an experiment if it contains the actual outcome of that experiment. Individual outcomes comprising an event are said to be favorable to that event. Events are assigned a measure of certainty, which is called the probability of the event.

Quite often the word experiment describes an experimental setup, while the word trial applies to actually executing the experiment and obtaining an outcome.
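The vocabulary above can be made concrete with a small sketch. The six-sided die, the event "the roll is even", and the helper name occurs are illustrative assumptions, not part of the original text.

```python
import random

# Sample space for one roll of a six-sided die (an illustrative choice).
omega = {1, 2, 3, 4, 5, 6}

# An event is a collection of possible outcomes; here, "the roll is even".
even = {2, 4, 6}

def occurs(event, outcome):
    """An event occurs if it contains the actual outcome of the trial."""
    return outcome in event

# One trial: actually execute the experiment and observe an outcome.
outcome = random.choice(sorted(omega))
print(outcome, occurs(even, outcome))
```

The outcomes 2, 4 and 6 are the ones favorable to the event even; whether the event occurs depends on which outcome the trial happens to produce.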

A formal theory of probability was developed in the 1930s by the Russian mathematician A. N. Kolmogorov.

The starting point is the sample (or probability) space - the set of all possible outcomes. Let's call it Ω. For the set Ω, a probability is a real-valued function P defined on the subsets of Ω:

(*)

P: 2^Ω → [0, 1]

Thus we require that the function be non-negative and that its values never exceed 1. The subsets of Ω for which P is defined are called events. 1-element events are said to be elementary. The function is required to be defined on the empty subset Φ and the whole set Ω:

(1)

P(Φ) = 0, P(Ω) = 1.

This says in particular that both Φ and Ω are events. The event Φ that never happens is impossible and has probability 0. The event Ω has probability 1 and is certain or necessary. In general, P(A) is the probability of event A; "A takes place or occurs with probability P(A)."
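As a minimal sketch of such a function P on a finite space, assume a fair six-sided die whose outcomes are all equally likely. That assumption, and the name prob, are ours; the definition itself does not require equal likelihood.

```python
from fractions import Fraction

# Sample space of a fair die (assumed example).
omega = frozenset(range(1, 7))

def prob(event):
    """P(event) under the assumption that all outcomes are equally likely."""
    event = frozenset(event)
    if not event <= omega:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(omega))

print(prob(set()))      # 0   (the impossible event)
print(prob(omega))      # 1   (the certain event)
print(prob({2, 4, 6}))  # 1/2
```

Exact fractions are used instead of floats so that identities such as P(Ω) = 1 hold without rounding error.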

If Ω is a finite set then usually the notions of an impossible event and an event with probability 0 coincide, although it may not be so. If Ω is infinite then the two notions practically never coincide. A similar dichotomy exists for the notions of a certain event and that with probability 1. Examples will be given shortly.
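For the finite case, a hypothetical three-outcome space shows how the two notions can separate: one outcome is listed in Ω but carries zero weight. The weights below are invented purely for illustration.

```python
from fractions import Fraction

# Hypothetical finite sample space in which outcome "c" carries zero weight.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}

def prob(event):
    """P(event) as the sum of the weights of its outcomes."""
    return sum((weights[w] for w in event), Fraction(0))

print(prob({"c"}))       # 0, although {"c"} is not the empty event
print(prob({"a", "b"}))  # 1, although {"a", "b"} is not the whole space
```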

The union of two events A and B, which is denoted A∪B, is the event that takes place whenever at least one of A or B does. The intersection of two events A and B, which is denoted A∩B, is the event that takes place only when both A and B do. Both notions extend to any number of events.

Naturally, one would expect P(A∪B) ≥ max{P(A), P(B)} and, similarly, P(A∩B) ≤ min{P(A), P(B)}. However, more stringent requirements are usually enforced.

The probability function (or, as it is most commonly called, measure) is required to be additive: for two disjoint events A and B, i.e. whenever A ∩ B = Φ,

(2)

P(A∪B) = P(A) + P(B),

which is a consequence of a seemingly more general rule: for any two events A and B, their union A∪B and intersection A∩B are events and

(2')

P(A∪B) = P(A) + P(B) - P(A ∩ B).

Note, however, that (2') can be derived from (2). Indeed, assuming that all the sets involved are events, the events A - B and A ∩ B are disjoint, as are B - A and A ∩ B. In fact, the three events A - B, B - A, and A ∩ B are pairwise disjoint and the union of the three is exactly A∪B. We have,

P(A) + P(B) = (P(A - B) + P(A ∩ B)) + (P(B - A) + P(A ∩ B))

= (P(A - B) + P(A ∩ B) + P(B - A)) + P(A ∩ B)

= P(A∪B) + P(A ∩ B)

which is exactly (2').
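The derivation can be checked numerically on a small case. The sketch below assumes a fair die with equally likely outcomes, A = "even" and B = "greater than 3"; these choices and the name prob are illustrative only.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))  # fair die, assumed example

def prob(event):
    """P(event) for equally likely outcomes."""
    return Fraction(len(event), len(omega))

A = frozenset({2, 4, 6})  # even
B = frozenset({4, 5, 6})  # greater than 3

# The three pieces used in the derivation: A - B, B - A, and A ∩ B.
only_a, only_b, both = A - B, B - A, A & B
pieces_cover_union = (only_a | only_b | both) == (A | B)
pieces_disjoint = not (only_a & only_b or only_a & both or only_b & both)

lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)
print(pieces_cover_union, pieces_disjoint)  # True True
print(lhs, rhs)                             # 2/3 2/3
```

Here P(A) + P(B) = 1 counts the outcomes 4 and 6 twice; subtracting P(A ∩ B) = 1/3 removes the double count.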

For an infinite space Ω, we require σ-additivity: for mutually disjoint events A_i, i = 1, 2, ..., their union is an event and

(3)

P(∪_{i≥1} A_i) = ∑_{i≥1} P(A_i).

In general, the collection of events is assumed to be a σ-algebra, which means that the complements of events are events and so are the countable unions and intersections.
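To see σ-additivity at work, consider an assumed model not taken from the text: a fair coin is tossed until the first head, and the outcome i is the number of tosses needed, so that P({i}) = 2^(-i) for i = 1, 2, ... The sketch computes partial sums of the series in (3) exactly.

```python
from fractions import Fraction

def p_elementary(i):
    """P({i}) = 2**-i: the first head appears on toss i (assumed model)."""
    return Fraction(1, 2**i)

def partial_sum(indices):
    """Sum of the elementary probabilities over a finite set of indices."""
    return sum((p_elementary(i) for i in indices), Fraction(0))

n = 20
total = partial_sum(range(1, n + 1))
print(1 - total)  # 1/1048576, the mass not yet accounted for after 20 terms

# Event "an even number of tosses is needed" = union of {2}, {4}, {6}, ...
evens = partial_sum(range(2, n + 1, 2))
print(Fraction(1, 3) - evens)  # 1/3145728, so the sum approaches 1/3
```

The full series sums to 1, in agreement with P(Ω) = 1, and the probability of the countable union {2} ∪ {4} ∪ ... is the limit 1/3 of the second sequence of partial sums.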

Some properties of P follow immediately from the definition. For example, denote the complement of an event A by Ā = Ω - A. It is also called the complementary event. Then, naturally, A and Ā are disjoint, A ∩ Ā = Φ, their union is Ω, and from (2),

1 = P(Ω) = P(A) + P(Ā).

In other words,

(4)

P(Ā) = 1 - P(A).
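The complement rule is often the shortest route to a probability. As a sketch, assume two rolls of a fair die with all 36 ordered pairs equally likely (our example, not the text's) and compute the probability of at least one six both directly and through the complementary event "no six".

```python
from fractions import Fraction
from itertools import product

# All ordered pairs of two die rolls (assumed equally likely).
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) for equally likely outcomes."""
    return Fraction(len(event), len(omega))

at_least_one_six = {w for w in omega if 6 in w}
no_six = omega - at_least_one_six  # the complementary event

print(prob(at_least_one_six))  # 11/36, counted directly
print(1 - prob(no_six))        # 11/36, via the complement
```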

Also from (2) and (*), if B = A∪C for disjoint A and C, then

P(B) = P(A∪C) = P(A) + P(C) ≥ P(A).

In other words, if A is a subset of B, A ⊂ B (take C = B - A), then

(5)

P(B) ≥ P(A).

Probability is a monotone function - a fact that agrees with our intuition that a larger event, i.e. an event that contains all the favorable outcomes of a smaller event and possibly more, is at least as likely to occur as the smaller one.
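Monotonicity can be verified exhaustively on a small space. The sketch below uses a hypothetical four-outcome space with unequal weights and checks P(A) ≤ P(B) for every pair of events with A contained in B; the weights and helper names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical sample space with unequal weights that sum to 1.
weights = {
    "a": Fraction(1, 2),
    "b": Fraction(1, 4),
    "c": Fraction(1, 8),
    "d": Fraction(1, 8),
}

def prob(event):
    """P(event) as the sum of the weights of its outcomes."""
    return sum((weights[w] for w in event), Fraction(0))

def all_events(outcomes):
    """Every subset of the finite sample space, as frozensets."""
    items = sorted(outcomes)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

events = all_events(weights)
monotone = all(prob(a) <= prob(b) for a in events for b in events if a <= b)
print(monotone)  # True
```

The guarantee concerns inclusion only: here the one-outcome event {"a"} has probability 1/2, while the two-outcome event {"c", "d"} has probability 1/4, and neither contains the other.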

Disjoint events do not share favorable outcomes and, for this reason, are often called incompatible or, even more assertively, mutually exclusive.

In addition to a sample space, an experiment may be profitably associated with what is known as a random variable. X is a random variable if its values are not determined with certainty but come from a sample space defined by a random experiment. If x is a possible outcome of an experiment, we often write P(x) for the probability P({x}) of the elementary event {x}. In terms of a random variable X, the same quantity is described as P(X = x), the probability that the random variable X takes the value x.
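A short sketch may help with the notation P(X = x). It treats X as a function of the outcome, which is the usual formalization but is not spelled out in the text above, and it assumes two fair dice with X equal to the sum of the faces.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of two fair die rolls (assumed equally likely).
omega = list(product(range(1, 7), repeat=2))

def X(outcome):
    """Random variable: the sum of the two faces."""
    return sum(outcome)

# P(X = x) is the probability of the event {w in omega : X(w) = x}.
counts = Counter(X(w) for w in omega)
distribution = {x: Fraction(n, len(omega)) for x, n in sorted(counts.items())}

print(distribution[7])             # 1/6
print(sum(distribution.values()))  # 1
```

The events {X = 2}, {X = 3}, ..., {X = 12} are mutually disjoint and cover Ω, which is why their probabilities add up to 1.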

References

R. B. Ash, Basic Probability Theory, Dover, 2008
A. N. Kolmogorov, The Theory of Probability, in Mathematics: Its Content, Methods and Meaning, Dover, 1999
P. S. de Laplace, Concerning Probability, in The World of Mathematics, Dover, 2003
P. S. de Laplace, A Philosophical Essay on Probability, in God Created the Integers: The Mathematical Breakthroughs That Changed History, Running Press, 2007
