# What is Probability?

'Probability is the bane of the age,' said Moreland, now warming up. 'Every Tom, Dick, and Harry thinks he knows what is probable. The fact is most people have not the smallest idea what is going on round them. Their conclusions about life are based on utterly irrelevant - and usually inaccurate - premises.'

Anthony Powell

The very name calculus of probabilities is a paradox. Probability opposed to certainty is what we do not know, and how can we calculate what we do not know?

H. Poincaré

**Intuitively**, the mathematical theory of probability deals with patterns that occur in random events. For the theory of probability the nature of randomness is inessential. (Note, for the record, that according to the 18^{th} century French mathematician Marquis de Laplace randomness is a perceived phenomenon explained by human ignorance, while late 20^{th} century mathematics came to the realization that chaos may emerge as the result of deterministic processes.) An *experiment* is a process - natural or set up deliberately - that has an observable outcome. In the deliberate setting, the words *experiment* and *trial* are synonymous. An experiment has a *random* outcome if the result of the experiment can't be predicted with absolute certainty. An *event* is a collection of possible outcomes of an experiment. An event is said to occur as a result of an experiment if it contains the actual outcome of that experiment. Individual outcomes comprising an event are said to be *favorable* to that event. Events are assigned a measure of certainty which is called the *probability* (of an event).

Quite often the word *experiment* describes an experimental setup, while the word *trial* applies to actually executing the experiment and obtaining an outcome.

A formal theory of probability was developed in the 1930s by the Russian mathematician A. N. Kolmogorov.

The starting point is the *sample (or probability) space* - a set of all possible outcomes. Let's call it Ω. For the set Ω, a *probability* is a real-valued function P defined on the subsets of Ω:

(*)

P: 2^{Ω} → [0, 1]

Thus we require that the function be non-negative and that its values never exceed 1. The subsets of Ω for which P is defined are called *events*; one-element events are called *elementary*. The function is required to be defined on the empty subset Φ and the whole set Ω:

(1)

P(Φ) = 0, P(Ω) = 1.

This says in particular that both Φ and Ω are events. The event Φ that never happens is *impossible* and has probability 0. The event Ω has probability 1 and is *certain* or *necessary*. In general, P(A) is the probability of event A; "A takes place or occurs with probability P(A)."
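As a minimal sketch, the axioms stated so far can be realized on a small finite sample space. The fair six-sided die and the uniform measure below are illustrative assumptions, not part of the text:

```python
from fractions import Fraction

# A hypothetical finite sample space: one roll of a fair six-sided die.
omega = frozenset(range(1, 7))

def prob(event):
    """Uniform probability measure on the subsets of omega."""
    assert event <= omega, "an event must be a subset of the sample space"
    return Fraction(len(event), len(omega))

print(prob(frozenset()))           # the impossible event Φ: 0
print(prob(omega))                 # the certain event Ω: 1
print(prob(frozenset({2, 4, 6})))  # "an even number": 1/2
```

Exact `Fraction` arithmetic is used so the axioms (1) can be checked as equalities rather than floating-point approximations.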

If Ω is a finite set then usually the notions of an impossible event and an event with probability 0 coincide, although it may not be so. If Ω is infinite then the two notions practically never coincide. A similar dichotomy exists for the notions of a certain event and that with probability 1. Examples will be given shortly.

The union of two events A and B, which is denoted A∪B, is the event that takes place whenever one of A or B does. The intersection of two events A and B, which is denoted A∩B, is the event that takes place only when both of A and B do. Both notions could be extended to any number of events.

Naturally, one would expect the probability of a union of events to be expressible through the probabilities of its parts. The probability function (or, as it is most commonly called, *measure*) is required to be *additive*: for two *disjoint* events A and B, i.e., whenever A ∩ B = Φ,

(2)

P(A∪B) = P(A) + P(B),

which is a consequence of a seemingly more general rule: for any two events A and B, their union A∪B and intersection A∩B are events and

(2')

P(A∪B) = P(A) + P(B) - P(A ∩ B).

Note, however, that (2') can be derived from (2). Indeed, assuming that all the sets involved are events,

P(A) + P(B) = (P(A - B) + P(A ∩ B)) + (P(B - A) + P(A ∩ B))

= (P(A - B) + P(A ∩ B) + P(B - A)) + P(A ∩ B)

= P(A∪B) + P(A ∩ B),

which is exactly (2').
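The identity (2') can also be checked numerically. A minimal sketch, again assuming a fair-die sample space with a uniform measure (both hypothetical):

```python
from fractions import Fraction

omega = frozenset(range(1, 7))  # a fair die roll, assumed uniform

def prob(event):
    return Fraction(len(event), len(omega))

A = frozenset({1, 2, 3})  # "at most 3"
B = frozenset({2, 4, 6})  # "even"

# (2'): P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
lhs = prob(A | B)                         # P({1,2,3,4,6}) = 5/6
rhs = prob(A) + prob(B) - prob(A & B)     # 1/2 + 1/2 - 1/6 = 5/6
assert lhs == rhs
```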

For an infinite space Ω, we require σ-*additivity*: for mutually disjoint sets A_{i},

(3)

P(∪_{i≥1}A_{i}) = ∑_{i≥1}P(A_{i}).
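As an illustration of σ-additivity, consider the (assumed, not from the text) experiment of tossing a fair coin until the first head appears: the events A_{i} = "first head on toss i" are pairwise disjoint with P(A_{i}) = 2^{-i}, and the series of their probabilities sums to 1. A finite truncation shows the partial sums approaching 1:

```python
from fractions import Fraction

# Partial sum of P(A_i) = 2**-i for i = 1..60; in closed form it equals
# 1 - 2**-60, which tends to 1 as the number of terms grows.
partial = sum(Fraction(1, 2**i) for i in range(1, 61))
assert partial == 1 - Fraction(1, 2**60)
```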

In general, the collection of events is assumed to be a σ-*algebra*, which means that the complements of events are events and so are the countable unions and intersections.

Some properties of P follow immediately from the definition. For example, denote by A′ the complement of an event A, the *complementary event*. Then A and A′ are disjoint, A ∪ A′ = Ω, and, by (1) and (2),

1 = P(Ω) = P(A) + P(A′).

In other words,

(4)

P(A′) = 1 - P(A).
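The complement rule can be checked directly; the finite fair-die space and uniform measure below are, again, illustrative assumptions:

```python
from fractions import Fraction

omega = frozenset(range(1, 7))  # a fair die roll, assumed uniform

def prob(event):
    return Fraction(len(event), len(omega))

A = frozenset({1, 2})     # "roll a 1 or a 2"
A_complement = omega - A  # the complementary event

# (4): P(A') = 1 - P(A)
assert prob(A_complement) == 1 - prob(A)
```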

Also from (2) and (*), if B = A∪C for disjoint A and C, then

P(B) = P(A∪C) = P(A) + P(C) ≥ P(A).

In other words, if A is a subset of B, A ⊂ B, then

(5)

P(B) ≥ P(A).

Probability is a *monotone* function - a fact that jibes with our intuition that a larger event, i.e., an event with a greater number of favorable outcomes, is more likely to occur than a smaller one.
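On a small finite space, monotonicity (5) can even be verified exhaustively over all pairs of nested events. The uniform die measure is an illustrative assumption:

```python
from fractions import Fraction
from itertools import chain, combinations

omega = frozenset(range(1, 7))  # a fair die roll, assumed uniform

def prob(event):
    return Fraction(len(event), len(omega))

def subsets(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return (frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1)))

# (5): P(B) >= P(A) whenever A ⊆ B, checked for every nested pair
for B in subsets(omega):
    for A in subsets(B):
        assert prob(A) <= prob(B)
```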

Disjoint events do not share favorable outcomes and, for this reason, are often called *incompatible* or, even more assertively, *mutually exclusive*.

In addition to a sample space, an experiment may be profitably associated with what is known as a *random variable*. X is a *random variable* if its values are not determined with certainty but come from a sample space defined by a random experiment. If x is a possible outcome of an experiment, we often write P(x) for the probability P({x}) of the elementary event {x}. In terms of the random variable X, the same quantity is described as P(X = x).
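A short sketch of the random-variable viewpoint, using the (assumed, illustrative) example of X = the sum of two fair dice:

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs of two fair die rolls, assumed uniform.
outcomes = list(product(range(1, 7), repeat=2))

def prob_X_equals(x):
    """P(X = x): the measure of the event {ω : X(ω) = x}, X = sum of the rolls."""
    favorable = [w for w in outcomes if sum(w) == x]
    return Fraction(len(favorable), len(outcomes))

print(prob_X_equals(7))  # 6 favorable pairs out of 36: 1/6
```

Note that the probabilities P(X = x) over all possible values x of X sum to 1, since the corresponding events partition the sample space.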

### References

- R. B. Ash, *Basic Probability Theory*, Dover, 2008
- A. N. Kolmogorov, *The Theory of Probability*, in *Mathematics: Its Content, Methods and Meaning*, Dover, 1999
- P. S. de Laplace, *Concerning Probability*, in *The World of Mathematics*, Dover, 2003
- P. S. de Laplace, *A Philosophical Essay on Probability*, in *God Created the Integers: The Mathematical Breakthroughs That Changed History*, Running Press, 2007


Copyright © 1996-2018 Alexander Bogomolny