# Family Statistics

Let's have a closer look at a problem of family statistics [Falk, p. 45]:

"Do men have more sisters than women?"

An answer may be surmised from a few examples. Take a small family with two siblings: a boy and a girl. The boy has a sister, while the girl does not. In a family with a son and two daughters, the boy has two sisters, while the girls have only one each. A rule seems to emerge: a girl is excluded from her own sister count, while a boy counts every female sibling there is. From this, the conclusion that men should have more sisters than women seems to follow naturally. However, this conclusion is wrong: on average, men have as many sisters as women do. The argument below seems to me sufficiently convincing to seal the result.


So why do men have as many sisters as women do? Fix the number of children in a family and consider all possible variants. For example, a family with two children may have two boys, a boy and a girl, or two girls. The mixed variant should be counted twice, because it is twice as likely as either of the same-sex combinations. (Think of tossing two coins.) The number of males' sisters is $0$ for the first variant, $1$ for each of the two mixed variants, and $0$ again for the family with two daughters. On the whole, $2$-child families contribute $2$ males' sisters. They also contribute $2$ females' sisters, both from the family with two daughters, where each girl has one sister. The average contribution of a $2$-child family is $\displaystyle \frac{2}{4} = \frac{1}{2}$ in both cases. For a family with $3$ children we have the following table:

$\begin{array}{c|c|c|c} \text{Combination}&\text{Number of combinations}&\text{Males' sisters (per family)}&\text{Females' sisters (per family)}\\ \hline \text{boy, boy, boy}&1&0&0\\ \text{boy, boy, girl}&3&2\cdot 1&0\\ \text{boy, girl, girl}&3&2&2\cdot 1\\ \text{girl, girl, girl}&1&0&3\cdot 2 \end{array}$

On the whole, such families contribute $3\cdot 2 + 3\cdot 2 = 12$ males' sisters (second and third rows) and $3\cdot 2 + 1\cdot 6 = 12$ females' sisters (third and fourth rows). The average contribution per family is the same, $\displaystyle \frac{12}{8} = \frac{3}{2},$ for both counts. In general, too, families with $n$ children contribute the same number, $n(n-1)2^{n-2},$ of males' and females' sisters, the average being $\displaystyle \frac{n(n-1)}{4}$ per family.
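The counts for $2$ and $3$ children, and the general formula, can be checked by brute force. The sketch below is only an illustration: the function name `sister_totals` and the `"B"`/`"G"` encoding are made up for this example. It lists every birth order of $n$ children and adds up the sisters counted by boys and by girls.

```python
from itertools import product


def sister_totals(n):
    """Total sisters counted by boys and by girls over all 2**n birth orders.

    Hypothetical helper for illustration; "B" stands for a boy, "G" for a girl.
    """
    boys_total = 0
    girls_total = 0
    for family in product("BG", repeat=n):
        girls = family.count("G")
        boys = n - girls
        boys_total += boys * girls          # each boy counts every girl
        girls_total += girls * (girls - 1)  # each girl counts the other girls
    return boys_total, girls_total


for n in range(1, 7):
    boys_total, girls_total = sister_totals(n)
    # n(n-1)2^(n-2), written as n(n-1)2^n / 4 so it stays an integer for n = 1
    closed_form = n * (n - 1) * 2**n // 4
    print(n, boys_total, girls_total, closed_form)
```

For each $n$ the three printed numbers should coincide. Enumerating birth orders, rather than unordered combinations, takes care of the weighting automatically: the mixed families simply appear more than once.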

Indeed, in a family with $s$ girls and $n-s$ boys, the boys have a total of $s(n-s)$ sisters, while the girls have altogether $s(s-1)$ sisters. Assume that boys and girls come into the world with equal probabilities of $\displaystyle \frac{1}{2}$ and that the birth events are independent. Then there are $2^{n}$ equally likely ways a family with $n$ children might have come about. Of these, ${n\choose s}$ (the binomial coefficient "$n$ choose $s$") is the number of $n$-child families with $s$ daughters. Therefore, the average number of boys' sisters per $n$-child family is given by

$\displaystyle B_{n} = 2^{-n}\sum_{s=0}^n{n\choose s} \cdot s(n-s).$

Similarly, the average number of girls' sisters in such families is

$\displaystyle G_{n} = 2^{-n}\sum_{s=0}^n{n\choose s}\cdot s(s-1).$
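As a sanity check, the two sums can be evaluated exactly for small $n$. This is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function names are hypothetical and chosen only to mirror $B_{n}$ and $G_{n}$.

```python
from fractions import Fraction
from math import comb


def average_boys_sisters(n):
    """B_n: average number of boys' sisters per n-child family, as an exact fraction."""
    total = sum(comb(n, s) * s * (n - s) for s in range(n + 1))
    return Fraction(total, 2**n)


def average_girls_sisters(n):
    """G_n: average number of girls' sisters per n-child family, as an exact fraction."""
    total = sum(comb(n, s) * s * (s - 1) for s in range(n + 1))
    return Fraction(total, 2**n)


for n in range(1, 9):
    # Both values are expected to equal n(n-1)/4
    print(n, average_boys_sisters(n), average_girls_sisters(n), Fraction(n * (n - 1), 4))
```

Using `Fraction` instead of floating point keeps the comparison with $\displaystyle \frac{n(n-1)}{4}$ exact.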

Both sums are easily computed with generating functions. Let $B_{n}(x,y) = (x+y)^{n}2^{-n}$ and $G_{n}(x) = (1+x)^{n}2^{-n}.$ Then

$\displaystyle B_{n} = \frac{\partial^{2}}{\partial x\,\partial y}B_{n}(1,1),$

while

$\displaystyle G_{n} = \frac{d^{2}}{dx^{2}}G_{n}(1).$

Carrying out the differentiation immediately gives $\displaystyle B_{n} = G_{n} = \frac{n(n-1)}{4}$.
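To spell out the differentiation step, expand $(x+y)^{n}$ by the binomial theorem and differentiate term by term:

$\displaystyle \frac{\partial^{2}}{\partial x\,\partial y}(x+y)^{n} = \sum_{s=0}^n{n\choose s}\cdot s(n-s)\,x^{s-1}y^{n-s-1}.$

The terms with $s=0$ and $s=n$ vanish, so at $x=y=1$ the right-hand side is exactly the sum in $B_{n}$. Differentiating the left-hand side directly gives $n(n-1)(x+y)^{n-2}$, which equals $n(n-1)2^{n-2}$ at $x=y=1$. Hence

$\displaystyle B_{n} = 2^{-n}\cdot n(n-1)2^{n-2} = \frac{n(n-1)}{4}.$

In the same way,

$\displaystyle \frac{d^{2}}{dx^{2}}(1+x)^{n} = \sum_{s=0}^n{n\choose s}\cdot s(s-1)\,x^{s-2} = n(n-1)(1+x)^{n-2},$

and setting $x=1$ gives $\displaystyle G_{n} = 2^{-n}\cdot n(n-1)2^{n-2} = \frac{n(n-1)}{4}.$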

(For an expanded discussion, see an additional page.)

### References

1. R. Falk, *Understanding Probability and Statistics*, A K Peters, 1993.
