100 Prisoners and a Light Bulb

Some time ago, Ilia Denotkine posted the following problem on the old CTK Exchange:

There are 100 prisoners in solitary cells. There's a central living room with one light bulb; this bulb is initially off. No prisoner can see the light bulb from his or her own cell. Every day, the warden picks a prisoner equally at random, and that prisoner visits the living room. While there, the prisoner can toggle the bulb if he or she wishes. Also, the prisoner has the option of asserting that all 100 prisoners have been to the living room by now. If this assertion is false, all 100 prisoners are shot. However, if it is indeed true, all prisoners are set free and inducted into MENSA, since the world could always use more smart people. Thus, the assertion should only be made if the prisoner is 100% certain of its validity. The prisoners are allowed to get together one night in the courtyard, to discuss a plan. What plan should they agree on, so that eventually, someone will make a correct assertion?

He then added a background to his question:

I have seen this problem on the forums, and here are some of the best solutions (in my opinion):

At the beginning, the prisoners select a leader. Whenever a person (with the exception of the leader) comes into the room, he turns the lights on (but he does this only once). If the lights are already on, he does nothing. When the leader goes into the room, he turns off the lights. Once he has turned off the lights 99 times, he is 100% sure that everyone has been in the room.
Wait 3 years, and then say, with a high probability of being right, that everyone has been in the room.

Does anyone know the optimal solution???

I have taken this problem from the www.ocf.berkeley.edu site, but I believe that you can find it on many others.

As I had a recollection of seeing this problem in [Winkler], I replied:

The problem is indeed popular. It's even included in P. Winkler's Mathematical Puzzles, which is a recommended book in any event. Winkler also lists a slew of sources where the problem appeared, including ibm.com and a newsletter of the MSRI.

The solution is this:

The prisoners select a fellow, say Alice, who will have a special responsibility. All other prisoners behave according to the same protocol: each turns the light off twice, i.e. they turn it off the first two times they find it on. They leave it untouched thereafter. Alice turns the light on if it was off and, additionally, counts the number of times she entered the room with the light off. When her count reaches 2n - 3 she may claim with certainty that all n prisoners have been to the room.

As it happened, I was wrong. This may be immediately surmised from Stuart Anderson's response. In my wonderment I contacted Peter Winkler who kindly set things straight for me. The formulation in his book is somewhat different, but this difference proves to be of major significance:

Each of n prisoners will be sent alone into a certain room, infinitely often, but in some arbitrary order determined by their jailer. The prisoners have a chance to confer in advance, but once the visits begin, their only means of communication will be via a light in the room which they can turn on or off. Help them design a protocol which will ensure that some prisoner will eventually be able to deduce that everyone has visited the room.

(There is another approach.)

References

P. Winkler, Mathematical Puzzles: A Connoisseur's Collection, A K Peters, 2004, pp. 109-111

Solution by Stuart Anderson

This would work of course, but is it optimal? For instance, this would also work, I think:

Alice counts the times she finds the light on, and ensures that it is always off when she leaves the room. Everyone else turns on the light the first time they find it off, and then never touches it again. This way, between visits of Alice, at most one prisoner will turn on the light, and no prisoner turns it on more than once. Therefore the number of times Alice finds the light on is no more than the number of different prisoners that have entered the room. Each prisoner knows he has been counted once he has turned the light on, since he is the only one who touched the switch since Alice last visited. When Alice counts to n-1, she knows everyone has visited the room.
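The protocol just described is easy to try out by simulation. The following is a minimal sketch in Python, not part of the original discussion; the function name `simulate`, the choice of prisoner 0 as Alice, and the use of a seeded `random.Random` are all assumptions made for illustration.

```python
import random

def simulate(n, rng):
    """Run the counting protocol once.

    Prisoner 0 plays Alice (an arbitrary choice for this sketch).
    Returns (days, all_visited): the day on which Alice makes her claim
    and whether every prisoner really had visited the room by then.
    """
    if n < 2:
        raise ValueError("the protocol needs Alice and at least one other prisoner")
    light_on = False                 # the bulb is initially off
    has_signalled = [False] * n      # who has already turned the light on once
    visited = [False] * n
    count = 0                        # times Alice has found the light on
    days = 0
    while True:
        days += 1
        p = rng.randrange(n)         # the warden picks uniformly at random
        visited[p] = True
        if p == 0:
            if light_on:
                light_on = False     # Alice always leaves the light off
                count += 1
                if count == n - 1:
                    return days, all(visited)
        elif not light_on and not has_signalled[p]:
            light_on = True          # first time this prisoner finds it off
            has_signalled[p] = True

if __name__ == "__main__":
    rng = random.Random(2024)
    runs = [simulate(100, rng) for _ in range(200)]
    assert all(ok for _, ok in runs)     # the claim is never wrong
    print("average days:", sum(d for d, _ in runs) / len(runs))
```

Averaging over many runs gives an empirical estimate of the waiting time that the actuary sets out to compute below.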

What does optimal mean here? It could only reasonably mean that the prisoners are freed in the shortest time. So what is the expected time they must wait until Alice has counted to n-1? This is a rather elaborate calculation in probability, so the prisoners turn to the actuary (who is in prison for embezzlement) for some answers.

He explains that using Bayes' theorem,

P(X|Y)·P(Y) = P(X&Y) = P(Y|X)·P(X)

and the linearity of expected value,

E(X|Y)·P(Y) + E(X|~Y)·P(~Y) = E(X)

you can calculate the expected time in prison like this:

Suppose Alice has just visited the room, and let K be the number of days that pass before her next visit (so she visits again K+1 days from now), let n be the number of prisoners, let c be the number of times she has found the light on so far, and let P(ON) and P(OFF) be the probabilities that she finds the light on or off on her next visit. Then E(K) = n - 1, P(K=k) = 1/n·((n-1)/n)^k, P(K = k & OFF) = 1/n·(c/n)^k, which are fairly obvious.

Summing the last formula over all k gives P(OFF) = 1/(n-c), and hence P(ON) = 1 - P(OFF) = (n-c-1)/(n-c). Bayes' theorem then gives P(K = k|OFF) = (1-c/n)·(c/n)^k, and from this you can calculate E(K|OFF) = c/(n-c) and linearity gives

E(K|ON) = ((n-1)(n-c)-c/(n-c))/(n-c-1).
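These closed forms can be checked mechanically. The sketch below (the helper names `between_visits` and `series_check` are hypothetical, introduced only for this illustration) uses exact fractions to confirm that the conditional expectations recombine to E(K) = n - 1, and truncated series to confirm P(OFF) and E(K|OFF).

```python
from fractions import Fraction

def between_visits(n, c):
    """Closed forms from the text for one gap between Alice's visits."""
    n, c = Fraction(n), Fraction(c)
    p_off = 1 / (n - c)
    p_on = 1 - p_off
    e_k_off = c / (n - c)
    e_k_on = ((n - 1) * (n - c) - c / (n - c)) / (n - c - 1)
    return p_off, p_on, e_k_off, e_k_on

def series_check(n, c, terms=5000):
    """Truncated sums over k of P(K = k & OFF) = 1/n * (c/n)^k (floats)."""
    weights = [(1 / n) * (c / n) ** k for k in range(terms)]
    p_off = sum(weights)
    e_k_off = sum(k * w for k, w in enumerate(weights)) / p_off
    return p_off, e_k_off

if __name__ == "__main__":
    n = 100
    for c in range(n - 1):           # c = 0 .. n-2, so that n-c-1 > 0
        p_off, p_on, e_k_off, e_k_on = between_visits(n, c)
        # E(K|OFF)P(OFF) + E(K|ON)P(ON) must equal E(K) = n - 1 exactly
        assert e_k_off * p_off + e_k_on * p_on == n - 1
    print(series_check(100, 40), (1 / 60, 40 / 60))
```

Working with `Fraction` keeps the identity exact, so any slip in the algebra would show up as a failed assertion rather than as a rounding difference.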

Now let m be the number of visits on which Alice finds the light off before she next finds it on, and let L be the number of days that pass until she next finds it on. Each time she finds it off, c does not change, so all the calculations regarding the time until her next visit also do not change.

Therefore, the expected number of days until she next finds the light on is found by summing over all possible m to get the expected total time wasted on visits where the light is off, plus the expected time for the one visit where it was on. This gives

E(L) = (1 + E(K|ON)) + sum(m·(1 + E(K|OFF))·P(OFF)^m·P(ON))

= (1 + E(K|ON)) + (1 + E(K|OFF))·P(OFF)/P(ON)

= n(1 + 1/(n-c-1)).

Now we know how long we expect to wait from count = c to count = c+1. Therefore, we must sum this up from c=0 to c=n-2 to find the total expected time E(T). The result is E(T) = n(n-1) + n·h, where h = S(1/k) from 1 to n-1. Putting n=100 into this gives about 10417.7 days, which is roughly 28.5 years.
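As an independent cross-check on the figure above, the total can also be assembled stage by stage. The sketch below rests on an assumption that is not spelled out in the text: that each step from count c to c+1 splits into two geometric waits, first for one of the n-c-1 not-yet-counted prisoners to find the light off, then for Alice's next visit. The function name `expected_days` is hypothetical.

```python
from fractions import Fraction

def expected_days(n):
    """Expected days until Alice's count reaches n-1, summed stage by stage."""
    total = Fraction(0)
    for c in range(n - 1):                         # c = 0 .. n-2
        # One of the n-c-1 uncounted prisoners must be picked: success
        # probability (n-c-1)/n per day, so n/(n-c-1) days on average.
        wait_for_signal = Fraction(n, n - c - 1)
        # Then Alice herself must be picked: probability 1/n per day.
        wait_for_alice = Fraction(n)
        total += wait_for_signal + wait_for_alice
    return total

if __name__ == "__main__":
    days = expected_days(100)
    print(float(days), float(days) / 365.25)
```

With two prisoners the function returns 4: two days on average for the other prisoner to appear, then two more for Alice.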

But (continues the actuary) this is absurdly long to wait. Simple probability shows that we can be almost certain much sooner than this. The probability that exactly c different prisoners have visited the room by day d is P(c, d), which is obviously equal to P(c-1,d-1)·(1-(c-1)/n) + P(c, d-1)·(c/n). Of course, P(0, 0) = P(1, 1) = 1 and P(1,0) = 0, so we can recursively calculate the probability P(n, d). It turns out that P(100,1146) = 0.999, P(100,1375) = 0.9999, P(100,1604) = 0.99999, and P(100,1833) = 0.999999. That means that in 3.14 years, we have a less than 1/1000 chance of failing, and in almost exactly 5 years and a week, we have a less than one in a million chance of failing. I say we should wait 5 years and then say "let us out, we've all seen the light."
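The recursion for P(c, d) is simple to run. A minimal sketch follows, assuming floating-point arithmetic is accurate enough for these thresholds; the function name `visited_distribution` is hypothetical.

```python
def visited_distribution(n, days):
    """Return a list whose entry c is P(c, days): the probability that
    exactly c different prisoners have visited the room after `days` days."""
    probs = [0.0] * (n + 1)
    probs[0] = 1.0                                 # P(0, 0) = 1
    for _ in range(days):
        nxt = [0.0] * (n + 1)
        for c, p in enumerate(probs):
            if p == 0.0:
                continue
            nxt[c] += p * c / n                    # a repeat visitor: c stays
            if c < n:
                nxt[c + 1] += p * (n - c) / n      # a new visitor: c grows
        probs = nxt
    return probs

if __name__ == "__main__":
    for d in (1146, 1375, 1604, 1833):
        print(d, visited_distribution(100, d)[100])
```

Each day's update is the recursion read forwards: the mass at c either stays at c with probability c/n or moves to c+1 with probability 1 - c/n.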

As they are about to kill Alice (who was already a member of Mensa) for coming up with a crazy plan to keep them in prison for more than a quarter of a century, the game theorist (who is in prison for insider trading on the stock market) steps in to point out that this is a losing move. If they kill her now, she will never go into the room, and the warden will keep them here forever.

In the happy ending, they let Alice live, and they all get out of prison in 5 years. Strangely, they all decline to join Mensa, preferring to enter actuarial training.
