I just finished the book “The Monty Hall Problem” by Jason Rosenhouse, which is an exploration of one of the most counter intuitive puzzles in probability. The entire book is devoted to the basic form of the problem and a number of variations, of increasing complexity.
The basic outline of the problem is as follows. You are on a game show and are presented with three closed doors. Behind one of the doors is a prize (say a car) and behind the other two doors is nothing. You are asked to select a door. Once you have picked one of the three doors, the game show host (Monty) opens one of the other doors to reveal that it is empty. Your choice is then to either stick with the door you have initially selected or to change your selection to the other unopened door.
So for example, say you select door 1 below and Monty opens door 2. He then offers you the choice to switch to door 3 or stay with your original selection door 1. You have a choice of two unopened doors. Does it make any difference if you switch or stay?

Most people, myself included, will intuitively say no – there are two doors, and the prize has an equal probability of being behind either of the two remaining doors, so the probability is
We can prove this very quickly using a simulated approach. If we define the rules of the game as follows:
- The prize can be behind any door with equal initial probability;
- Monty will never open the door containing the prize;
- When Monty has a choice of 2 (empty) doors to open, he will randomly choose between them.
Here is a simple R function that we can use to execute a single run of the game. It simulates the simple game given the rules above, and returns a string describing the winning strategy for that game. For example it will return “Stick” if the winning strategy for that game is to stay with the door you initially selected.
classic_monty <- function() { # Assign the prize prize <- sample(1:3,1) # Pick a door choice <- sample(1:3,1) # Monty picks a door monty <- sample(c(1:3)[-c(choice,prize)], 1) return(ifelse(prize!=choice, "Switch", "Stick")) }
We can run the game a few times and see the result:
> classic_monty() [1] "Switch" > classic_monty() [1] "Stick" > classic_monty() [1] "Switch"
To see the asymptotic win probability of switch or stick, we can replicate the experiment a number of times and record the outcome:
n <- 2^(1:16) runs <- data.frame(n=numeric(), switch=numeric()) for (trials in n) { run <- table(replicate(trials, classic_monty())) runs <- runs %>% add_row(n=trials, switch=(sum(run["Switch"]))/trials) } # Handle zero-occurrence trials runs[is.na(runs)]<-0
If we run this, then we can examine the win probability using the switch strategy as we increase the number of trials:
> runs n switch 1 2 0.0000000 2 4 0.2500000 3 8 0.6250000 4 16 0.5000000 5 32 0.5937500 6 64 0.7031250 7 128 0.6640625 8 256 0.6835938 9 512 0.6484375 10 1024 0.6640625 11 2048 0.6748047 12 4096 0.6645508 13 8192 0.6702881 14 16384 0.6721191 15 32768 0.6636963 16 65536 0.6641541
There is a clear convergence to a
In order to understand the mechanics behind this result a little more, we can generate the full space of possible outcomes
The code to generate the sample space is below:
# Generate sample space of tuples for Monty-Hall space <- data.frame(choice=numeric(), monty=numeric(), prize=numeric()) i <- 1 for (prize in 1:3) { for (choice in 1:3) { for (monty in c(1:3)[-c(prize,choice)]) { space <- space %>% add_row(choice=choice,monty=monty,prize=prize) } } } space <- space %>% arrange(choice)
Which will generate a table as follows:
> head(space) choice monty prize 1 1 2 1 2 1 3 1 3 1 3 2 4 1 2 3 5 2 3 1 6 2 1 2
Let’s look at a sample play. Say we choose door 1. The sample space for possible plays where we choose door 1 is:
1 1 2 1 2 1 3 1 3 1 3 2 4 1 2 3
So if we stick with door 1 it looks like we have 2 outcomes out of 4, or a 50% chance of winning. However this is not the case, as the first two rows in the sample space above must have the same probability as each of the last two rows – door 1 has a fixed probability of
Once we see the sample space enumerated like this, the reason for the switch probability seems obvious – due to the rules of the game outlined above, we have constrained the sample space. If the host follows the rules above and randomly chooses between two unopened doors when we have selected the door with the prize, but in all other cases will only open the door that is empty, we can see that the choice of door opened by Monty contains information – which is why his choice of door is important.
The Bayesian Approach
As an aside, the book contains a short conditional probability-based approach to the simple scenario above that I think is worth showing. If
This is
We can infer that if any door is initially likely to contain the prize, then
And as Monty will not open the door hiding the prize
This then simplifies to
We can figure out
Thus
Another nice approach. This is a nice problem to illustrate just how deceptive simple probability puzzles can be!