The Monty Hall Problem

probability

simulation

puzzle

Published

September 13, 2023

The Puzzle

The Monty Hall problem is a famous puzzle with a paradoxical solution. It loosely follows part of the game show Let’s Make a Deal, wherein a contestant is presented with three doors. Behind one door is a car, and behind the other two are goats. If the contestant chooses the door with the car, they get to keep it. If they choose a door with a goat, they win nothing. After the contestant makes an initial choice, the host (Monty Hall) opens one of the other two doors, revealing a goat. The puzzle is: should the contestant stick with their original door, or should they switch to the other unopened door?

The Solution

Before the host opens the door revealing a goat, each door has a 1/3 chance of having the car. It may seem that after the host reveals the door with the goat, each of the remaining doors has a 1/2 chance of having the car. If that’s the case, then each door has equal probability of winning and so it doesn’t make sense to switch.

However, the real answer is that they should switch, which doubles their chances of winning. When the contestant picks their door, they have a 1 in 3 chance of winning. It follows that the other two doors together have a 2 in 3 chance of hiding the car. Those facts do not change when the host opens a door: the host always reveals a goat, so the reveal tells us nothing new about the contestant’s own door. If the contestant picks door 1, then there is always a 2/3 chance that the car is behind either door 2 or door 3. After the host reveals the goat, there is still only a 1/3 chance that door 1 has the car, and there is still a 2/3 chance that either door 2 or door 3 has the car. But we know that the opened door cannot have the car, and so the other remaining door has a 2/3 chance of having the car.
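The counting argument above can be checked by listing every case. The following is a minimal sketch, assuming the car's location and the contestant's first pick are independent and uniformly distributed; the names cases, stay_wins, and switch_wins are made up for this illustration.

```r
# All equally likely (car, choice) pairs: 3 x 3 = 9 cases
cases <- expand.grid(car = 1:3, choice = 1:3)

# Staying wins exactly when the first pick was the car
stay_wins <- cases$car == cases$choice

# Monty always removes a goat from the two unpicked doors, so the one
# door left to switch to hides the car exactly when the first pick was a goat
switch_wins <- !stay_wins

p_stay <- mean(stay_wins)      # 3 of the 9 cases
p_switch <- mean(switch_wins)  # 6 of the 9 cases
c(stay = p_stay, switch = p_switch)
```

Staying wins in 3 of the 9 cases and switching wins in the other 6, which matches the 1/3 and 2/3 figures.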

We can express this mathematically. Let \(D_i\) be the event that door \(i\) has the car, so that \(P(D_i)\) is the probability of that event. Then:

\[ P(D_1) = P(D_2) = P(D_3) = 1/3 \]

The car is behind exactly one door, so these events are mutually exclusive. It likewise follows that

\[ P(D_2 \cup D_3) = P(D_2) + P(D_3) = 2/3 \]

That is, the probability that the car is behind either door 2 or door 3 is the sum of their individual probabilities, which add up to 2/3.

Let’s say that the host opens door 2 and we learn that it has a goat. The probability that door 2 has the car is now 0, while the combined probability for doors 2 and 3 is still 2/3. Then:

\[ P(D_2) = 0 \]

\[ P(D_2 \cup D_3) = P(D_2) + P(D_3) = 2/3 \]

\[ 0 + P(D_3) = 2/3 \]

\[ P(D_3) = 2/3 \]

If the probability that door 1, which the contestant chose, has the car is only 1/3 and the probability that door 3 has the car is 2/3, then it makes the most sense for the contestant to change their selection.

Bayes’ Rule Solution

We can also frame this puzzle as a Bayes’ Rule problem, where we can determine the posterior probability of door 1 winning given the information that door 2 has the goat. Recall that Bayes’ Rule is:

\[ P(B|A) = \frac{P(A|B)P(B)}{P(A|B)P(B) + P(A|\neg B)P(\neg B)} \]

Let \(A\) be the event that Monty reveals a door with a goat behind it, and let \(B\) be the event that you chose the correct door. Then \(P(A|B)\) is the probability that Monty reveals a goat given that you picked the right door, and \(P(B|A)\) is the posterior probability that you picked the right door given that Monty revealed a goat.

Monty always reveals a goat, no matter which door you choose, so \(P(A) = 1\). It follows that \(P(A|B) = 1\) and, likewise, \(P(A|\neg B) = 1\), since Monty reveals a goat regardless of whether your choice was correct. The prior probability of choosing the correct door, absent any information from Monty, is \(P(B) = 1/3\), and so \(P(\neg B) = 1 - P(B) = 2/3\). Now we can simply plug into the equation.

\[ P(B|A) = \frac{1 \times 1/3}{(1 \times 1/3) + (1 \times 2/3)} = \frac{1/3}{1} = 1/3 \]

So the posterior probability of having picked the car is 1/3, the same as the prior probability. But now there is only one other door left, and \(P(\neg B|A) = 1 - P(B|A) = 2/3\). That one other door therefore has a 2/3 chance of winning, so you should switch.
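The same conclusion can be reached by conditioning on which specific door Monty opens. The sketch below is an illustration under stated assumptions rather than part of the derivation above: it assumes you picked door 1, Monty opened door 2, and Monty chooses uniformly at random when both unchosen doors hide goats. The names prior, likelihood, and posterior are made up for the example.

```r
# Assumption: you picked door 1 and Monty then opened door 2.
# Prior probability that the car is behind each door
prior <- c(door1 = 1/3, door2 = 1/3, door3 = 1/3)

# P(Monty opens door 2 | car location):
#   car behind door 1: he may open door 2 or 3, assumed equally likely
#   car behind door 2: he never opens the door with the car
#   car behind door 3: door 2 is the only goat door he can open
likelihood <- c(door1 = 1/2, door2 = 0, door3 = 1)

# Bayes' Rule: posterior is proportional to prior times likelihood
posterior <- prior * likelihood / sum(prior * likelihood)
posterior  # door1 = 1/3, door2 = 0, door3 = 2/3
```

Door 1 keeps its 1/3 probability, and the remaining 2/3 falls entirely on door 3.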

Simulation

We can also run a simulation to demonstrate this. We will use n = 100,000 repetitions (1e5 in the code) so that the simulated win rates land close to their theoretical values.

set.seed(2023)
n <- 1e5
no_switch <- replicate(n = n, {
  doors <- 1:3                                            # 1
  car <- sample(doors, 1)                                 # 2
  goats <- doors[doors != car]                            # 3
  choice <- sample(doors, 1)                              # 4
  not_chosen <- doors[doors != choice]                    # 5
  inx <- intersect(goats, not_chosen)                     # 6
  if (length(inx) == 1) {                                 # 7
    reveal <- inx
  } else {
    reveal <- sample(inx, 1)
  }
  remaining_door <- doors[!(doors %in% c(choice, reveal))] # 8
  car == choice                                           # 9
})

1: Enumerate the door options
2: Randomly determine which door has the car
3: Enumerate the choices that have goats
4: Pick a door at random
5: Enumerate the doors that you did not pick
6: Figure out the overlap between the goats and the doors that you did not pick. Those are the possibilities for Monty to reveal
7: This is a little tricky. If the intersection of the goats and the doors not chosen is only one item long, then you just pick that one. Otherwise, take a random sample from that intersection. We can’t just use sample() without looking at the length of the inx vector, because sample() with a single integer for the x argument samples from 1 to x rather than returning x itself. For example, if the car is behind door 3, the goats are behind doors 1 and 2. If your choice was 1, then the doors not chosen are 2 and 3, and the intersection of goats and doors not chosen is just 2. If you try to do sample(2, 1), R will treat it as if you did sample(1:2, 1), which is clearly not what we’re looking for. Thus, you have to check whether inx is only one element long.
8: Determine which door remains after the reveal
9: Check if the door you chose initially actually had the car

mean(no_switch)

[1] 0.3354
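The sample() pitfall described in step 7 is easy to see directly. Here is a small sketch; pick_one() is a hypothetical helper written for this illustration, not something from base R or from the simulation above. It indexes into the vector by position, so a length-one vector is returned as is.

```r
# Hypothetical helper: draw one element from x, treating a
# length-one x as a single option rather than as the range 1:x
pick_one <- function(x) {
  x[sample.int(length(x), 1)]
}

set.seed(1)
draws_naive <- replicate(1000, sample(2, 1))  # samples from 1:2
draws_safe <- replicate(1000, pick_one(2))    # can only return 2

table(draws_naive)
table(draws_safe)
```

The naive draws contain both 1 and 2, while the helper returns only 2. A helper like this would be one way to replace the if/else in step 7, though the explicit length check works just as well.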

Note that remaining_door is never actually used here. If you stick with the door you chose initially, the reveal doesn’t factor into your decision, so you might as well not have it at all. That also helps explain why the posterior probability of winning given a reveal is the same as the prior probability of winning: if you never change which door you pick, then the reveal doesn’t matter.
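To illustrate that point, the stay-put strategy can be simulated without modeling the reveal at all. This is a minimal vectorized sketch, not the simulation used above; the names car_door, first_pick, and stay_rate are made up for the example, and it draws its own random numbers, so the result will not match the earlier output digit for digit.

```r
set.seed(2023)
n_sims <- 1e5

# Car location and first pick, each uniform over doors 1 to 3
car_door <- sample(1:3, n_sims, replace = TRUE)
first_pick <- sample(1:3, n_sims, replace = TRUE)

# Staying wins whenever the first pick was the car; no reveal needed
stay_rate <- mean(car_door == first_pick)
stay_rate
```

The win rate should again come out near 1/3, even though Monty never appears in the code.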

We can repeat this for the scenario of switching doors. It will be essentially the same except that just before the end we will select a new door based on the options available after the reveal.

switch_door <- replicate(n = n, {
  doors <- 1:3
  car <- sample(doors, 1)
  goats <- doors[doors != car]
  choice <- sample(doors, 1)
  not_chosen <- doors[doors != choice]
  inx <- intersect(goats, not_chosen)
  if (length(inx) == 1) {
    reveal <- inx
  } else {
    reveal <- sample(inx, 1)
  }
  remaining_door <- doors[!(doors %in% c(choice, reveal))]
  car == remaining_door
})

mean(switch_door)

[1] 0.66667

So that’s it. If you’re shown a goat, switch doors. You’ll be twice as likely to win a car.