Thursday, August 29, 2019

probability - Monty Hall all over again?


Often, when people pose the Monty Hall problem, they omit some important details, so that the problem becomes ambiguous, or they add more details, so that the problem changes drastically. For example, what is the answer to the following problem?


You are participating in a show. There are three doors in front of you. There’s a car behind one of them; there’s a goat behind another; and there’s a sheep behind the third. To the best of your knowledge, every assignment of the three prizes to the three doors is equally probable. You do want the car, badly, and you don’t care about either of the animals at all. You can choose one door, and whatever is behind that door will be awarded to you.


So, without loss of generality, you choose the leftmost door.


After that, to your complete surprise, the host opens the middle door, and you observe a sheep behind it.


The host then offers you a chance to change your choice. Based solely on the information you have, without trying to guess the motives of the host, is it beneficial for you to change your choice?


EDIT: to make the question a little more specific, what is the ideal strategy for you, one that maximizes your chances of getting the car in the worst case, no matter what the host's motives are?



Answer




Here's the formal expression of what we seek. Let $C$ be the event "chose the door with the car" and $R$ be the event "revealed the door with the car". Now, assuming that the door opened is random (but not the one the player chose), we have


$$ Pr(C) = \frac13\\ Pr(R|C) = 0\\ Pr(R|\bar C) = \frac12\\ Pr(C|\bar R) = \frac{Pr(\bar R|C)Pr(C)}{Pr(\bar R|C)Pr(C)+Pr(\bar R|\bar C)Pr(\bar C)} = \frac{1\cdot\frac13}{1\cdot\frac13+\frac12\cdot\frac23} = \frac{1/3}{2/3}=\frac12 $$ Therefore, the probabilities of winning by switching and by staying are equal.


This differs from the Monty Hall problem in the value of $Pr(R|\bar C)$, which is zero in the Monty Hall problem because the host never reveals the car. In that case, we have $$ Pr(C|\bar R) = \frac{Pr(\bar R|C)Pr(C)}{Pr(\bar R|C)Pr(C)+Pr(\bar R|\bar C)Pr(\bar C)} = \frac{1/3}{1}=1/3 $$ and thus, switching doubles the chance of winning. This is not true if the host can open the door with the car, in which case staying and switching are equally likely to win.
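The two Bayes calculations above can be checked with exact fractions. This is only a minimal sketch: the function name `posterior_chose_car` and its argument names are illustrative and not part of the original answer.

```python
from fractions import Fraction


def posterior_chose_car(p_c, p_r_given_c, p_r_given_not_c):
    """Pr(C | not R) by Bayes' rule (hypothetical helper for illustration)."""
    numerator = (1 - p_r_given_c) * p_c
    denominator = numerator + (1 - p_r_given_not_c) * (1 - p_c)
    return numerator / denominator


# Host opens one of the two other doors at random: Pr(R | not C) = 1/2.
random_host = posterior_chose_car(Fraction(1, 3), Fraction(0), Fraction(1, 2))

# Classic Monty Hall host never reveals the car: Pr(R | not C) = 0.
monty_hall = posterior_chose_car(Fraction(1, 3), Fraction(0), Fraction(0))

print(random_host, monty_hall)  # 1/2 1/3
```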


It's tempting to think that this situation is the same as the Monty Hall one (in both, the door that is opened is one without the car), but the balance of probabilities leading to the situation is different.


The mistake is to confuse counting the number of arrangements with determining the actual chance of each event. In the Monty Hall problem, if the player chooses a wrong door, which has a probability of 2/3, the host will necessarily open the other wrong door, and thus the chance of winning by switching is 1. In this problem, if the player chooses a wrong door, there's a 50% chance that the host will open the door with the car. This halves the number of times that the player will choose the wrong door and have an opportunity to switch.


As a result, while the probability rules around the actual situation are the same, the underlying probabilities are shifted - there's only a 1/3 chance that the player will choose a wrong door and the host will open the other wrong door, and a 1/3 chance that the player will choose the right door. In the Monty Hall problem, the probability of the former is 2/3. So even though the situation itself looks the same, the chance of being in that situation while having chosen the wrong door is lower, with the chance of each of them being 1/2 (given that the situation arose), rather than 1/3 and 2/3.
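A quick Monte Carlo run illustrates the same shift in probabilities. The sketch below makes some simplifying assumptions: the function `simulate` and its parameters are hypothetical, the player always picks door 0, and the run conditions on "an animal was revealed" rather than on the specific door and animal, which is equivalent by symmetry.

```python
import random


def simulate(host_knows, trials=200_000, seed=0):
    """Estimate (Pr(stay wins), Pr(switch wins)) given no car was revealed."""
    rng = random.Random(seed)
    stay_wins = switch_wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = 0  # the leftmost door, without loss of generality
        others = [d for d in range(3) if d != pick]
        if host_knows:
            # Monty Hall host: only ever opens a door hiding an animal.
            candidates = [d for d in others if d != car]
        else:
            # Random host: opens either unchosen door with equal chance.
            candidates = others
        opened = rng.choice(candidates)
        if opened == car:
            continue  # the car was revealed; this is not the situation we condition on
        if pick == car:
            stay_wins += 1
        else:
            switch_wins += 1
    total = stay_wins + switch_wins
    return stay_wins / total, switch_wins / total


print("random host:", simulate(host_knows=False))
print("Monty Hall host:", simulate(host_knows=True))
```

With the random host the two estimates should both land near 1/2, while with the Monty Hall host they should land near 1/3 and 2/3.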


Note that the probability that the player chose the car, given that the car wasn't behind the door the host revealed, depends on only two parameters. If we express our equation for $Pr(C|\bar R)$ slightly differently, we have $$ Pr(C|\bar R) = \frac{Pr(C)}{Pr(C)+\frac{1-Pr(R|\bar C)}{1-Pr(R|C)}(1-Pr(C))} = \frac{A}{A+B(1-A)} $$ where $A=Pr(C)$ and $B=\frac{1-Pr(R|\bar C)}{1-Pr(R|C)}$ - as such, these are the only two values that are relevant. In the Monty Hall case, $A=1/3$ and $B=1$, giving $Pr(C|\bar R)=1/3$. In our case, $A=1/3$ and $B=1/2$, giving $Pr(C|\bar R)=1/2$. If the host is incredibly malicious, almost certainly set to reveal the car if the player doesn't choose it, then $A=1/3$ and $B\approx 0$, meaning that, if the car was not revealed, the player is practically guaranteed to win by staying.
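The dependence on the host's behaviour can also be tabulated straight from Bayes' rule. The sketch below assumes $Pr(R|C)=0$ (the host never opens the player's own door) and sweeps $Pr(R|\bar C)$ from the Monty Hall host (0) towards an almost always malicious host; the name `stay_win_probability` is illustrative only.

```python
from fractions import Fraction


def stay_win_probability(p_reveal_if_wrong, p_c=Fraction(1, 3)):
    """Pr(C | not R), assuming Pr(R | C) = 0 (hypothetical helper)."""
    return p_c / (p_c + (1 - p_reveal_if_wrong) * (1 - p_c))


# Pr(R | not C): 0 is the Monty Hall host, 1/2 the random host,
# and values near 1 a host who almost always reveals the car when possible.
for x in [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(99, 100)]:
    print(f"Pr(R|not C) = {x}: Pr(C|not R) = {stay_win_probability(x)}")
```

Staying wins with probability 1/3, 2/5, 1/2, 2/3 and 50/51 respectively, so the value of switching depends entirely on how likely the host was to reveal the car.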


If we were to allow the host to open other doors, then we would have to handle those cases separately. As such, it is better to consider $\hat R$, which is the event "the host opened one of the other two doors, and it contained an animal" (or specifically "the sheep" - it's equivalent). For the three-door-only case, $\hat R \equiv \bar R$. So the general expression would be $$ Pr(C|\hat R) = \frac{Pr(C)}{Pr(C)+\frac{Pr(\hat R|\bar C)}{Pr(\hat R|C)}(1-Pr(C))} = \frac{A}{A+B(1-A)} $$ where $A=Pr(C)$ again and $B=\frac{Pr(\hat R|\bar C)}{Pr(\hat R|C)}$. In this expression, it is easily seen that adding extra doors that aren't among the three we care about doesn't change the result. With an extra door, we have $Pr(\hat R|C) = 2/3$ and $Pr(\hat R|\bar C) = 1/3$, so $B=1/2$ and $Pr(C|\hat R)=1/2$, exactly as with three doors and a randomly opened door.
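The extra-door case can be verified by brute-force enumeration. This sketch rests on assumptions not spelled out in the answer: there is a fourth door (index 3) that never hides the car, the player picks door 0, and the host opens one of the three unchosen doors uniformly at random.

```python
from fractions import Fraction
from itertools import product

PICK = 0                # the player's door
CAR_DOORS = [0, 1, 2]   # the three doors we care about; the car is behind one
HOST_DOORS = [1, 2, 3]  # doors the host may open; door 3 is the assumed extra door

p_event = Fraction(0)                # Pr(R-hat)
p_event_and_chose_car = Fraction(0)  # Pr(R-hat and C)

for car, opened in product(CAR_DOORS, HOST_DOORS):
    weight = Fraction(1, len(CAR_DOORS) * len(HOST_DOORS))
    # R-hat: the host opened one of the other two original doors and showed an animal.
    animal_shown = opened in (1, 2) and opened != car
    if animal_shown:
        p_event += weight
        if car == PICK:
            p_event_and_chose_car += weight

posterior = p_event_and_chose_car / p_event
print(posterior)  # 1/2
```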


Note - the assumption was described above as "assuming that the door opened is random (but not the one the player chose)". This is for the purposes of the calculations. The true assumption is actually that the host could have any possible motive with equal probability, and that we look at the expected probability over those motives. It just happens that the two assumptions are equivalent.


Update: As I noted, taking the host's motive as "random" is the same as randomly choosing the host's motive. It's worth explaining properly how this works, and pointing out an easy pitfall in determining it.



Suppose that $X$ is a uniformly-distributed random variable (between 0 and 1) representing the probability that the host reveals the car given that the player didn't choose the car's door. That is, $$ Pr(R|\bar C\land X=x)=x $$ Because $C$ and the value of $X$ are independent of each other, we can deal with these separately - that is, we can treat it as $(R|\bar C)|X=x$ or $(R|X=x)|\bar C$, and nothing will be wrong.


And here's where the pitfall comes in. It can be tempting to calculate $$ Pr(C|\bar R\land X=x) = \frac{Pr(\bar R|C)Pr(C)}{Pr(\bar R|C)Pr(C)+Pr(\bar R|\bar C\land X=x)Pr(\bar C)} = \frac{Pr(\bar R|C)Pr(C)}{Pr(\bar R|C)Pr(C)+(1-x)Pr(\bar C)} $$ and then integrate from there. However, there's a problem - $\bar R$ and $X$ are not independent, and thus we cannot treat $C|\bar R\land X=x$ as $(C|\bar R)|X=x$, which would be necessary to determine $C|\bar R$ by integration.


Instead, we need to calculate it through the full process. That is, we must get $$ Pr(C|\bar R) = \frac{Pr(\bar R|C)Pr(C)}{Pr(\bar R)} $$ and to calculate the denominator, we must integrate. That is, $$ Pr(\bar R) = \int_0^1 Pr(\bar R|X=x) dx $$ where $$ Pr(\bar R|X=x) = Pr(\bar R|C)Pr(C) + Pr(\bar R|\bar C\land X=x)Pr(\bar C)\\ = Pr(\bar R|C)Pr(C) + (1-x)(1-Pr(C)) $$ and this quickly gives us $$ Pr(\bar R) = Pr(\bar R|C)Pr(C) + \frac12(1-Pr(C)) $$ at which point the equivalence with assuming the host chooses at random becomes clear.
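The difference between the tempting calculation and the full one shows up numerically. The sketch below uses a plain midpoint rule with $Pr(C)=1/3$ and $Pr(\bar R|C)=1$; the helper names are illustrative only. Averaging the per-$x$ posterior over the uniform prior on $X$ gives $\frac12\ln 3\approx 0.549$, whereas integrating the denominator first gives exactly $1/2$.

```python
import math

P_C = 1 / 3            # Pr(C)
P_NOT_R_GIVEN_C = 1.0  # Pr(not R | C): the host cannot reveal a car the player holds


def p_not_r_given_x(x):
    """Pr(not R | X = x), where x = Pr(R | not C, X = x)."""
    return P_NOT_R_GIVEN_C * P_C + (1 - x) * (1 - P_C)


def naive_posterior(x):
    """Pr(C | not R, X = x): the quantity it is tempting to average over x."""
    return P_NOT_R_GIVEN_C * P_C / p_not_r_given_x(x)


def midpoint_integral(f, n=10_000):
    """Midpoint-rule approximation of the integral of f over [0, 1]."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h


naive = midpoint_integral(naive_posterior)  # the pitfall
correct = P_NOT_R_GIVEN_C * P_C / midpoint_integral(p_not_r_given_x)

print(f"pitfall: {naive:.4f}  (ln(3)/2 = {math.log(3) / 2:.4f})")
print(f"correct: {correct:.4f}")
```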

