Kevin's Math Page

# Analysis of the Higher Number Game Show

Kevin Gong, March 18th, 2016

The March 4th, 2016 Riddler column described a simple game in which the higher number wins. To summarize:

 Two players compete, each pressing a button that produces a random number between 0 and 1 (drawn from a standard uniform distribution). A player can keep that number, or press the button again to get a new (also random) number. Neither player can see the other's number. They then come out and compare numbers, and the higher number wins.

So what's the best strategy to use? Put mathematically, what number C, the cutoff, should a player choose such that they keep any number above C, and press the button again for any number below it? (To simplify our analysis, we will ignore the case of the number being exactly equal to C: since the numbers are drawn from a continuous uniform distribution, the probability of equality is 0.)

So, let's compare some strategies. Let's call the first player A and the second player B, and let a = the cutoff for player A and b = the cutoff for player B. First, suppose a = b = 0. This means both players always press the button just once and keep the number they get. Each player's number is spread uniformly from 0 to 1, so clearly each has a 50% chance of winning.

Now let's suppose a = 0, but B chooses the obvious value b = 0.5. How often will player B win? There are two cases:

Case 1: Player B only presses the button once (he gets a number greater than 0.5 on the first try). This happens 50% of the time, or with frequency 0.5. How do we calculate his odds of winning in this case? Let's split this into two sub-cases, based on the value that A gets:

Case 1A: Player A gets a number less than 0.5. This happens with frequency 0.5. In this case, Player B always wins.

Case 1B: Player A gets a number greater than 0.5. This happens with frequency 0.5. In this case, both players have a number between 0.5 and 1 (with standard uniform distribution), so B wins 50% of the time, 0.5.

Put together, player B's expected value for case 1 = 0.5 * (0.5 * 1 + 0.5 * 0.5) = (1/2) * (3/4) = 3/8

Case 2: Player B presses the button twice. This happens with frequency 0.5. He presses the button to get a second number, which is a random number between 0 and 1. Since A's number is also uniform on 0 to 1 (because a = 0), B's expected winning frequency is 0.5. So player B's expected value for case 2 = 0.5 * 0.5 = 1/4

Total expected value for B = 3/8 + 1/4 = 5/8 = 62.5%
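Before generalizing, we can sanity-check the 5/8 figure with a short Monte Carlo simulation (this sketch is mine, not part of the original derivation; the function names are invented for illustration):

```python
import random

def play(cutoff, rng):
    """One turn: keep the first draw if it exceeds the cutoff, else redraw."""
    x = rng.random()
    return x if x > cutoff else rng.random()

def win_rate(a, b, trials=200_000, seed=1):
    """Estimate how often player B (cutoff b) beats player A (cutoff a)."""
    rng = random.Random(seed)
    wins = sum(play(b, rng) > play(a, rng) for _ in range(trials))
    return wins / trials

print(win_rate(0.0, 0.5))   # should land near 5/8 = 0.625
```

With 200,000 trials the estimate reliably lands within about a percentage point of 0.625.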

So, choosing C = 0.5 is pretty good. But is it the best? Is there a strategy that would beat a player choosing C = 0.5? To answer that, let's answer the more general question: for a given a and b, what is the expected value for player B's winning percentage?

To answer that question, let's split it up into 4 basic cases based on how many times each player presses the button. Each player presses the button once or twice, leading to two cases for each player, or four total cases:

Case 1: Let's start with the easiest case of both players pressing the button twice. This happens if each gets a number lower than their cutoff on the first press. So the frequency of this happening is a * b. (For example, if a = 1/4, the chance of A's first number being lower than 1/4 is 25%.) In this case each player ends up with a random number from 0 to 1, so B's winning frequency is 0.5. Put together, B's expected value for this case = a * b / 2.

Case 2: In this case, A presses the button twice and B presses it once. The frequency of this happening is a * (1 - b). Let's split this into two subcases:

Case 2A: A's second number is less than b. This happens with frequency b, and in this case player B always wins.

Case 2B: A's second number is greater than b. This happens with frequency (1 - b). Since B only pressed the button once, that means B's number is greater than b as well, so B's win frequency is 1/2 in this case.

Case 2 Summary: Put it all together, and B's expected value for this case
= a * (1 - b) * b * 1 + a * (1 - b) * (1 - b) / 2.

Case 3: In this case, A presses the button once and B presses it twice. The frequency of this happening is (1 - a) * b. Let's again split this into two subcases:

Case 3A: B's second number is less than a. This happens with frequency a, and B always loses in this case.

Case 3B: B's second number is greater than a. This happens with frequency (1 - a). Since A only pressed the button once, that means A's number is greater than a as well, so B's win frequency is 1/2 in this case.

Case 3 Summary: Put it all together, and B's expected value for this case = (1 - a) * b * (1 - a) / 2

Case 4: In this case, both A and B press the button once. This happens with frequency (1 - a) * (1 - b). Let's look at two subcases, each with two sub-subcases.

Case 4A: In this case, a > b. There are then two sub-subcases.

Case 4A1: B's number is greater than a. Given that B's number is already greater than b, this happens with conditional frequency (1 - a) / (1 - b). B and A both end up with a number greater than a, so B's win frequency for this case is 1/2.

Case 4A2: B's number is less than a (but greater than b). Since A's number is greater than a, B's win frequency for this case is 0.

Case 4A Summary: For the 4A case (a > b), B's expected value is
(1 - a) * (1 - b) * (1 - a) / (2 * (1 - b))
= (1 - a) * (1 - a) / 2.

Case 4B: In this case, a < b. There are two sub-subcases.

Case 4B1: A's number is greater than b. Given that A's number is already greater than a, this happens with conditional frequency (1 - b) / (1 - a), and B's win frequency is 1/2 (since both players have a number greater than b).

Case 4B2: A's number is less than b (but greater than a). This happens with conditional frequency
((1 - a) - (1 - b)) / (1 - a) = (b - a) / (1 - a). Since B's number is greater than b, B always wins this case.

Case 4B Summary: For the 4B case (a < b), B's expected value is
(1 - a) * (1 - b) * (1 - b) / (2 * (1 - a)) + (1 - a) * (1 - b) * ((1 - a) - (1 - b)) / (1 - a)
= (1 - b) * (1 - b) / 2 + (1 - b) * (b - a)

Now let's look at the case where a < b and put it all together. From above, combining all the cases we see that B's expected value is:
= a * b / 2
+ a * (1 - b) * b * 1 + a * (1 - b) * (1 - b) / 2
+ (1 - a) * b * (1 - a) / 2
+ (1 - b) * (1 - b) / 2 + (1 - b) * (b - a)

Expanding and collecting powers of b, this simplifies to:
= (-a/2 - 1/2) * b^2
+ (3a/2 + (1-a)^2 / 2) * b
+ (1/2 - a/2)

 If a < b, then player B's expected value is: (-a/2 - 1/2) * b^2 + (3a/2 + (1-a)^2 / 2) * b + (1/2 - a/2)

Note, we can sanity-check our formula by seeing what happens when a = b (and confirming that B's expected value is 0.5, and it is), and when a = 0 and b = 1/2, B's expected value is 5/8 as we computed earlier.
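Both sanity checks are easy to run directly from the formula. Here is a small sketch (the function name is mine):

```python
def b_win_prob(a, b):
    """Player B's exact winning probability, valid when a <= b."""
    return (-a/2 - 1/2) * b**2 + (3*a/2 + (1 - a)**2 / 2) * b + (1/2 - a/2)

print(b_win_prob(0.3, 0.3))   # a = b: expect 0.5
print(b_win_prob(0.0, 0.5))   # expect 5/8 = 0.625
```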

But what we really want to do is find the best strategy. We can find the best strategy for B by finding the maximum value of the above formula. To do this we can simply take the derivative of the formula with respect to b; since the coefficient of b^2 is negative, the critical point is a maximum. Remembering our calculus, the derivative is:
(-a-1) * b + 3a/2 + (1-a)^2/2

When the derivative is 0, we have our maximum:
(-a-1) * b + 3a/2 + (1-a)^2/2 = 0
(-a-1)*b = -3a/2 - (1-a)^2/2
(-a-1)*b = (-3a-1-a^2+2a)/2
(-a-1)*b = (-a^2-a-1)/2
b = (-a^2-a-1)/(-2a-2)
b = (a^2+a+1)/(2a+2)

This gives us the best strategy for player B if we know player A's strategy.

 If we know player A's strategy (with cutoff a), then player B's best strategy is to use the cutoff: b = (a^2+a+1)/(2a+2)

For example, if a = 0, then solving the above gives us b = 1/2.
If, on the other hand, a = 1/2, then solving the above gives us
b = (1/4 + 1/2 + 1)/(1 + 2)
b = 7/12 = 0.583333...
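The best-response formula is one line of code; a quick sketch (function name is mine) reproducing both examples:

```python
def best_response(a):
    """B's best cutoff against a known cutoff a: (a^2 + a + 1) / (2a + 2)."""
    return (a * a + a + 1) / (2 * a + 2)

print(best_response(0.0))   # 0.5
print(best_response(0.5))   # 7/12 = 0.5833...
```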

The optimal strategy would be one where solving the above gives us the same value (i.e., both A and B picked the same cutoff value, and neither could choose a different value to win more often than 1/2 of the time). This equilibrium is reached when:
b = a = (a^2+a+1)/(2a+2)
a(2a + 2) = a^2 + a + 1
2a^2 + 2a = a^2 + a + 1
a^2 + a - 1 = 0

Now we remember our quadratic formula from basic algebra and find that:
a = (-1 +/- sqrt(5)) / 2
Since the cutoff must lie between 0 and 1, we take the positive root:
a = (sqrt(5) - 1) / 2
a = approximately 0.618.
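In code, the equilibrium cutoff and its fixed-point property look like this (a sketch; the variable name a_star is mine):

```python
import math

# Positive root of a^2 + a - 1 = 0 (the negative root falls outside [0, 1]).
a_star = (-1 + math.sqrt(5)) / 2
print(a_star)   # approximately 0.618

# a_star is a fixed point of the best response b = (a^2 + a + 1) / (2a + 2):
assert abs((a_star**2 + a_star + 1) / (2 * a_star + 2) - a_star) < 1e-12
```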

Note that equilibrium is reached when the cutoff value is 1 less than what is known as the golden ratio ((1 + sqrt(5)) / 2).

 The optimal strategy is to use (sqrt(5) - 1) / 2 = 0.618... as your cutoff value. As long as your opponent uses any other strategy, you will win more than 50% of the time.

In practice, how much difference does it make? Not much, actually. If A chooses a cutoff of 0.5, and B chooses the optimal cutoff of 0.618, then B's expected winning percentage would be 50.4%. As a poker player would say, that's equivalent to a coin flip. Even if you used the best strategy (0.5833...) for an opponent known to use 0.5, your winning percentage would be 50.5%.
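Both percentages fall out of the expected-value formula derived earlier; a small sketch (the function name is mine) confirming them:

```python
def b_win_prob(a, b):
    """B's exact winning probability for a <= b (the formula derived above)."""
    return (-a/2 - 1/2) * b**2 + (3*a/2 + (1 - a)**2 / 2) * b + (1/2 - a/2)

optimal = (5 ** 0.5 - 1) / 2                # equilibrium cutoff, 0.618...
print(b_win_prob(0.5, optimal))             # about 0.504: optimal vs. a = 0.5
print(b_win_prob(0.5, 7 / 12))              # about 0.505: best response vs. a = 0.5
```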

There are other, probably simpler ways of solving this problem. But I liked my solution for a variety of reasons. First, the probabilities involved seemed daunting at first, but they were actually fairly straightforward once they were divided into manageable cases. Then there was some simple algebra, some very basic calculus (which I almost never use), and then even the use of the quadratic formula.
