comment by tvirlip
tvirlip  ·  2975 days ago  ·  post: Red vs. Blue

Let's remove all the psychology from the game, and assume that Blue's and Red's strategies are independent. Let's say Blue chooses 1 with probability p, and Red chooses 1 with probability q. Then the expected value of one round is

  3 * (probability both chose 1: it is pq)

+ 5 * (probability Blue chose 2, Red chose 1: it is (1-p)q)

+ 6 * (probability Blue chose 1, Red chose 2: it is p(1-q))

+ 4 * (probability Blue chose 2, Red chose 2: it is (1-p)(1-q))

Expanding this, we get:

  value = 3pq + 5(1 - p)q + 6p(1 - q) + 4(1 - p)(1 - q) =

= 3pq + 5q - 5pq + 6p - 6pq + 4 - 4p - 4q + 4pq =

= 4 - 4pq + 2p + q = (2p - 0.5)(1 - 2q) + 4.5
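The algebra above can be double-checked numerically. Here is a minimal sketch, not part of the original derivation: the names PAYOFF, expected_value and factored_value are made up for illustration, and exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction

# Payoffs to Blue, keyed by (Blue's choice, Red's choice), read off the four cases above.
PAYOFF = {(1, 1): 3, (2, 1): 5, (1, 2): 6, (2, 2): 4}

def expected_value(p, q):
    """Expected payoff of one round: Blue plays 1 with probability p, Red plays 1 with probability q."""
    prob_blue = {1: p, 2: 1 - p}
    prob_red = {1: q, 2: 1 - q}
    return sum(PAYOFF[b, r] * prob_blue[b] * prob_red[r]
               for b in (1, 2) for r in (1, 2))

def factored_value(p, q):
    """The factored form (2p - 0.5)(1 - 2q) + 4.5 from the last line of the derivation."""
    return (2 * p - Fraction(1, 2)) * (1 - 2 * q) + Fraction(9, 2)

# Compare the two forms on a grid of exact probabilities 0, 1/8, ..., 1.
grid = [Fraction(i, 8) for i in range(9)]
forms_agree = all(expected_value(p, q) == factored_value(p, q)
                  for p in grid for q in grid)
print(forms_agree)
```

Agreement on a grid is only a sanity check, not a proof, but since both sides are polynomials of degree one in each of p and q, matching on this many points leaves no room for a difference.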

Now, if p = 0.25, the first factor is zero and the expected value is 4.5 (for Blue) no matter what q is. If p is different from that, the expected value may be less, depending on q: if p > 0.25, any q > 0.5 gives a value less than 4.5; if p < 0.25, any q < 0.5 gives a value less than 4.5. This means that the optimal strategy for Blue is to choose 1 with probability 25% and 2 with probability 75%.

In much the same way, if q = 0.5, the expected value is 4.5 (for Blue) no matter what p is; if q > 0.5, then any p < 0.25 gives a value bigger than 4.5, and if q < 0.5, then any p > 0.25 gives a value bigger than 4.5. Since Red wants to minimize the value, the optimal strategy for Red is q = 0.5, i.e. choose 1 and 2 with the same probability.
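The same conclusion can be reached by brute force. This is a rough sketch under one assumption worth stating: because the value is linear in the opponent's probability, the opponent's best reply is always one of the pure choices (probability 0 or 1), so only those two need to be checked. The function names and the 1% grid are illustrative choices, not from the original comment.

```python
from fractions import Fraction

def value(p, q):
    """Expected value for Blue, using the simplified form 4 - 4pq + 2p + q."""
    return 4 - 4 * p * q + 2 * p + q

def blue_guarantee(p):
    """What Blue is sure to get with p if Red answers with its best pure reply."""
    return min(value(p, 0), value(p, 1))

def red_guarantee(q):
    """The most Red can be made to pay with q if Blue answers with its best pure reply."""
    return max(value(0, q), value(1, q))

# Search probabilities 0.00, 0.01, ..., 1.00 as exact fractions.
grid = [Fraction(i, 100) for i in range(101)]
best_p = max(grid, key=blue_guarantee)
best_q = min(grid, key=red_guarantee)

print(best_p, blue_guarantee(best_p))  # prints: 1/4 9/2
print(best_q, red_guarantee(best_q))   # prints: 1/2 9/2
```

Both searches land on the same value, 4.5, which is what one expects when each side is playing its optimal mixed strategy.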





tvirlip  ·  2975 days ago

I did some computer experiments, just for fun:

Both play the optimal strategy: p = 0.25, q = 0.5. 100000 rounds, total value 449756.

Now Blue deviates to p = 0.5, and Red is clever enough to choose q = 1. Same number of rounds, total value 400438. Blue got less than it could have.

Let's say Red deviates to q = 0.25, and Blue is clever enough to choose p = 1. Same number of rounds, total value 524991. Red lost more than it had to.

If Blue plays the optimal p = 0.25, Red cannot do anything to lose less than 4.5 per round (see above: the whole q-dependent part is multiplied by 0). Let's say p = 0.25, q = 0.75, 100000 rounds: total value 450115.

In a similar way, if Red plays q = 0.5, Blue cannot do anything to win more than 4.5 per round. Let's say q = 0.5, p = 0.75, 100000 rounds: total value 449799.
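For anyone who wants to repeat experiments like these, here is one way they might be set up. It is a sketch, not the program that produced the totals above: the simulate function, the fixed seed and the choice of Python's random module are all assumptions, so the totals it prints will differ from the ones quoted, though they should land close to 100000 times the expected value per round.

```python
import random

# Payoffs to Blue, keyed by (Blue's choice, Red's choice).
PAYOFF = {(1, 1): 3, (2, 1): 5, (1, 2): 6, (2, 2): 4}

def simulate(p, q, rounds=100_000, seed=0):
    """Play `rounds` independent rounds and return the total value paid to Blue.

    Blue chooses 1 with probability p, Red chooses 1 with probability q.
    The seed is fixed only to make runs repeatable (an illustrative choice).
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(rounds):
        blue = 1 if rng.random() < p else 2
        red = 1 if rng.random() < q else 2
        total += PAYOFF[blue, red]
    return total

# The five (p, q) pairs tried in the experiments above.
for p, q in [(0.25, 0.5), (0.5, 1.0), (1.0, 0.25), (0.25, 0.75), (0.75, 0.5)]:
    print(f"p = {p}, q = {q}: total value {simulate(p, q)}")
```

Dividing each total by the number of rounds gives the average per round, which can be compared with 4.5, 4.0, 5.25, 4.5 and 4.5 from the formula.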