comment by wasoxygen
wasoxygen  ·  2761 days ago  ·  post: Probably Overthinking It: Bayes' Theorem is not optional

The four-part Probability is Hard series convinced me of the utility of using a simulator to map out the probability space. Sometimes it seems like overkill to actually write and execute the code, but with these tricky problems it's good to be thorough.

Let's see if I can talk my way through how a simulator would work.

I simulate a million days in Seattle. They break down as follows:

300,000 rainy days

700,000 sunny days

Each day, I call Albert.

On 200,000 rainy days, Albert is honest and says "yes it is raining."

On 100,000 rainy days, Albert lies and says "no it is not raining."

On 466,667 sunny days, Albert is honest and says "no it is not raining."

On 233,333 sunny days, Albert lies and says "yes it is raining."

In this problem Albert says yes. So it is actually raining in 200,000 out of 433,333 cases in which I hear yes, about 46% of the time.
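The four cases for a single call can be tabulated directly. This is only a minimal sketch, assuming the 30% rain prior and a friend who tells the truth 2/3 of the time; the function name `split_days` is made up for illustration.

```python
from fractions import Fraction

def split_days(rainy, sunny, p_truth=Fraction(2, 3)):
    """Expected day counts for one phone call, keyed by (weather, answer)."""
    p_lie = 1 - p_truth
    return {
        ("rainy", "yes"): rainy * p_truth,  # honest on a rainy day
        ("rainy", "no"): rainy * p_lie,     # lying on a rainy day
        ("sunny", "no"): sunny * p_truth,   # honest on a sunny day
        ("sunny", "yes"): sunny * p_lie,    # lying on a sunny day
    }

cases = split_days(rainy=300_000, sunny=700_000)
yes_days = cases[("rainy", "yes")] + cases[("sunny", "yes")]
p_rain_given_yes = cases[("rainy", "yes")] / yes_days
print(p_rain_given_yes, float(p_rain_given_yes))  # 6/13, about 0.46
```

Using exact fractions avoids the rounding in figures like 466,667 and 233,333.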

Next I call Betty. Ignoring the cases in which Albert said "no," I find four cases:

Of the 200,000 rainy days on which Albert said "yes," Betty will be honest and say "yes" on 133,333 days.

Of the 200,000 rainy days on which Albert said "yes," Betty will lie and say "no" on 66,667 days.

Of the 233,333 sunny days on which Albert said "yes," Betty will be honest and say "no" on 155,556 days.

Of the 233,333 sunny days on which Albert said "yes," Betty will lie and say "yes" on 77,778 days.

In this problem Betty says yes. So it is actually raining on 133,333 of the 211,111 days on which I hear a second yes, about 63% of the time.

Next I call Charlie. Ignoring the cases in which Betty said "no," I find four cases:

Of the 133,333 rainy days on which Albert and Betty said "yes," Charlie will be honest and say "yes" on 88,889 days.

Of the 133,333 rainy days on which Albert and Betty said "yes," Charlie will lie and say "no" on 44,444 days.

Of the 77,778 sunny days on which Albert and Betty said "yes," Charlie will be honest and say "no" on 51,852 days.

Of the 77,778 sunny days on which Albert and Betty said "yes," Charlie will lie and say "yes" on 25,926 days.

In this problem Charlie says yes. So it is actually raining in 88,889 days out of 114,815 in which I hear yes three times, about 77% of the time.

This does not match the given answer of 47%, but my prior estimate of rain (30%) is higher than the 10% used in the article. If my arithmetic is correct, this tedious approach seems like a reliable way to get a result that is comprehensible.
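For anyone who does want to write and run the code, here is a rough sketch of such a simulator using a random number generator. It assumes the friends answer independently and each tells the truth with probability 2/3; the function name `simulate`, the seed, and the sample size are arbitrary choices for illustration.

```python
import random

def simulate(n_days, p_rain, p_truth=2/3, n_friends=3, seed=0):
    """Fraction of all-"yes" days on which it was actually raining."""
    rng = random.Random(seed)
    all_yes = 0
    all_yes_and_rain = 0
    for _ in range(n_days):
        raining = rng.random() < p_rain
        # Each friend independently reports the truth with probability p_truth.
        answers = [raining if rng.random() < p_truth else not raining
                   for _ in range(n_friends)]
        if all(answers):              # every friend said "yes, it is raining"
            all_yes += 1
            all_yes_and_rain += raining
    return all_yes_and_rain / all_yes

estimate = simulate(200_000, p_rain=0.3)
# Expected value from the counting argument: rainy path / (rainy path + sunny path).
exact = 0.3 * (2/3)**3 / (0.3 * (2/3)**3 + 0.7 * (1/3)**3)
print(round(estimate, 3), round(exact, 3))
```

The simulated estimate should land near the counted result of about 77% for a 30% prior. Passing `p_rain=0.1` instead should give something near the article's 47%.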

I am still not sure how to use the formula to get an answer; the article throws in a "Bayes factor" which makes sense but seems like a shortcut.





Devac  ·  2761 days ago

Perhaps it's shoddy coding on my part, but I got a different result as well.

  Sample size for simulation: 1000000
  It was actually raining for 111801 days out of a 1000000

  In following the third person was saying the opposite:
  Albert and Betty said yes    82200
  Albert and Charlie said yes  81833
  Betty and Charlie said yes   82640

  Albert and Betty said yes and it was true    49596
  Albert and Charlie said yes and it was true  49250
  Betty and Charlie said yes and it was true   49631

  Albert and Betty said yes and it was a lie    16646
  Albert and Charlie said yes and it was a lie  16300
  Betty and Charlie said yes and it was a lie   16681

  It was raining and everyone said true  32950
  It wasn't raining but everyone lied    32658

  Albert
  Times said 'yes' truthfully:  74377
  Times said 'no' truthfully:   592897
  Times said 'yes' and lied:    295302
  Times said 'no' and lied:     37424

  Betty
  Times said 'yes' truthfully:  74422
  Times said 'no' truthfully:   592248
  Times said 'yes' and lied:    295951
  Times said 'no' and lied:     37379

  Charlie
  Times said 'yes' truthfully:  74326
  Times said 'no' truthfully:   592276
  Times said 'yes' and lied:    295923
  Times said 'no' and lied:     37475

I used the probabilities from the article: one out of three chance for a lie, one out of nine chance of rain being in Seattle. I can share my code if you would like to go for a "maybe by spotting a problem with someone else's code I'll get some extra insight" type of exercise. Beware of sleepy Python though ;).
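The two "everyone" counts in the output above already seem to be enough to estimate the chance of rain after three "yes" answers. Here is a small sketch that uses only those two reported numbers, and assumes "everyone said true" on a rainy day and "everyone lied" on a dry day both mean three "yes" answers. A one-in-nine chance of rain is odds of 1:8, for which the exact answer would be 1/2 rather than the article's 47%.

```python
from fractions import Fraction

# Counts copied from the simulation output above.
rain_and_all_truthful = 32950   # raining, all three said "yes"
sun_and_all_lied = 32658        # not raining, all three said "yes"

# Share of three-"yes" days on which it was really raining.
simulated = rain_and_all_truthful / (rain_and_all_truthful + sun_and_all_lied)

# Exact value for a 1/9 chance of rain and friends who lie 1/3 of the time.
prior = Fraction(1, 9)
rain_path = prior * Fraction(2, 3) ** 3
sun_path = (1 - prior) * Fraction(1, 3) ** 3
exact = rain_path / (rain_path + sun_path)

print(round(simulated, 3), exact)  # 0.502 1/2
```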

wasoxygen  ·  2760 days ago

    one out of nine chance of rain being in Seattle

They say it rains about 10% of the time, and "A base rate of 10% corresponds to prior odds of 1:9." I found this notation a bit confusing, but I think it corresponds to a one-in-ten chance of rain. For each rainy day, you get nine sunny days.

The difference between probabilities expressed as a value between 0 and 1 and these "odds" (if that's what 1:9 is called) is likely part of my confusion in following the article.
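The conversion between the two notations is mechanical, as this small sketch shows. The helper names `odds_to_probability` and `probability_to_odds` are made up for illustration.

```python
from fractions import Fraction

def odds_to_probability(in_favor, against):
    """Odds of in_favor:against -> probability of the event."""
    return Fraction(in_favor, in_favor + against)

def probability_to_odds(p):
    """Probability -> (in_favor, against) in lowest terms."""
    p = Fraction(p)
    return p.numerator, p.denominator - p.numerator

print(odds_to_probability(1, 9))              # 1/10
print(probability_to_odds(Fraction(1, 10)))   # (1, 9)
print(probability_to_odds(Fraction(1, 9)))    # (1, 8)
```

So odds of 1:9 are a one-in-ten chance, while a one-in-nine chance would be odds of 1:8.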

It was actually Professor Brian who demonstrated how useful a simulation can be, while analyzing this bizarre problem:

    Say you know a family has two children, and further that at least one of them is a girl named Florida. What is the probability that they have two girls?

But a simulator, based on a random number generator, seems like a good way to check our work. I still trust my numbers as long as I feel like I know what I am doing. Say we start with a convenient number of days:

  270 days
  243 sunny
   27 rainy

Each friend will be expected to give the same ratio of answers. We call Albert first and he responds:

  True  "no"  on 2/3 of 243 sunny days = 162 days
  False "yes" on 1/3 of 243 sunny days =  81 days
  True  "yes" on 2/3 of  27 rainy days =  18 days
  False "no"  on 1/3 of  27 rainy days =   9 days

So Albert says "yes" on 99 days, of which 81 are sunny and 18 are rainy. We increase our expectation of rain from 10% to 18/99, about 18%.

We expect the same ratio of responses from Betty:

  True  "no"  on 2/3 of 81 sunny days = 54 days
  False "yes" on 1/3 of 81 sunny days = 27 days
  True  "yes" on 2/3 of 18 rainy days = 12 days
  False "no"  on 1/3 of 18 rainy days =  6 days

So Betty says "yes" on 39 days, of which 27 are sunny and 12 are rainy. We increase our expectation of rain from 18/99 to 12/39, about 31%.

We expect the same ratio of responses from Charlie:

  True  "no"  on 2/3 of 27 sunny days = 18 days
  False "yes" on 1/3 of 27 sunny days =  9 days
  True  "yes" on 2/3 of 12 rainy days =  8 days
  False "no"  on 1/3 of 12 rainy days =  4 days

So Charlie says "yes" on 17 days, of which 9 are sunny and 8 are rainy. We increase our expectation of rain from 12/39 to 8/17, about 47%.
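The three rounds of counting can be compressed into a loop. This sketch uses the same 270 days and keeps, after each call, only the days on which that friend says "yes."

```python
rainy, sunny = 27, 243   # 270 days at a 10% chance of rain

for friend in ("Albert", "Betty", "Charlie"):
    # Keep only the days on which this friend says "yes":
    # truthful on 2/3 of rainy days, lying on 1/3 of sunny days.
    rainy = rainy * 2 // 3
    sunny = sunny // 3
    total = rainy + sunny
    print(f"{friend}: {rainy} rainy + {sunny} sunny = {total} yes days, "
          f"P(rain) = {rainy}/{total} = {rainy / total:.0%}")
```

The starting figure of 270 is convenient because it is 10 × 27, so both counts stay whole through three divisions by 3.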

This seems very clear and agrees with the given answer. But I still haven't used Bayes' theorem.

wasoxygen  ·  2760 days ago

Using odds instead of percentages makes this problem simple.

Our estimate of rain in Seattle is 1:9 (one rainy day for every nine sunny days).

Our friends tell us the truth with odds of 2:1 (two truthful reports for each false report).

So after calling one friend and hearing "yes it's raining" we multiply 1:9 by 2:1 and get odds of 2:9 as our new expectation of rain. That's 2/11, about 18%, so we are more confident of rain.

The second friend says "yes" and we multiply 2:9 by another 2:1 to get 4:9, or 4/13, which is about 31%.

The third friend says "yes" and we multiply 4:9 by 2:1 to get 8:9 which is 8/17, the final answer of about 47%.
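The repeated multiplication is short enough to sketch in code. One assumption is worth spelling out: the 2:1 factor here is the ratio P("yes" given rain) / P("yes" given sun) = (2/3) / (1/3), which happens to equal the friends' 2:1 truth-telling odds.

```python
from fractions import Fraction

rain, sun = 1, 9            # prior odds of rain, 1:9
bayes_factor = Fraction(2, 3) / Fraction(1, 3)   # P(yes | rain) / P(yes | sun) = 2

for friend in ("Albert", "Betty", "Charlie"):
    rain *= bayes_factor    # each "yes" doubles the odds in favor of rain
    print(f"{friend}: odds {rain}:{sun}, probability {rain / (rain + sun)}")
```

This prints odds of 2:9, 4:9 and 8:9, with probabilities 2/11, 4/13 and 8/17.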

I have found a way to force Bayes in, but don't know if it's correct. Replacing the A's and B's with more descriptive terms, Bayes' Theorem is

odds of rain, given a "yes" = odds of "yes," given rain × odds of rain / odds of "yes"

So, after hearing the first "yes" we have

odds of rain, given a "yes" = (2:1 the odds we will hear a "yes" on a rainy day) × (1:9 our prior estimate for rain) / (1, our certainty of hearing "yes" because we just talked with Albert and he definitely said "yes")

This gives us 2:9 divided by 1, which is odds of 2:9, the desired 2/11 result. That denominator seems forced and flaky, though, and I am not sure what to do if Albert says "no." Perhaps the odds of rain given a "no" are 1:2 × 1:9 / 1 (dividing by 1 again because we definitely heard "no"), for a total of 1:18, or about a 5% chance of rain.
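For comparison, here is a hedged sketch of how the usual statement of Bayes' theorem would handle this. It is written in probabilities rather than odds. The denominator is then not 1 but the overall chance of hearing "yes," rain or shine, and the numbers are the 10% prior and the 2/3 truth rate used above:

```latex
\begin{align*}
P(\text{rain} \mid \text{yes})
  &= \frac{P(\text{yes} \mid \text{rain})\, P(\text{rain})}{P(\text{yes})}
   = \frac{\frac{2}{3} \cdot \frac{1}{10}}
          {\frac{2}{3} \cdot \frac{1}{10} + \frac{1}{3} \cdot \frac{9}{10}}
   = \frac{2/30}{11/30} = \frac{2}{11} \approx 18\% \\[4pt]
P(\text{rain} \mid \text{no})
  &= \frac{P(\text{no} \mid \text{rain})\, P(\text{rain})}{P(\text{no})}
   = \frac{\frac{1}{3} \cdot \frac{1}{10}}
          {\frac{1}{3} \cdot \frac{1}{10} + \frac{2}{3} \cdot \frac{9}{10}}
   = \frac{1/30}{19/30} = \frac{1}{19} \approx 5\%
\end{align*}
```

Both values agree with the odds results of 2:9 and 1:18. In the odds version the P("yes") term cancels, because it appears in both the rain and the no-rain posterior. That would explain why multiplying the prior odds by 2:1 works without any denominator.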

Odds are good b_b will be able to straighten this out.