What is considered to be the "data" in the M&M solution? (Ch 2: Bayes's Theorem)

Open rigdern opened this issue 2 years ago • 1 comments

The problem hint indicates the trick is defining the hypotheses and data carefully:

Hint: The trick to this question is to define the hypotheses and the data carefully.

The solution clearly states the "hypotheses":

# Hypotheses:
# A: yellow from 94, green from 96
# B: yellow from 96, green from 94

What is the "data" in this solution?


My guess is that the data is the color mix in each bag (e.g., the 1994 bag is 30% brown, the 1996 bag is 24% blue, and so on).

With this definition of the data, the first likelihood would be "the probability of the color mix given that yellow is from 94 and green is from 96". It's not clear to me why that probability would be 0.2*0.2. Maybe I need to think about it more, or maybe I've guessed wrong about what the data is.

rigdern avatar Apr 27 '22 23:04 rigdern

I wouldn't say I have the most thorough understanding of this myself, but I think the data refers to the color mixes, as you said. I would look at probability trees for a more intuitive understanding of the multiplication taking place. Which part of the probability being represented by 0.2*0.2 isn't clear?

HelloWorld183L avatar May 07 '22 20:05 HelloWorld183L

I would say that the mixes in the bags are background information, and the data is the color of the two M&Ms that were drawn. But the line between data and background information can be fuzzy. The important part is that you can compute the likelihood of the outcome under each hypothesis.

AllenDowney avatar Feb 15 '23 16:02 AllenDowney
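
For anyone following along, the likelihood computation described in the last reply can be sketched in a few lines of Python. This is only an illustration, not the book's solution code. The thread itself quotes just the 0.2*0.2 product, the 30% brown figure, and the 24% blue figure, so the yellow and green percentages below (20% and 10% for 1994, 14% and 20% for 1996) and the equal priors are assumptions taken from the usual statement of the M&M problem. The names `mix94`, `mix96`, `prior`, and `likelihood` are made up for this example.

```python
from fractions import Fraction

# Background information: fraction of each color in each bag.
# ASSUMPTION: these values come from the usual statement of the
# M&M problem, not from this thread. Only the colors drawn are listed.
mix94 = {"yellow": Fraction(20, 100), "green": Fraction(10, 100)}
mix96 = {"yellow": Fraction(14, 100), "green": Fraction(20, 100)}

# Hypotheses, as in the solution:
# A: yellow from 94, green from 96
# B: yellow from 96, green from 94
# ASSUMPTION: both hypotheses are equally likely before seeing the data.
prior = {"A": Fraction(1, 2), "B": Fraction(1, 2)}

# Data: one yellow M&M and one green M&M were drawn, one from each bag.
# The likelihood of that outcome under each hypothesis is the product
# of the two independent draws.
likelihood = {
    "A": mix94["yellow"] * mix96["green"],  # 0.2 * 0.2
    "B": mix96["yellow"] * mix94["green"],  # 0.14 * 0.1
}

# Bayes's theorem: multiply prior by likelihood, then normalize.
unnorm = {h: prior[h] * likelihood[h] for h in prior}
total = sum(unnorm.values())
posterior = {h: unnorm[h] / total for h in unnorm}

for h in posterior:
    print(h, float(likelihood[h]), posterior[h])
```

Under these assumptions, the first 0.2 in the likelihood for A is the chance of drawing yellow from the 1994 bag and the second 0.2 is the chance of drawing green from the 1996 bag. The bag mixes only appear as lookup tables, which is the sense in which they act as background information, while the two observed colors are the data.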