
Why simulate 10,000 draws using the beta distribution to calculate the winner

JotaP opened this issue 8 years ago · 1 comment

Hi @caser, I saw you were the original contributor of the Bayesian statistics method for calculating the probability of an alternative being the winner (https://github.com/splitrb/split/pull/251). First of all, thanks for that awesome feature.

We have some questions regarding the implementation.

Could you give us a little bit of "statistical" input on why you decided to simulate draws from the Beta distribution? Off the top of my head, I understand the simulation is there to produce statistically valid results even when an experiment has just started and we don't yet have much information about each alternative's conversion rate. Is that correct, or is there another reason for the simulation?
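For anyone landing here with the same question, here is a minimal, self-contained sketch of the general technique (not split's actual implementation; the function names and the order-statistics Beta sampler are my own illustration). Each alternative's conversion rate gets a Beta posterior, we draw one rate per alternative per simulation, and the fraction of simulations an alternative wins estimates its probability of being the winner:

```ruby
# Sample Beta(successes + 1, failures + 1), i.e. a Beta posterior under a
# uniform Beta(1, 1) prior. For integer shapes a and b, Beta(a, b) can be
# sampled as the a-th smallest of (a + b - 1) uniform draws.
def sample_beta(successes, failures, rng)
  a = successes + 1
  b = failures + 1
  draws = Array.new(a + b - 1) { rng.rand }
  draws.sort[a - 1]
end

# alternatives: { name => { participants: Integer, conversions: Integer } }
# Returns { name => estimated probability of being the winner }.
def probability_of_winner(alternatives, simulations: 10_000, rng: Random.new)
  wins = Hash.new(0)
  simulations.times do
    # Draw one plausible conversion rate per alternative from its posterior.
    sampled = alternatives.transform_values do |stats|
      sample_beta(stats[:conversions],
                  stats[:participants] - stats[:conversions], rng)
    end
    # The alternative with the highest sampled rate wins this simulation.
    wins[sampled.max_by { |_, rate| rate }.first] += 1
  end
  wins.transform_values { |count| count.to_f / simulations }
end
```

The point of the simulation is that the posterior distributions encode the uncertainty from small samples directly: early in an experiment the Beta posteriors are wide and overlap heavily, so no alternative can reach a high win probability by chance alone.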

I would also like to know why exactly 10,000 simulated draws? Should/could we be playing with other values for the number of simulations (say 1, 100, 500, 1,000, or 100,000)? What would be the consequences of either lowering or raising the number of simulation runs?
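One rough way to reason about the draw count (my own illustration, not something from the split codebase): the estimated win probability is a sample proportion, so its Monte Carlo standard error shrinks like the square root of the number of simulations. The helper name below is hypothetical:

```ruby
# Standard error of a Monte Carlo estimate of a win probability p
# based on n independent simulations: sqrt(p * (1 - p) / n).
def win_prob_standard_error(p, n_simulations)
  Math.sqrt(p * (1 - p) / n_simulations)
end

# Worst case is p = 0.5, where the variance p * (1 - p) is largest.
[100, 1_000, 10_000, 100_000].each do |n|
  printf("n=%-7d se<=%.4f\n", n, win_prob_standard_error(0.5, n))
end
```

At 10,000 draws the worst-case standard error is about 0.005, i.e. the reported probability is accurate to within roughly half a percentage point, which is usually plenty for a dashboard; 100 draws would give errors of around five points, while 100,000 would mostly buy extra CPU time.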

Another question (from a statistical point of view): using the probability of being the winner for a given experiment, what value should we wait for before choosing a definitive winning alternative? Should we wait until some variant reaches, say, a 95% probability of being the winner, or something else? Any input you can give us on declaring a winner with the Beta distribution method would be much appreciated.

Thanks to all the split team for this awesome framework ;)

JotaP avatar Nov 03 '16 17:11 JotaP

I am also interested in this question @JotaP @caser

AlejandroJL avatar Nov 04 '16 08:11 AlejandroJL