pymc-examples
Unused prior in Bayesian AB testing case study
Notebook title: What is A/B testing? Notebook url: https://github.com/pymc-devs/pymc-examples/blob/main/examples/case_studies/bayesian_ab_testing.ipynb
Issue description
In the RevenueModel class the converted object goes unused:
converted = pm.Binomial(
    "converted", n=visitors, p=theta, observed=purchased, shape=num_variants
)
revenue = pm.Gamma(
    "revenue", alpha=purchased, beta=lam, observed=total_revenue, shape=num_variants
)
The following line uses purchased (which is supposed to represent collected data) as alpha - I'm guessing that should be converted?
Expected output
I'm a beginner Bayesian, so I'm not sure if the output is unexpected or not.
Proposed solution
In a local copy of the code, I replaced purchased with converted + 1e-8 (plus a small epsilon) in the revenue prior, and the code ran pretty much the same as before.
This is an interesting observation! In particular because of the model specification
I would love to hear more opinions :)
I don't have any domain knowledge regarding this model but I suspect you are technically right.
Using
pm.Binomial("converted", n=visitors, p=theta, observed=purchased, shape=num_variants)
pm.Gamma("revenue", alpha=purchased, beta=lam, observed=total_revenue, shape=num_variants)
as it is done now is not exactly the same as doing
converted = pm.Binomial("converted", n=visitors, p=theta, observed=purchased, shape=num_variants)
pm.Gamma("revenue", alpha=converted, beta=lam, observed=total_revenue, shape=num_variants)
However, the two models are equivalent for all tasks and uses in that notebook. Both converted and revenue have an observed kwarg, so they are likelihood terms and do not define priors. They are not sampled during pm.sample, so using converted or purchased is equivalent when it comes to posterior sampling; the same goes for the revenue and total_revenue variables.
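One way to see why the posterior is unaffected, sketched under the assumption that theta and lam have independent priors (the priors themselves are not shown in this thread): with both likelihood terms observed, the density that pm.sample explores is

$$
p(\theta, \lambda \mid \text{data}) \propto p(\theta)\, p(\lambda) \prod_{i} \text{Binomial}(\text{purchased}_i \mid \text{visitors}_i, \theta_i)\; \text{Gamma}(\text{total\_revenue}_i \mid \alpha_i, \lambda_i)
$$

Because converted is observed, its value is pinned to purchased, so $\alpha_i = \text{converted}_i$ and $\alpha_i = \text{purchased}_i$ give exactly the same expression.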
The differences between the two models are only relevant when it comes to prior/posterior predictive sampling. From what I can see, no posterior predictive sampling happens anywhere in the notebook. And while samples from the prior predictive are generated, their values are not reported anywhere. The only value reported is reluplift_1, which depends on theta and lam: it is a deterministic of two prior variables.
If we sampled from the posterior predictive, we would be doing the following. For every posterior sample, take the needed variables and generate a sample for converted from a Binomial distribution with the provided n and p (the observed values are ignored here). That is, each posterior sample yields one draw of converted, and each draw comes from a slightly different Binomial distribution, because p changes from one posterior sample to the next while n stays fixed at visitors. The two alternatives diverge at the next step, when we generate samples for revenue:
- If alpha=purchased, beta will be different for each posterior sample, being the corresponding lam value, but alpha will not.
- If alpha=converted, both alpha and beta will be different for each posterior sample. beta will still be the corresponding lam, and alpha will now be the values we have just generated for converted. Because of the random sampling step involved in generating converted, and because the parameters of its distribution differ from draw to draw, these values will differ significantly from keeping alpha fixed at purchased.
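The two alternatives can be mimicked outside PyMC with plain NumPy. This is only a rough sketch: the data values and the stand-in "posterior" draws for theta and lam below are made up for illustration and are not taken from the notebook; in practice they would come from the trace returned by pm.sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data for two variants (not the notebook's values).
visitors = np.array([1000, 1000])
purchased = np.array([100, 120])

# Stand-ins for posterior draws of theta and lam (assumed, for illustration).
n_draws = 4000
theta_draws = rng.beta(1 + purchased, 1 + visitors - purchased, size=(n_draws, 2))
lam_draws = rng.gamma(shape=400.0, scale=1 / 400.0, size=(n_draws, 2))  # centred near 1

# Shared step: one Binomial draw of converted per posterior sample.
converted_draws = rng.binomial(visitors, theta_draws)

# alpha=purchased: alpha is fixed, only beta=lam changes from draw to draw.
# NumPy's gamma takes a scale, so the rate lam enters as 1 / lam.
revenue_fixed_alpha = rng.gamma(shape=purchased, scale=1 / lam_draws)

# alpha=converted: alpha and beta both change from draw to draw.
revenue_sampled_alpha = rng.gamma(shape=converted_draws, scale=1 / lam_draws)

print("std with alpha=purchased:", revenue_fixed_alpha.std(axis=0))
print("std with alpha=converted:", revenue_sampled_alpha.std(axis=0))
```

With these made-up numbers, the predictive draws for revenue are noticeably more spread out when alpha is resampled, since the uncertainty in converted is propagated on top of the uncertainty in lam.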
Note: making the change you suggest could also break sample_prior_predictive or sample_posterior_predictive, because the samples generated for converted may be 0, which I think is not valid for alpha in gamma distributions. You might need some "add 1e-10" trick so that the domain of the values in converted is $\gt 0$ instead of $\geq 0$.
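To get a feel for how often a zero could show up, here is a small NumPy sketch. The visitor count and conversion rate are hypothetical, chosen to make zeros common, and the epsilon is the same kind of offset mentioned above rather than a recommended value.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical low-traffic variant: few visitors, small conversion rate.
visitors, theta = 20, 0.02

# Probability that a single Binomial(n, p) draw is exactly zero.
p_zero = (1 - theta) ** visitors

# Simulated draws of converted and the observed fraction of zeros.
converted = rng.binomial(visitors, theta, size=10_000)
frac_zero = np.mean(converted == 0)

# Shift the draws so that every value is strictly positive before
# using them as the Gamma shape parameter alpha.
eps = 1e-10
alpha = converted + eps

print(p_zero, frac_zero, alpha.min() > 0)
```

With realistic traffic the chance of a zero is much smaller, but prior predictive draws of theta can sit close to 0, so the guard is cheap insurance.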