retail-demo-store
Update the `beta distribution` parameters
Update the beta distribution parameters in the `_select_variation_index`
method to avoid a bias towards lower success probabilities.
The current specification of the beta distribution:
theta = np.random.beta(conversions + 1, exposures + 1)
treats every exposure as a failure. This overstates the failure count and therefore undervalues the success probability of every variation. The effect is pronounced for variations with very high baseline conversion rates and less severe for variations with extremely low conversion rates.
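To make the size of the bias concrete, the sketch below compares the mean of the beta distribution under both parameterizations, using the closed form mean of Beta(a, b), which is a / (a + b). The counts and the helper name `beta_mean` are made up for illustration and are not taken from the repository.

```python
def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution: alpha / (alpha + beta)."""
    return alpha / (alpha + beta)


# Hypothetical counts: one high-converting and one low-converting variation.
examples = {"high": (900, 1000), "low": (10, 1000)}

results = {}
for name, (conversions, exposures) in examples.items():
    # Current parameterization: every exposure is counted as a failure.
    current = beta_mean(conversions + 1, exposures + 1)
    # Proposed parameterization: failures = exposures - conversions.
    proposed = beta_mean(conversions + 1, exposures - conversions + 1)
    results[name] = (current, proposed)
    print(f"{name}: current={current:.3f} proposed={proposed:.3f}")
```

With these assumed counts, the high-converting variation (observed rate 0.90) has a mean of roughly 0.47 under the current parameters and roughly 0.90 under the proposed ones, while the low-converting variation barely moves.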
Issue #, if available: Similar to: #643
Description of changes:
Traditionally, the Thompson Sampling algorithm for the Bernoulli bandit is:
\begin{align*}
1: & \text{for } t = 1, 2, \ldots \text{ do:} \\
2: & \quad \quad \text{Sample model:} \\
3: & \quad \quad \text{for } k = 1 \text{ to } K \text{ do:} \\
4: & \quad \quad \quad \text{Sample } \theta_k \sim \text{beta}(\alpha_k, \beta_k) \\
5: & \quad \quad \text{end for} \\
6: & \quad \quad \text{Select and apply action:} \\
7: & \quad \quad x_t \leftarrow \arg\max_k \theta_k \\
8: & \quad \quad \text{Apply } x_t \text{ and observe } r_t \\
9: & \quad \quad \text{Update distribution:} \\
10: & \quad \quad (\alpha_{x_t}, \beta_{x_t}) \leftarrow (\alpha_{x_t} + r_t, \beta_{x_t} + 1 - r_t) \\
11: & \text{end for}
\end{align*}
Where α and β are the parameters of each arm, i.e. the success and failure counts, or equivalently the number of conversions and non-conversions, respectively. In terms of the tracked counts:
non-conversions (β) = exposures - conversions
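A minimal sketch of the sampling step with the corrected parameters might look like the following. It is a standalone illustration, not the repository's `_select_variation_index` method: the function name, the array inputs, the optional `rng` argument, and the clamping of negative failure counts to zero are all assumptions. The `+ 1` on each parameter is kept from the current specification and corresponds to a uniform Beta(1, 1) prior.

```python
import numpy as np


def select_variation_index(conversions, exposures, rng=None):
    """Pick a variation by Thompson Sampling (hypothetical standalone helper).

    conversions, exposures: per-variation counts, in the same order.
    rng: optional numpy Generator, so callers can seed the draw.
    """
    rng = rng if rng is not None else np.random.default_rng()
    conversions = np.asarray(conversions, dtype=float)
    exposures = np.asarray(exposures, dtype=float)

    # Failures are the exposures that did not convert. Clamping at zero is an
    # assumption, guarding against counts that were recorded out of order.
    failures = np.maximum(exposures - conversions, 0.0)

    # One draw per variation: theta_k ~ Beta(successes + 1, failures + 1).
    theta = rng.beta(conversions + 1, failures + 1)

    # Select the variation with the largest sampled success probability.
    return int(np.argmax(theta))


# Usage with made-up counts: variation 0 converts at 90%, variation 1 at 10%.
chosen = select_variation_index([900, 100], [1000, 1000],
                                rng=np.random.default_rng(0))
print(chosen)
```

With counts this far apart the two distributions barely overlap, so the first variation is selected on essentially every draw. With sparse data the draws overlap more, which is what gives the algorithm its exploration.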
Description of testing performed to validate your changes (required if pull request includes CloudFormation or source code changes):
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.