How to set train_Yvar with A/B testing data when using the FixedNoiseGP model
As I asked before in https://github.com/facebook/Ax/issues/1812, I need to use the FixedNoiseGP model.
import numpy as np
import pandas as pd
from typing import Dict, Tuple

n_users = 20
n_control = n_users // 2
n_test = n_users // 2

def fetch_data_from_experiment(parameters: int) -> pd.DataFrame:
    """
    Get A/B test data from wherever it is stored and return
    a Pandas DataFrame where the rows are at the unit level,
    each column is a different outcome, and the units included
    are the control group and any test-group users who
    received the treatment specified by `parameters`. A column `group`
    must specify who is in the test group and who is in the control group.
    For this example, this is fake data. Realistically, you would
    need to run a real A/B test and then load the data from
    some database.
    """
    raw_data = pd.DataFrame(
        {
            "metric_1": np.random.normal(0, 1, n_users),
            "metric_2": np.random.normal(0, 1, n_users),
            "group": ["control"] * n_control + ["test"] * n_test,
        }
    )
    return raw_data
def evaluate(parameters) -> Dict[str, Tuple[float, float]]:
    """
    Return the difference in means between units assigned to treatment
    `parameters` and the control group, and the standard error of that
    difference.
    """
    raw_data = fetch_data_from_experiment(parameters)
    metrics = ["metric_1", "metric_2"]
    grouped = raw_data.groupby("group")
    # First calculate means and variances separately for the test and control groups
    means = grouped[metrics].mean()
    variances = grouped[metrics].var()
    n = grouped.agg(n=("group", "count"))["n"]
    # Then get the difference in means between groups, and the standard error of that difference
    difference_in_means = means.loc["test"] - means.loc["control"]
    variance_of_difference_in_means = (
        variances.loc["test"] / n.loc["test"] + variances.loc["control"] / n.loc["control"]
    )
    sem = np.sqrt(variance_of_difference_in_means)
    return {
        metric: (difference_in_means.loc[metric], sem.loc[metric])
        for metric in metrics
    }
The code above supplies a standard error, so in the code below I did not set `"surrogate": Surrogate(FixedNoiseGP)`:
client = AxClient(
    generation_strategy=GenerationStrategy(steps=[
        GenerationStep(
            model=Models.SOBOL,
            num_trials=exp_group_nums * prepare_rounds,
            min_trials_observed=exp_group_nums,
        ),
        GenerationStep(
            model=Models.BOTORCH_MODULAR,
            num_trials=-1,
            min_trials_observed=exp_group_nums,
            model_kwargs={"botorch_acqf_class": acqf},
        ),
    ])
)
I think that is right.
If I instead set `"surrogate": Surrogate(FixedNoiseGP)` in the `model_kwargs`, as below, do I need to return a variance rather than a standard error from the evaluate function?
client = AxClient(
    generation_strategy=GenerationStrategy(steps=[
        GenerationStep(
            model=Models.SOBOL,
            num_trials=exp_group_nums * prepare_rounds,
            min_trials_observed=exp_group_nums,
        ),
        GenerationStep(
            model=Models.BOTORCH_MODULAR,
            num_trials=-1,
            min_trials_observed=exp_group_nums,
            model_kwargs={
                "surrogate": Surrogate(FixedNoiseGP),
                "botorch_acqf_class": acqf,
            },
        ),
    ])
)
Why do you need to set the model type manually? As mentioned on the previous issue, Ax will choose the FixedNoiseGP (which as of the latest BoTorch release was actually deprecated, and its functionality merged into the SingleTaskGP) automatically if you provide noise observations: https://github.com/facebook/Ax/issues/1812#issuecomment-1705654530
Another response on the previous issue also explains that yes, the noise is the variance of the observation provided to ax: https://github.com/facebook/Ax/issues/1812#issuecomment-1702880651
If you're using FixedNoiseGP, you need to pass a variance rather than a standard error -- the square of the "sem" computed above.
@Balandat hi, thank you for your reply. I am confused because in #1812, @esantorella used a standard error in the evaluate function. When should I use a variance, and when should I use a standard error?
You're right, I may have missed some context above. On the BoTorch side, the model takes the variance of the observation, but the interface on the Ax side does indeed expect the standard error (which will, under the hood, be converted to the variance when the model gets instantiated). So, regardless of the model, always use the standard error on the Ax side (including in the evaluation function).
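As a concrete illustration of that conversion (a hypothetical numeric sketch, not Ax's actual internals): if the evaluation function reports SEMs, the noise level a fixed-noise GP consumes as `train_Yvar` is simply their square:

```python
import numpy as np

# Hypothetical SEMs reported to Ax by the evaluation function
sems = np.array([0.10, 0.25, 0.40])

# Under the hood, Ax squares each SEM to get the observation variance
# that a fixed-noise GP would receive as train_Yvar
train_yvar = sems ** 2

print(train_yvar)
```

So the same evaluation function works for both cases: report SEMs to Ax, and the squaring happens when the model is instantiated.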
Closing due to lack of activity and resolution seeming to have been made, @cyrilmyself feel free to re-open or open a new issue for further help :)