
How to set train_Yvar with A/B testing data if I want the FixedNoiseGP model

Open cyrilmyself opened this issue 2 years ago • 3 comments

As I asked before in https://github.com/facebook/Ax/issues/1812, I need to use the FixedNoiseGP model:

import numpy as np
import pandas as pd
from typing import Dict, Tuple
n_users = 20
n_control = n_users // 2
n_test = n_users // 2

def fetch_data_from_experiment(parameters: int) -> pd.DataFrame:
    """
    Get A/B test data from wherever it is stored and return
    a Pandas DataFrame where the rows are at the unit level,
    each column is a different outcome, and the units included
    are the control group and any test group users who
    received the treatment specified by `parameters`. A column `group`
    must specify who is in the test group and who is in the control group.

    For this example, this is fake data. Realistically, you would
    need to run a real A/B test and then load the data from
    some database.
    """
    raw_data = pd.DataFrame(
        {
            "metric_1": np.random.normal(0, 1, n_users),
            "metric_2": np.random.normal(0, 1, n_users),
            "group": ["control" for _ in range(n_control)] + ["test" for _ in range(n_test)]
        }
    )
    return raw_data
    

def evaluate(parameters) -> Dict[str, Tuple[float, float]]:
    """
    Return the difference in means between units assigned to treatment
    `parameters` and the control group, and the standard error of that
    difference.
    """
    raw_data = fetch_data_from_experiment(parameters)
    metrics = ["metric_1", "metric_2"]
    grouped = raw_data.groupby("group")
    # First calculate means and variances separately for the test and control groups
    means = grouped[metrics].mean()
    variances = grouped[metrics].var()
    n = grouped.agg(n=("group", "count"))["n"]
    # Then get the difference in means between groups, and standard error of that difference
    difference_in_means = means.loc["test"] - means.loc["control"]
    variance_of_difference_in_means = variances.loc["test"] / n.loc["test"] + variances.loc["control"] / n.loc["control"]
    sem = np.sqrt(variance_of_difference_in_means)
    return {
        metric: (difference_in_means.loc[metric], sem.loc[metric])
        for metric in metrics
    }
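For anyone wanting to sanity-check the statistics above, here is a minimal, self-contained sketch (seeded fake data, illustrative metric name) of the same difference-in-means computation, confirming that the SEM returned to Ax is the square root of the variance of the difference:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10
df = pd.DataFrame({
    "metric_1": rng.normal(0, 1, 2 * n),
    "group": ["control"] * n + ["test"] * n,
})

g = df.groupby("group")["metric_1"]
diff = g.mean().loc["test"] - g.mean().loc["control"]
# Variance of the difference of two independent sample means
var_of_diff = g.var().loc["test"] / n + g.var().loc["control"] / n
sem = np.sqrt(var_of_diff)

# evaluate() returns (diff, sem); a model's observed noise would be sem ** 2
assert np.isclose(sem ** 2, var_of_diff)
print(float(diff), float(sem))
```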

The code above returns a standard error, so in the code below I did not set `"surrogate": Surrogate(FixedNoiseGP)`:

client = AxClient(
                generation_strategy=GenerationStrategy(steps=[
                    GenerationStep(model=Models.SOBOL, num_trials=exp_group_nums * prepare_rounds,
                                   min_trials_observed=exp_group_nums),
                    GenerationStep(model=Models.BOTORCH_MODULAR, num_trials=-1, min_trials_observed=exp_group_nums,
                                   model_kwargs={
                                                 "botorch_acqf_class": acqf}
                                   )
                ])
            )

I think this is right.

If I set `"surrogate": Surrogate(FixedNoiseGP)` in `model_kwargs` as below, I need to set a variance rather than a standard error in the evaluate function, am I right?

client = AxClient(
                generation_strategy=GenerationStrategy(steps=[
                    GenerationStep(model=Models.SOBOL, num_trials=exp_group_nums * prepare_rounds,
                                   min_trials_observed=exp_group_nums),
                    GenerationStep(model=Models.BOTORCH_MODULAR, num_trials=-1, min_trials_observed=exp_group_nums,
                               model_kwargs={"surrogate": Surrogate(FixedNoiseGP),
                                             "botorch_acqf_class": acqf}
                                   )
                ])
            )

cyrilmyself avatar Nov 03 '23 10:11 cyrilmyself

Why do you need to set the model type manually? As mentioned on the previous issue, Ax will choose the FixedNoiseGP (which, as of the latest BoTorch release, has actually been deprecated and its functionality merged into the SingleTaskGP) automatically if you provide noise observations: https://github.com/facebook/Ax/issues/1812#issuecomment-1705654530

Another response on the previous issue also explains that yes, the noise is the variance of the observation provided to ax: https://github.com/facebook/Ax/issues/1812#issuecomment-1702880651

If you're using FixedNoiseGP, you need to pass a variance rather than a standard error -- the square of the "sem" computed above.

Balandat avatar Nov 03 '23 13:11 Balandat

@Balandat hi, thank you for your reply. I am confused because in #1812, @esantorella used a standard error in the evaluate function. When should I use a variance and when should I use a standard error?

cyrilmyself avatar Nov 06 '23 03:11 cyrilmyself

You're right, I may have missed some context above. On the BoTorch side, the model takes the variance of the observation, but the interface on the Ax side does indeed expect the standard error (which will, under the hood, be converted to the variance when the model gets instantiated). So regardless of the model, always use the standard error on the Ax side (including in the evaluation function).
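In other words, a minimal sketch of that conversion with made-up numbers (the dict shape mirrors the `evaluate` return value above; the values are hypothetical, and this is an illustration of the SEM-to-variance relationship, not Ax internals verbatim):

```python
# What evaluate() hands to Ax: {metric: (mean_difference, sem)}
raw_data = {"metric_1": (0.12, 0.5), "metric_2": (-0.05, 0.25)}

# Under the hood, each SEM is squared into the variance the model consumes
observed_noise = {m: sem ** 2 for m, (_, sem) in raw_data.items()}
print(observed_noise)  # {'metric_1': 0.25, 'metric_2': 0.0625}
```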

Balandat avatar Nov 06 '23 15:11 Balandat

Closing due to inactivity, since this appears to have been resolved. @cyrilmyself feel free to re-open or open a new issue for further help :)

mgarrard avatar Jul 24 '24 20:07 mgarrard