Model specific Arguments in batch_cross_validation
Hi Botorch developers,
I came across the functionality for performing LOOCV directly within BoTorch and I like it. Only one thing is missing for me: one cannot pass model-specific arguments into the `batch_cross_validation` method. Is there any reason for this?
Adding the argument `model_args: Optional[Dict[str, Any]] = None` and inserting `kwargs.update(model_args)` at line 153 enables cross-validation also for models with input and output transforms, as shown below:
```python
cv_results = batch_cross_validation(
    model_cls=SingleTaskGP,
    mll_cls=ExactMarginalLogLikelihood,
    cv_folds=cv_folds,
    model_args={
        "outcome_transform": Standardize(m=1, batch_shape=torch.Size([186])),
        "input_transform": Normalize(d=train_X.shape[1], batch_shape=torch.Size([186])),
    },
)
```
This is the modified `batch_cross_validation` method:
```python
def batch_cross_validation(
    model_cls: Type[GPyTorchModel],
    mll_cls: Type[MarginalLogLikelihood],
    cv_folds: CVFolds,
    fit_args: Optional[Dict[str, Any]] = None,
    model_args: Optional[Dict[str, Any]] = None,
    observation_noise: bool = False,
) -> CVResults:
    fit_args = fit_args or {}
    model_args = model_args or {}  # guard against None before the update below
    kwargs = {
        "train_X": cv_folds.train_X,
        "train_Y": cv_folds.train_Y,
        "train_Yvar": cv_folds.train_Yvar,
    }
    kwargs.update(model_args)
    model_cv = model_cls(**_filter_kwargs(model_cls, **kwargs))
    mll_cv = mll_cls(model_cv.likelihood, model_cv)
    mll_cv.to(cv_folds.train_X)
    mll_cv = fit_gpytorch_model(mll_cv, **fit_args)
    # Evaluate on the hold-out set in batch mode
    with torch.no_grad():
        posterior = model_cv.posterior(
            cv_folds.test_X, observation_noise=observation_noise
        )
    return CVResults(
        model=model_cv,
        posterior=posterior,
        observed_Y=cv_folds.test_Y,
        observed_Yvar=cv_folds.test_Yvar,
    )
```
If you think this functionality makes sense, feel free to add it ;)
Best,
Johannes
Hi @jduerholt, just in case you're looking for a way around this without modifying the source code, https://github.com/pytorch/botorch/issues/691#issuecomment-780625400 shows how to do this by wrapping the acquisition function. While writing that comment, I also felt like the ability to pass extra arguments would be useful. Let's see what others think about this.
This sounds like a useful thing to have. Thanks for the suggestion. I'll put up a PR.
Hmm, there is some complication here, since some of the transforms also take a `batch_shape` arg, and it's on the user to figure out how to modify this so that it properly works with the CV internals. Will have to think about this a bit more.
Hi Max,
you are right, the user has to define the `batch_shape` arg, but from my perspective this is not a problem, as they can also do this in the solution depicted above, since they would pass the already instantiated transforms to the method. See the example below:
```python
cv_results = batch_cross_validation(
    model_cls=SingleTaskGP,
    mll_cls=ExactMarginalLogLikelihood,
    cv_folds=cv_folds,
    model_args={
        "outcome_transform": Standardize(m=1, batch_shape=torch.Size([186])),
        "input_transform": Normalize(d=train_X.shape[1], batch_shape=torch.Size([186])),
    },
)
```
But the solution with a "new model" that incorporates the transforms directly, as pointed out by @saitcakmak, is also fine. I only thought that it could be easier if this step were not necessary.
Best,
Johannes
So the other option could be to make the transform API a bit richer, so as to allow programmatically expanding a transform to batch dimensions.
Then one could do something like the following:
```python
# train_X here is batched as `batch_shape x n x d`, so the batch
# shape is everything except the last two dimensions
new_batch_shape = cv_folds.train_X.shape[:-2]
if hasattr(model, "input_transform"):
    new_kwargs["input_transform"] = model.input_transform.expand_batch(new_batch_shape)
if hasattr(model, "outcome_transform"):
    new_kwargs["outcome_transform"] = model.outcome_transform.expand_batch(new_batch_shape)
```
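To make the idea concrete: `expand_batch` is hypothetical and does not exist on BoTorch transforms today. A minimal sketch of the intended semantics on a toy transform class (`ToyTransform` is invented for illustration) could look like this:

```python
import copy

class ToyTransform:
    """Toy stand-in for an input/outcome transform that carries a batch_shape.

    `expand_batch` is hypothetical; a real implementation would also
    re-allocate or expand the transform's internal buffers/parameters.
    """

    def __init__(self, d, batch_shape=()):
        self.d = d
        self.batch_shape = tuple(batch_shape)

    def expand_batch(self, new_batch_shape):
        # Return a copy of the transform set up for the new batch shape,
        # leaving the original instance untouched.
        new = copy.copy(self)
        new.batch_shape = tuple(new_batch_shape)
        return new

tf = ToyTransform(d=3)
batched_tf = tf.expand_batch((186,))  # e.g. 186 LOOCV folds
</antml>```

Returning a fresh copy (rather than mutating in place) matters here, since the user's original transform should remain usable outside the CV call.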
Currently we don't have access to the model inside the `batch_cross_validation` call. We could instead allow the model instance (not the class) to be passed as the arg; then, internally, we could use `model.__class__()` to construct the new batched model.
Thoughts on this?
So you mean to pass not the class but the instantiated model to the `batch_cross_validation` method, and then to expand the transforms and re-instantiate the model with the expanded transforms?
This would be fine for me. One should then also add the possibility to specify the kernel (`covar_module`).
I personally don't like the idea of passing a model instance that does not get used. Since we already pass a `cv_folds` argument which has the training data, passing in a model instance only serves as a container for the transform, which then gets modified within `batch_cross_validation`. I think adding a `model_kwargs` argument to `batch_cross_validation`, passing the transform class as `model_kwargs = {"outcome_transform": Standardize}`, and constructing it within `batch_cross_validation` with the appropriate batch shape is a more elegant solution (it could look like the code in @Balandat's comment). In this scenario, `model_kwargs` could also carry all the other custom arguments, such as `covar_module`.
I like the idea of just passing the transform class as `model_kwargs = {"outcome_transform": Standardize}` and then constructing it within `batch_cross_validation`. It could then also be necessary to be able to pass transform-specific kwargs.
Best,
Johannes
> It could then also be necessary to be able to pass transform-specific kwargs
Yeah, that's going to make the interface a bit clunky; you'd have to do something like
```python
cv_results = batch_cross_validation(
    model_cls=SingleTaskGP,
    mll_cls=ExactMarginalLogLikelihood,
    cv_folds=cv_folds,
    model_args={"outcome_transform": Transform},
    model_arg_args={"outcome_transform": {"arg1": val1}},
)
```