
Construct the Posterior object for DeepGP

Rimabaidi opened this issue 3 years ago • 2 comments

Issue description

Hello, I built a multi-fidelity deep GP (https://arxiv.org/pdf/1903.07320.pdf) on top of the DeepGP class. After fitting the GP, I draw 10 samples from the predictive distribution. I need to build a Posterior object so I can pass it to the acquisition function. I understand that I need to average the mean and the variance (covariance?) of the prediction and then pass them to the MVN, but I am a bit confused: just from the shapes, I can tell it is not behaving correctly. Here is my code for the posterior. I can also provide the code for the GP and the example I am working on.

Code example

from typing import Any, Union

import torch
from botorch.models.utils import gpt_posterior_settings
from botorch.posteriors.gpytorch import GPyTorchPosterior
from gpytorch.distributions import MultivariateNormal
from torch import Tensor


def posterior(
    self, X: Tensor, observation_noise: Union[bool, Tensor] = False, **kwargs: Any
) -> GPyTorchPosterior:
    self.eval()  # make sure model is in eval mode
    with gpt_posterior_settings():
        mvn = self(X)
        # What I add to the standard posterior: average over the DeepGP
        # sample dimension (dim 0) and rebuild a single MVN from the
        # averaged mean and covariance.
        mvn_mean = torch.mean(mvn.mean, dim=0)
        mvn_cov = torch.mean(mvn.covariance_matrix, dim=0)
        mvn = MultivariateNormal(mvn_mean, mvn_cov)
        if observation_noise is not False:
            if torch.is_tensor(observation_noise):
                # TODO: Make sure observation noise is transformed correctly
                self._validate_tensor_args(X=X, Y=observation_noise)
                if observation_noise.size(-1) == 1:
                    observation_noise = observation_noise.squeeze(-1)
                mvn = self.likelihood(mvn, X, noise=observation_noise)
            else:
                mvn = self.likelihood(mvn, X)
    posterior = GPyTorchPosterior(mvn=mvn)
    return posterior
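
For reference, a minimal sketch of the shape behavior being described (assuming the DeepGP propagates 10 samples through its hidden layers, as in the GPyTorch DeepGP tutorial; the exact shapes here are an assumption, not taken from the original code):

mvn = model(X)                              # X has shape n x d
print(mvn.mean.shape)                       # torch.Size([10, n]): sample dim first
print(mvn.covariance_matrix.shape)          # torch.Size([10, n, n])
print(torch.mean(mvn.mean, dim=0).shape)    # torch.Size([n]): sample dim averaged out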

System Info

  • BoTorch 0.4.0
  • GPyTorch 1.4.1
  • PyTorch 1.8.1
  • OS: Windows 10

Rimabaidi avatar May 26 '21 07:05 Rimabaidi

Would you mind providing a full (non-)working example so we can debug from there? At a high level your approach seems to make sense, but I'm worried that you're simply averaging the variances rather than applying the law of total variance, which also accounts for the spread of the per-sample means. Another option would be to compute the acquisition function in batch for each sample and then average its values. That's technically not the acquisition function computed on the model posterior, but in other contexts (e.g., BayesOpt using fully Bayesian inference) we've seen this work well in practice.

Balandat avatar May 31 '21 17:05 Balandat
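
For reference, a minimal sketch of combining the per-sample MVNs via the law of total variance, Cov(Y) = E[Cov(Y | sample)] + Cov(E[Y | sample]). It reuses the mvn from the posterior snippet above and assumes its leading dimension is the DeepGP sample dimension; this is an illustration of the idea, not a verified fix:

import torch
from gpytorch.distributions import MultivariateNormal

means = mvn.mean                  # num_samples x n
covs = mvn.covariance_matrix      # num_samples x n x n

mix_mean = means.mean(dim=0)      # E[E[Y | sample]]
centered = means - mix_mean       # deviations of the per-sample means
# E[Cov(Y | sample)] + Cov(E[Y | sample])
mix_cov = covs.mean(dim=0) + (
    centered.unsqueeze(-1) @ centered.unsqueeze(-2)
).mean(dim=0)
mixture_mvn = MultivariateNormal(mix_mean, mix_cov)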

Thank you for your answer. I managed to make it work, actually, but I wasn't aware of the law of total variance. I now average the acquisition function's values, but the performance doesn't seem very good (compared to KG with MFSingleTaskGP), so maybe I am doing something wrong. I uploaded the code (I hope you can use it). The script I am running to test is under tutorials/currin_mf_bo_ei.py. I would be grateful if you could take a look and let me know if I am missing something :) Code.zip

Rimabaidi avatar Jun 01 '21 14:06 Rimabaidi
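
For context, a hedged sketch of averaging acquisition values over the DeepGP sample dimension. The names acqf, model, train_Y, and candidate_X are hypothetical (not taken from the attached Code.zip), and it assumes the model's posterior keeps the sample dimension as a batch dimension so the acquisition values come back batched:

from botorch.acquisition import ExpectedImprovement

acqf = ExpectedImprovement(model=model, best_f=train_Y.max())
vals = acqf(candidate_X)    # one EI value per DeepGP sample (batch dim)
avg_val = vals.mean(dim=0)  # MC average over the sample dimension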

Closing as I assume this is no longer helpful after a year and a half, but let us know if by any chance you are still stuck on this.

esantorella avatar Jan 30 '23 19:01 esantorella