Construct the Posterior object for DeepGP
Issue description
Hello, I built a multi-fidelity deep GP (https://arxiv.org/pdf/1903.07320.pdf) on top of the DeepGP class. After fitting the GP, I get 10 samples from the predictive distribution. I need to build a Posterior object in order to pass it to the acquisition function. I know I need to average the mean and the variance (covariance?) of the prediction and then pass them to the MVN, but I am a bit confused: just from the shape, I can tell it is not behaving correctly. Here is my code for the posterior. I can also provide the code for the GP and the example I am working on.
Code example
from typing import Any, Union

import torch
from botorch.models.utils import gpt_posterior_settings
from botorch.posteriors.gpytorch import GPyTorchPosterior
from gpytorch.distributions import MultivariateNormal
from torch import Tensor


def posterior(
    self, X: Tensor, observation_noise: Union[bool, Tensor] = False, **kwargs: Any
) -> GPyTorchPosterior:
    self.eval()  # make sure model is in eval mode
    with gpt_posterior_settings():
        mvn = self(X)
        # What I add to the standard posterior: average the per-sample
        # means and covariances, then rebuild a single MVN
        mvn_mean = torch.mean(mvn.mean, dim=0)
        mvn_cov = torch.mean(mvn.covariance_matrix, dim=0)
        mvn = MultivariateNormal(mvn_mean, mvn_cov)
        if observation_noise is not False:
            if torch.is_tensor(observation_noise):
                # TODO: Make sure observation noise is transformed correctly
                self._validate_tensor_args(X=X, Y=observation_noise)
                if observation_noise.size(-1) == 1:
                    observation_noise = observation_noise.squeeze(-1)
                mvn = self.likelihood(mvn, X, noise=observation_noise)
            else:
                mvn = self.likelihood(mvn, X)
    posterior = GPyTorchPosterior(mvn=mvn)
    return posterior
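
For reference, a minimal shape sanity check for the snippet above (my assumptions: model is the fitted DeepGP instance, X is an n x d tensor, and the DeepGP forward pass returns a batched MVN with the 10 samples as the leading batch dimension):

# Hypothetical shape check, assuming 10 DeepGP samples and n test points.
mvn = model(X)             # batched MVN: mean (10, n), covariance (10, n, n)
post = model.posterior(X)  # after averaging over dim 0
assert post.mvn.mean.shape == mvn.mean.shape[1:]                        # (n,)
assert post.mvn.covariance_matrix.shape == mvn.covariance_matrix.shape[1:]  # (n, n)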
System Info
- BoTorch 0.4.0
- GPyTorch 1.4.1
- PyTorch 1.8.1
- OS: Windows 10
Would you mind providing a full (non-)working example so we can debug from there? At a high level your approach seems to make sense, but I'm worried that you're simply averaging the variances rather than applying the law of total variance. Another option would be to compute the acquisition function in batch for each sample and then average its values. That's technically not the acquisition function computed on the model posterior, but in other contexts (e.g., BayesOpt using fully Bayesian inference) we've seen this work well in practice.
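
To illustrate the law-of-total-variance point, here is a minimal sketch (not part of the BoTorch API; mixture_mvn is a hypothetical helper), assuming mvn.mean has shape (num_samples, n) and mvn.covariance_matrix has shape (num_samples, n, n):

import torch
from gpytorch.distributions import MultivariateNormal

def mixture_mvn(mvn: MultivariateNormal) -> MultivariateNormal:
    means = mvn.mean              # (S, n), one row per DeepGP sample
    covs = mvn.covariance_matrix  # (S, n, n)
    mixture_mean = means.mean(dim=0)
    # Law of total variance: Cov[f] = E[Cov[f | s]] + Cov[E[f | s]]
    dev = means - mixture_mean    # (S, n)
    cov_of_means = torch.einsum("si,sj->ij", dev, dev) / means.shape[0]
    mixture_cov = covs.mean(dim=0) + cov_of_means
    return MultivariateNormal(mixture_mean, mixture_cov)

The extra Cov[E[f | s]] term captures the spread of the per-sample means, which is exactly what averaging the covariances alone misses.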
Thank you for your answer. I actually managed to make it work, but I wasn't aware of the law of total variance. I now average the acquisition function's values, but the performance seems not so good (compared to KG with MFSingleTaskGP). Maybe I am doing something wrong. I uploaded the code (I hope you can use it). The script I am running to test is under tutorials/currin_mf_bo_ei.py. I would be grateful if you could take a look and let me know if I am missing something :) Code.zip
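
For clarity, the sample-then-average approach I'm using looks roughly like this (a hypothetical sketch; make_acqf is an assumed factory that builds the acquisition function tied to the i-th posterior sample):

import torch

def averaged_acq_value(make_acqf, X: torch.Tensor, num_samples: int = 10) -> torch.Tensor:
    # Evaluate one acquisition function per DeepGP posterior sample,
    # then average the resulting values across samples.
    values = torch.stack([make_acqf(i)(X) for i in range(num_samples)])
    return values.mean(dim=0)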
Closing, as I assume this is no longer needed after a year and a half, but let us know if by any chance you are still stuck on this.