[Bug] Inconsistent behavior: `standardize` vs. `Standardize` with n < 2
🐛 Bug
I am currently playing around with situations where there is no training data available yet and noticed that the behaviors of `utils.standardize` and `transforms.Standardize` are inconsistent.
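For context, here is a plain-Python sketch (not BoTorch's actual implementation) of why any standardization breaks down with fewer than two points: the sample standard deviation uses Bessel's correction and divides by n - 1, so it is undefined for n < 2, which is where NaNs can enter the computation.

```python
import math

def sample_std(ys):
    """Sample standard deviation with Bessel's correction (divides by n - 1).

    For n < 2 the estimator is undefined (0/0 for n == 1, empty sum for
    n == 0), mirroring how torch.std returns NaN in these cases.
    """
    n = len(ys)
    if n < 2:
        return float("nan")
    mean = sum(ys) / n
    return (sum((y - mean) ** 2 for y in ys) / (n - 1)) ** 0.5

print(math.isnan(sample_std([])))     # True
print(math.isnan(sample_std([1.0])))  # True
print(sample_std([1.0, 3.0]))         # ~1.4142
```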
To reproduce
Here is a minimal example adapted from the code on the landing page. When you run the original code for the GP creation (in the comments), everything works fine. However, when you run the displayed version, you get the error shown below.
Code snippet to reproduce
import torch
from botorch.acquisition import UpperConfidenceBound
from botorch.models import SingleTaskGP
from botorch.models.transforms import Standardize
from botorch.optim import optimize_acqf
from botorch.utils import standardize
# >>>>> changed code
# Unlike in the main page example, we set the number of training points to 0
train_X = torch.rand(0, 2)
# <<<<< changed code
Y = 1 - torch.linalg.norm(train_X - 0.5, dim=-1, keepdim=True)
Y = Y + 0.1 * torch.randn_like(Y)
# >>>>> original version
# train_Y = standardize(Y)
# gp = SingleTaskGP(train_X, train_Y)
# -----
train_Y = Y
gp = SingleTaskGP(train_X, train_Y, outcome_transform=Standardize(1))
# <<<<< version throwing error
# >>>>> changed code
# Because there is no training data, we do not fit the model parameters
gp.eval()
# <<<<< changed code
UCB = UpperConfidenceBound(gp, beta=0.1)
bounds = torch.stack([torch.zeros(2), torch.ones(2)])
candidate, acq_value = optimize_acqf(
    UCB,
    bounds=bounds,
    q=1,
    num_restarts=5,
    raw_samples=20,
)
Stack trace/error message
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
Expected Behavior
In both cases, since there is no training data, the GP posterior should simply match the GP prior, and the standardization applied along the way should not mess with the computation. I haven't checked what exact logic is applied internally when no data / only one data point is passed to the transformation, but my intuition tells me that what should happen is:
- for one data point, standardization should simply use the point's value as the mean and impose some default value (e.g. 1) for the estimated standard deviation;
- for zero data points, the same as above but using zero as the mean.
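The proposed fallback can be sketched in plain Python; `safe_standardize` is a hypothetical helper illustrating the behavior suggested above, not an existing BoTorch function.

```python
import math

def safe_standardize(ys):
    """Hypothetical standardization with the fallbacks proposed above.

    n >= 2: usual (y - mean) / std with the sample standard deviation.
    n == 1: mean is the single point's value, std defaults to 1.
    n == 0: mean defaults to 0, std defaults to 1 (a no-op).
    """
    n = len(ys)
    if n == 0:
        mean, std = 0.0, 1.0
    elif n == 1:
        mean, std = ys[0], 1.0
    else:
        mean = sum(ys) / n
        var = sum((y - mean) ** 2 for y in ys) / (n - 1)
        std = math.sqrt(var) if var > 0 else 1.0
    return [(y - mean) / std for y in ys]

print(safe_standardize([]))     # []
print(safe_standardize([3.0]))  # [0.0]
```

With these defaults, standardizing an empty or single-point training set never produces NaNs, and the transform degrades gracefully to the identity (up to a shift) instead of poisoning the acquisition optimization.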
System information
Please complete the following information:
- BoTorch Version: 0.11.0
- GPyTorch Version: 1.11
- PyTorch Version: 2.3.0
- OS: macOS