
Scalable SM Kernel [Feature Request]

Open wjmaddox opened this issue 5 years ago • 4 comments

🚀 Feature Request

This is somewhere between a bug report and a feature request. I'm attempting to use spectral mixture kernels on reasonably sized data (100 x 100 grids), but it OOMs at test time; this is possibly caching-related.

The feature request would probably be a KeOps implementation of SM kernels, or a more faithful implementation of Kronecker-based inference for GPs.

Motivation

An explicit matrix is being formed in the kernel's forward pass here, which is where the memory issue occurs.
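
For a sense of scale, a rough back-of-the-envelope estimate (my assumption: the forward pass materializes an intermediate of shape (num_mixtures, n, m, d) in float32 before reducing over mixtures and dimensions):

# rough memory estimate for the dense SM forward pass
# (assumes an intermediate of shape (num_mixtures, n, m, d) in float32)
n = m = 100 * 100      # a 100 x 100 grid, flattened
q, d = 20, 2           # num_mixtures and input dims, matching the MWE below
bytes_per_float = 4
print(q * n * m * d * bytes_per_float / 1e9, "GB")  # ~16 GB for a single intermediate

That is more than most single GPUs can hold, before any test-time caching even starts.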

Pitch

I (or @g-benton) am willing to open a PR, but it may take a while - we just want a comparison to #872.

Minimal Working Example

import torch 
import gpytorch
import math

if torch.cuda.is_available():
    torch.set_default_tensor_type(torch.cuda.FloatTensor)

# create the training grid
grid_bounds = [(0, 1), (0, 1)]
grid_size = 70
grid = torch.zeros(grid_size, len(grid_bounds))
for i in range(len(grid_bounds)):
    grid_diff = float(grid_bounds[i][1] - grid_bounds[i][0]) / (grid_size - 2)
    grid[:, i] = torch.linspace(grid_bounds[i][0] - grid_diff, grid_bounds[i][1] + grid_diff, grid_size)

train_x = gpytorch.utils.grid.create_data_from_grid(grid)
train_y = torch.sin((train_x[:, 0] + train_x[:, 1]) * (2 * math.pi)) + torch.randn_like(train_x[:, 0]).mul(0.01)

# set up the model
class GridSM(gpytorch.models.ExactGP):
    def __init__(self, grid, train_x, train_y, likelihood):
        super(GridSM, self).__init__(train_x, train_y, likelihood)
        num_dims = train_x.size(-1)
        self.mean_module = gpytorch.means.ConstantMean()
        self.base = gpytorch.kernels.SpectralMixtureKernel(num_mixtures=20, ard_num_dims=2)
        self.base.initialize_from_data(train_x, train_y)
        self.covar_module = gpytorch.kernels.GridKernel(self.base, grid=grid)
        #self.covar_module = self.base

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = GridSM(grid, train_x, train_y, likelihood)
output = model(train_x)  # prior forward pass on the training data

# set up the testing grid
grid_bounds = [(1, 2), (0, 1)]
grid_size = 50
test_grid = torch.zeros(grid_size, len(grid_bounds))
for i in range(len(grid_bounds)):
    grid_diff = float(grid_bounds[i][1] - grid_bounds[i][0]) / (grid_size - 2)
    test_grid[:, i] = torch.linspace(grid_bounds[i][0] - grid_diff, grid_bounds[i][1] + grid_diff, grid_size)

test_x = gpytorch.utils.grid.create_data_from_grid(test_grid)

# now evaluate
model.eval()
with gpytorch.settings.fast_pred_var(True), gpytorch.settings.skip_posterior_variances(True):
    predictive_dist = model(test_x)
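
For anyone reproducing this, the blow-up can be quantified with torch's CUDA memory stats (optional instrumentation, not part of the model setup):

# optional instrumentation: report peak CUDA memory for the evaluation above
torch.cuda.reset_peak_memory_stats()
with gpytorch.settings.fast_pred_var(True), gpytorch.settings.skip_posterior_variances(True):
    predictive_dist = model(test_x)
print(torch.cuda.max_memory_allocated() / 1e9, "GB peak")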

wjmaddox avatar Sep 23 '19 22:09 wjmaddox

Honestly, a KeOps implementation would probably be the way to go.
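
The existing KeOps wrappers are drop-in replacements for their dense counterparts, so a KeOps SM kernel could presumably slot in the same way. A minimal sketch of that pattern, using the RBF wrapper that already exists (requires the pykeops package; the SM version is the part that would need to be written):

import gpytorch
from gpytorch.kernels.keops import RBFKernel  # needs pykeops installed

# the existing drop-in pattern: swap the dense kernel for its KeOps twin
covar_module = gpytorch.kernels.ScaleKernel(RBFKernel(ard_num_dims=2))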

gpleiss avatar Oct 02 '19 21:10 gpleiss

Is the spectral mixture kernel more memory-consuming? I run into an out-of-memory error with only about 1000 data points and 54 features, whereas an RBF kernel works just fine.

ginward avatar Jan 24 '21 14:01 ginward

Can you explain in a bit more detail?

It doesn't look like the proximal issue of an explicit matrix being formed in the forward pass has been resolved (still here). However, the above code now runs on my GPU thanks to the improvements made to KroneckerProductLazyTensor over the past year or so (I probably should have closed this issue earlier).
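
To sketch why the grid structure helps (a toy check with a generic 1d product kernel, not the SM kernel itself):

import torch

def k(a, b):  # toy 1d squared-exponential kernel
    return torch.exp(-(a.unsqueeze(-1) - b.unsqueeze(-2)) ** 2)

g1 = torch.linspace(0, 1, 70)  # grid points along each dimension
g2 = torch.linspace(0, 1, 70)
K1, K2 = k(g1, g1), k(g2, g2)  # two 70 x 70 factors
K_full = torch.kron(K1, K2)    # 4900 x 4900 -- never needed explicitly

# KroneckerProductLazyTensor works with K1 and K2 directly, so solves and
# log-determinants cost roughly O(n1^3 + n2^3) rather than O((n1 * n2)^3).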

wjmaddox avatar Jan 24 '21 15:01 wjmaddox

Has a PR already been opened for this? If help is needed, I'd be glad to pitch in.

anjawa avatar Mar 22 '22 15:03 anjawa