
Issues writing convolutional kernel

Open tmuntianu opened this issue 4 years ago • 6 comments

Hi, I realize this question might be a bit basic, but I'm trying to implement a convolutional kernel as described here on page 5, largely along the lines of the GPFlow implementation.

I've got the patch extraction working in PyTorch, but am running into an issue with the RBFKernel. I have x1 and x2 with dimensions N x P x patch_len, where N is the batch size, P is the number of patches (basically a second batch dimension), and patch_len is the length of an individual patch. I need a covariance matrix of dimension N x P x N x P as output from RBFKernel, but I haven't managed to get this behavior working. I've tried passing multiple batch dimensions, using a MultitaskKernel, etc., but nothing has worked. I could just reimplement the following GPflow function in PyTorch:

import tensorflow as tf
# broadcasting_elementwise is a GPflow helper that broadcast-adds Xs and X2s
# over their leading dimensions.

def square_distance(X, X2):
    """
    Returns ||X - X2ᵀ||².

    Due to the implementation and floating-point imprecision, the
    result may actually be very slightly negative for entries very
    close to each other.

    This function can deal with leading dimensions in X and X2.
    In the simple case, where X and X2 are both 2-dimensional,
    for example, X is [N, D] and X2 is [M, D], then a tensor of shape
    [N, M] is returned. If X is [N1, S1, D] and X2 is [N2, S2, D],
    then the output will be [N1, S1, N2, S2].
    """
    if X2 is None:
        Xs = tf.reduce_sum(tf.square(X), axis=-1, keepdims=True)
        dist = -2 * tf.matmul(X, X, transpose_b=True)
        dist += Xs + tf.linalg.adjoint(Xs)
        return dist
    Xs = tf.reduce_sum(tf.square(X), axis=-1)
    X2s = tf.reduce_sum(tf.square(X2), axis=-1)
    dist = -2 * tf.tensordot(X, X2, [[-1], [-1]])
    dist += broadcasting_elementwise(tf.add, Xs, X2s)
    return dist
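
A direct PyTorch translation would look roughly like this (just a sketch; square_distance_torch is an illustrative name, not an existing function):

import torch

def square_distance_torch(X, X2=None):
    # Same idea as the GPflow helper above: pairwise squared distances over
    # the last dimension, broadcasting over all leading (batch) dimensions.
    if X2 is None:
        Xs = X.pow(2).sum(dim=-1, keepdim=True)     # [..., N, 1]
        dist = -2 * X @ X.transpose(-2, -1)         # [..., N, N]
        dist += Xs + Xs.transpose(-2, -1)
        return dist
    Xs = X.pow(2).sum(dim=-1)                       # e.g. [N1, S1]
    X2s = X2.pow(2).sum(dim=-1)                     # e.g. [N2, S2]
    # contract the last dims: [N1, S1, D] and [N2, S2, D] -> [N1, S1, N2, S2]
    dist = -2 * torch.tensordot(X, X2, dims=([X.dim() - 1], [X2.dim() - 1]))
    # broadcast-add the squared norms over all leading dimensions
    dist += Xs.reshape(Xs.shape + (1,) * X2s.dim()) + X2s
    return dist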

But I would rather use the existing RBFKernel.

Is there a way to use the existing RBFKernel, or should I write this function myself?

tmuntianu · Aug 15 '20 22:08

@tmuntianu Just to confirm, what you basically want is a matrix D so that D[i, j, k, p] gives sq_dist(x1[i, j, :], x2[k, p, :]), right? Or I guess alternatively, the full kernel matrix so that K[i, j, k, p] = exp(-D[i, j, k, p] / σ)?

jacobrgardner · Aug 17 '20 17:08

If so, and you do indeed want N and P to both be separate batch dimensions, RBFKernel can provide this behavior, although the interpretation of the output is a little strange (e.g., you're going to have an N x P x N x P set of 1 x 1 kernel matrices):

# suppose N = 2, P = 3, patch_len = 5
# Idea: use singleton batch dimensions wherever we want broadcasting.
import torch
from gpytorch.kernels import RBFKernel

kern = RBFKernel(batch_shape=torch.Size([2, 3, 2, 3]))
x1 = torch.randn(2, 3, 1, 1, 1, 5)  # N x P x 1 x 1 x num_data x patch_len (num_data = 1 here)
x2 = torch.randn(1, 1, 2, 3, 1, 5)  # 1 x 1 x N x P x num_data x patch_len
output = kern(x1, x2)  # output will be N x P x N x P x 1 x 1
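
A quick way to sanity-check the shapes (a usage sketch; evaluate() materializes the lazy kernel tensor into an ordinary torch.Tensor):

K = output.evaluate()   # materialize the lazy kernel tensor
print(K.shape)          # torch.Size([2, 3, 2, 3, 1, 1]), i.e. N x P x N x P x 1 x 1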

Is this what you are looking for?

jacobrgardner · Aug 17 '20 17:08

Yup, exactly what I was looking for! Thanks so much! Didn't realize you could broadcast over batch dimensions like that. I only have one quick follow-up question: is it possible to easily strip away the extra dimensions with the LazyTensor API, or should I call .evaluate() and operate on that instead?

tmuntianu · Aug 20 '20 22:08

Stripping away batch dimensions for LTs is pretty easy -- standard lt[0, :] etc. indexing should work, as should, for example, lt.squeeze(...) calls. In general, the LT interface tries to mimic the standard torch.Tensor one as closely as possible.
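
For example, with the output from the snippet above (a sketch that relies on the indexing and evaluate() behavior just described):

sub = output[0, 0]                               # still lazy; one (n, p) slice, shape 2 x 3 x 1 x 1
K = output.evaluate().squeeze(-1).squeeze(-1)    # plain tensor, N x P x N x P
K_flat = K.reshape(2 * 3, 2 * 3)                 # collapse to an (N*P) x (N*P) covariance matrix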

jacobrgardner · Aug 21 '20 01:08

I asked because I was getting some errors about expected sizes for the LazyTensors, but I fixed it by subclassing LazyEvaluatedKernelTensor and defining a custom _size method. I just couldn't get it to work by only defining a new num_outputs_per_input.

Thanks again for all your help! I really appreciate it.

tmuntianu · Aug 21 '20 02:08

@tmuntianu did you make the convolutional kernel work? I want to use one and don't want to have to move to GPflow...

Schobs · Nov 04 '22 14:11