
`PeriodicKernel` does not work with AD (see issue for work-around)

Open david-vicente opened this issue 4 years ago • 10 comments

Hi. I noticed that the PeriodicKernel is failing the AD tests, and I would like to implement the Mauna Loa GPR example. I looked for discussion of this in past issues but there isn't much. Is the problem an instance of "it is doable but no one has had the time yet", or is it blocked on something in the AD packages? Thanks!

david-vicente avatar Oct 23 '21 11:10 david-vicente

I just reran the AD tests and it looks like using ForwardDiff or ReverseDiff is fine. However, there are still some errors with Zygote. I think #386 should hopefully solve this problem.
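
For reference, here is a minimal way to check this locally (just an illustrative snippet, not the package's test suite): differentiate a kernel evaluation with respect to r with both backends.

using KernelFunctions, ForwardDiff, Zygote

x, y = rand(3), rand(3)
kval(r) = PeriodicKernel(; r=r)(x, y)  # scalar kernel evaluation as a function of r

ForwardDiff.gradient(kval, [0.5, 1.0, 1.5])  # works
Zygote.gradient(kval, [0.5, 1.0, 1.5])       # errors at the time of writing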

Note that our PeriodicKernel is only equal to the ExpSineSquared kernel from scikit-learn in one dimension:

KernelFunctions.jl: k(x, x') = exp(-0.5 * sum_i (sin(pi * (x_i - x'_i)) / r_i)^2)

vs

scikit-learn: k(x, x') = exp(-2 * sin(pi * ||x - x'|| / p)^2 / l^2)
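
As an informal sanity check of the 1D case (the mapping l = 2r with period p = 1 is just read off the two formulas above, not an official API):

using KernelFunctions

x, y = rand(1), rand(1)
r = 0.8
k_kf = PeriodicKernel(; r=[r])(x, y)                   # KernelFunctions.jl PeriodicKernel in 1D
k_sk = exp(-2 * sin(pi * (x[1] - y[1]))^2 / (2r)^2)    # scikit-learn formula with p = 1, l = 2r
k_kf ≈ k_sk                                            # true up to floating-point error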

theogf avatar Oct 25 '21 08:10 theogf

Why isn't there a period length parameter as in gpytorch?

david-vicente avatar Oct 25 '21 12:10 david-vicente

Because this would just be the lengthscale. You could achieve the same result with:

p = [1.0, 2.0, 1.5]
k = with_lengthscale(PeriodicKernel(;r=rand(3)), p)

where p is the same as in the gpytorch definition

The r here corresponds to the sqrt(lambda) in gpytorch

theogf avatar Oct 25 '21 14:10 theogf

Oh, I was independently trying to build the Mauna Loa example (I put the work-in-progress here: https://github.com/JuliaGaussianProcesses/EcosystemIntegrationTesting/blob/main/api_testing/mauna_loa_example/script.jl as a Pluto.jl notebook) and ran into the same bug.

Here's code to reproduce the bug:

using AbstractGPs
using ParameterHandling
# Build a GP with a periodic kernel; th has fields s (signal scale), l (lengthscale), p (period).
function build_gp(th)
    return GP(th.s^2 * with_lengthscale(PeriodicKernel(; r=[th.l / 2]), th.p))
end
th = (; s=positive(exp(0.0)), l=positive(exp(1.0)), p=positive(exp(0.0)))
thflat, unflatten = ParameterHandling.value_flatten(th)
x = rand(5)
y = rand(5)
function loss(th)
    f = build_gp(th)
    return -logpdf(f(x, 0.01), y)
end
loss_packed = loss ∘ unflatten
using Zygote
Zygote.gradient(loss_packed, thflat)  # fails with PeriodicKernel

st-- avatar Nov 05 '21 13:11 st--

@david-vicente the following two lines give equivalent kernels (in 1D):

k1 = with_lengthscale(PeriodicKernel(; r=[wiggle_scale / 2]), period)
k2 = with_lengthscale(SqExponentialKernel(), wiggle_scale) ∘ PeriodicTransform(1/period)

and we might want to provide with_period(kernel, period) = kernel ∘ PeriodicTransform(1/period). AutoDiff works fine for the PeriodicTransform. Using the second form, you can also swap out the base kernel, e.g. for a Matern32Kernel. :)

(Note that wiggle_scale is relative to a period, not relative to the actual input scale!)
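
For what it's worth, a quick sanity check of that equivalence, together with the suggested with_period helper (not part of the package, just defined inline here):

using KernelFunctions

with_period(kernel, period) = kernel ∘ PeriodicTransform(1/period)

wiggle_scale, period = 0.7, 2.5
k1 = with_lengthscale(PeriodicKernel(; r=[wiggle_scale / 2]), period)
k2 = with_period(with_lengthscale(SqExponentialKernel(), wiggle_scale), period)
k3 = with_period(with_lengthscale(Matern32Kernel(), wiggle_scale), period)  # same recipe, different base kernel

x, y = rand(), rand()
k1(x, y) ≈ k2(x, y)  # true (in 1D)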

st-- avatar Nov 05 '21 14:11 st--

Thank you @st--. I did look at the PeriodicTransform at the time, but from the documentation it wasn't obvious how to use it to achieve equivalent kernels. Maybe we should add

k1 = with_lengthscale(PeriodicKernel(; r=[wiggle_scale / 2]), period)
k2 = with_lengthscale(SqExponentialKernel(), wiggle_scale) ∘ PeriodicTransform(1/period)

to its documentation?

david-vicente avatar Nov 05 '21 14:11 david-vicente

@david-vicente I've just added the finished example to AbstractGPs: https://github.com/JuliaGaussianProcesses/AbstractGPs.jl/pull/240

Yes it'd be great to improve the documentation. Would you be up for making the change and opening a PR? :)

st-- avatar Nov 05 '21 14:11 st--

You could also add the with_period helper I suggested, if you wanted :)

st-- avatar Nov 05 '21 14:11 st--

Can we do the same for multi-dimensional inputs?

david-vicente avatar Nov 05 '21 21:11 david-vicente

Can we do the same for multi-dimensional inputs?

It would work in exactly the same way if we made PeriodicTransform output complex numbers (using cispi, or cis on older Julia versions).
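
To spell out why: for each coordinate, abs2(cispi(2f*x_i) - cispi(2f*y_i)) = 2 - 2cos(2πf(x_i - y_i)) = 4 sin(πf(x_i - y_i))^2, which is exactly the per-dimension periodic distance that the sin/cos pair produces in 1D. A one-line check with Base Julia only (illustrative values):

f, x, y = 0.4, 1.3, 2.9
abs2(cispi(2f * x) - cispi(2f * y)) ≈ 4 * sinpi(f * (x - y))^2  # true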

devmotion avatar Nov 09 '21 16:11 devmotion

Fixed by #531

simsurace avatar Feb 10 '24 14:02 simsurace