KernelFunctions.jl icon indicating copy to clipboard operation
KernelFunctions.jl copied to clipboard

Add `with_period` convenience function and a example to `PeriodicTransform` docstring

Open david-vicente opened this issue 3 years ago • 9 comments

This PR follows the suggestion in this comment As it is with_period only works with scalars, which contrasts with with_lengthscale that uses ARDTransform to deal with vector inputs. Any ideas on how should we deal with this?

david-vicente avatar Nov 06 '21 00:11 david-vicente

Codecov Report

Merging #401 (928b9ec) into master (33d64d1) will decrease coverage by 10.33%. The diff coverage is 0.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master     #401       +/-   ##
===========================================
- Coverage   89.23%   78.89%   -10.34%     
===========================================
  Files          52       53        +1     
  Lines        1198     1199        +1     
===========================================
- Hits         1069      946      -123     
- Misses        129      253      +124     
Impacted Files Coverage Δ
src/transform/periodic_transform.jl 40.00% <ø> (-10.00%) :arrow_down:
src/transform/with_period.jl 0.00% <0.00%> (ø)
src/matrix/kernelkroneckermat.jl 0.00% <0.00%> (-100.00%) :arrow_down:
src/approximations/nystrom.jl 0.00% <0.00%> (-91.90%) :arrow_down:
src/matrix/kernelpdmat.jl 0.00% <0.00%> (-72.73%) :arrow_down:
src/distances/delta.jl 71.42% <0.00%> (-28.58%) :arrow_down:
src/chainrules.jl 51.85% <0.00%> (-24.70%) :arrow_down:
src/generic.jl 80.00% <0.00%> (-20.00%) :arrow_down:
src/distances/dotproduct.jl 60.00% <0.00%> (-20.00%) :arrow_down:
src/utils.jl 64.78% <0.00%> (-19.72%) :arrow_down:
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 33d64d1...928b9ec. Read the comment docs.

codecov[bot] avatar Nov 06 '21 00:11 codecov[bot]

One of the motivations of with_lengthscale is that it provides a unified API for ScaleTransform and ARDTransform (although one can save allocations by using ARDTransform directly).

I thought that with_lengthscale was added because the convention when dealing with kernels is dividing the lengscale and ScaleTransform is a multiplication. The same happens with the period/frequency here. But I agree that adding a lot of different ways to achieve the same things ends up polluting the API.

david-vicente avatar Nov 08 '21 11:11 david-vicente

My motivation for introducing with_lengthscale was also to not have to think in terms of inverse lengthscale. And when constructing a periodic kernel, I'd rather just write with_period(kernel, period) than kernel \circ<tab> PeriodicTransform(1/period).

add a convenience constructor PeriodKernel(r::Real)

I'm not quite sure what you mean by that, and how it's different from with_period? how would you pass in the base kernel?

What should the meaning of periodicity be in more than one dimension? Coordinate-wise (i.e., equivalent to a product of kernels operating on one of the dimensions each)?

st-- avatar Nov 09 '21 08:11 st--

So it seems the main motivation is to introduce a different parametrization of PeriodicTransform? Then we could just define

function PeriodicTransform(; period=nothing, frequency=period === nothing ? 1.0 : inv(period))
    return PeriodicTransform{typeof(frequency)}(frequency)
end

I am mainly questioning if it is worth adding a new function with_period just to avoid having to write inv explicitly. Maybe in general we could add more keyword argument based constructors of transforms, similar to the kernels, to make it clearer what the individual arguments mean and to allow different parameterizations.

devmotion avatar Nov 09 '21 08:11 devmotion

Hmm, having looked at the code again, I'm wondering about what the right way for defining periodic kernels actually is. PeriodicTransform only works on 1D inputs, and maps each Real to two values. (So the base kernel needs to handle 2D inputs correctly.) For distance-based kernels, the combination of Euclidean metric & PeriodicTransform can be done in closed form - this is effectively what the Sinus "distance" does, so in principle we could achieve the same by constructing e.g. SqExponentialKernel(; metric=Sinus()) except the Sinus metric doesn't do the sqrt(). Which is why PeriodicKernel is basically same as SqExponential, but without the square. And then it's kinda confusing that the period is set with the lengthscale transform, and the lengthscale is set as an argument to the period...

st-- avatar Nov 09 '21 08:11 st--

That was quite rambly. So in bullet points:

  • it'd be great if we could construct all sorts of periodic kernels (based on any kernel supporting an Euclidean metric?) - PeriodicTransform does that (I think?)
  • it'd be great if we could avoid the intermediate ℝ → ℝ ² conversion required by PeriodicTransform - PeriodicKernel does this (but only for SqExponential base kernel)
  • it'd be great if we could directly specify the period and lengthscale in ways that make sense when just reading code (PeriodicTransform(; period=...) is good, with_lengthscale(PeriodicKernel(), period) not so good)

st-- avatar Nov 09 '21 08:11 st--

PeriodicTransform only works on 1D inputs, and maps each Real to two values.

This is the main reason why I still think that PeriodicTransform should map to complex numbers with cis or cispi (only available in more recent Julia versions). Then the "outer" dimensionality and structure would not be changed.

devmotion avatar Nov 09 '21 09:11 devmotion

This is the main reason why I still think that PeriodicTransform should map to complex numbers with cis or cispi (only available in more recent Julia versions). Then the "outer" dimensionality and structure would not be changed.

Then the base kernel would have to handle complex numbers, how would that affect speed e.g. when trying to run it on the GPU? It seems like it'd still be more extra effort than just computing the "equivalent distance" for the base kernel, no?

st-- avatar Nov 09 '21 15:11 st--

Then the base kernel would have to handle complex numbers, how would that affect speed e.g. when trying to run it on the GPU?

CuArrays of Complex{Float32} should be fine, in the memory every element is just like two Float32. Also many kernels should be able to handle complex inputs, and some others could be fixed easily (e.g., instead of dot we would use realdot from RealDot.jl for the dot product in the exponentiated kernel).

It seems like it'd still be more extra effort than just computing the "equivalent distance" for the base kernel, no?

You don't necessarily have to or want to use the PeriodicTransform in all cases. It still would make it more usable, in my opinion, in cases where you want to use the transform, in particular when dealing with multi-dimensional inputs.

devmotion avatar Nov 09 '21 15:11 devmotion