KernelFunctions.jl
Add `with_period` convenience function and an example to `PeriodicTransform` docstring
This PR follows the suggestion in this comment.
As it is, `with_period` only works with scalars, in contrast to `with_lengthscale`, which uses `ARDTransform` to handle vector inputs. Any ideas on how we should deal with this?
Codecov Report
Merging #401 (928b9ec) into master (33d64d1) will decrease coverage by 10.33%. The diff coverage is 0.00%.
@@             Coverage Diff             @@
##           master     #401       +/-   ##
===========================================
- Coverage   89.23%   78.89%   -10.34%     
===========================================
  Files          52       53        +1     
  Lines        1198     1199        +1     
===========================================
- Hits         1069      946      -123     
- Misses        129      253      +124     
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/transform/periodic_transform.jl | 40.00% <ø> (-10.00%) | :arrow_down: | 
| src/transform/with_period.jl | 0.00% <0.00%> (ø) | |
| src/matrix/kernelkroneckermat.jl | 0.00% <0.00%> (-100.00%) | :arrow_down: | 
| src/approximations/nystrom.jl | 0.00% <0.00%> (-91.90%) | :arrow_down: | 
| src/matrix/kernelpdmat.jl | 0.00% <0.00%> (-72.73%) | :arrow_down: | 
| src/distances/delta.jl | 71.42% <0.00%> (-28.58%) | :arrow_down: | 
| src/chainrules.jl | 51.85% <0.00%> (-24.70%) | :arrow_down: | 
| src/generic.jl | 80.00% <0.00%> (-20.00%) | :arrow_down: | 
| src/distances/dotproduct.jl | 60.00% <0.00%> (-20.00%) | :arrow_down: | 
| src/utils.jl | 64.78% <0.00%> (-19.72%) | :arrow_down: | 
| ... and 6 more | 
Continue to review full report at Codecov.
Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update 33d64d1...928b9ec. Read the comment docs.
One of the motivations of `with_lengthscale` is that it provides a unified API for `ScaleTransform` and `ARDTransform` (although one can save allocations by using `ARDTransform` directly).
I thought that `with_lengthscale` was added because the convention when dealing with kernels is to divide by the lengthscale, while `ScaleTransform` multiplies. The same happens with the period/frequency here. But I agree that adding lots of different ways to achieve the same thing ends up polluting the API.
My motivation for introducing `with_lengthscale` was also to not have to think in terms of inverse lengthscale.
And when constructing a periodic kernel, I'd rather just write `with_period(kernel, period)` than `kernel ∘ PeriodicTransform(1/period)`.
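A hedged sketch of what such a convenience function could look like, mirroring `with_lengthscale` but hiding the `inv(period)`. The types below are toy stand-ins so the snippet is self-contained; the real KernelFunctions.jl types differ:

```julia
# Hypothetical sketch only: toy stand-ins for KernelFunctions.jl types.
struct PeriodicTransform{T}
    freq::T  # frequency = inverse period
end

# x ↦ (sin(2πfx), cos(2πfx)): the ℝ → ℝ² embedding
(t::PeriodicTransform)(x::Real) = (sinpi(2 * t.freq * x), cospi(2 * t.freq * x))

struct TransformedKernel{K,T}
    kernel::K
    transform::T
end

# The convenience function under discussion: hide the inv(period).
with_period(kernel, period::Real) =
    TransformedKernel(kernel, PeriodicTransform(inv(period)))

sqexp(x, y) = exp(-sum(abs2, x .- y) / 2)  # toy squared-exponential base kernel
k = with_period(sqexp, 3.0)
k.transform.freq  # 1/3
```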
add a convenience constructor
PeriodKernel(r::Real)
I'm not quite sure what you mean by that, or how it's different from `with_period`. How would you pass in the base kernel?
What should the meaning of periodicity be in more than one dimension? Coordinate-wise (i.e., equivalent to a product of kernels operating on one of the dimensions each)?
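One way to make the coordinate-wise reading concrete: apply a 1D periodic kernel per coordinate and take the product, which is the same as exponentiating the summed sin² terms. A self-contained numerical check in plain Julia (the helper names are illustrative, not KernelFunctions.jl API):

```julia
# 1D periodic kernel: exp(-2 sin²(π τ / p) / ℓ²) for τ = x - y.
periodic1d(x, y; period=1.0, ℓ=1.0) = exp(-2 * sinpi((x - y) / period)^2 / ℓ^2)

# Product over coordinates = coordinate-wise periodicity in d dimensions.
periodic_prod(x, y; period=1.0, ℓ=1.0) =
    prod(periodic1d(xi, yi; period, ℓ) for (xi, yi) in zip(x, y))

# Equivalently, one exponential of the summed sin² terms, which is what a
# coordinate-wise Sinus-style metric would give:
periodic_sum(x, y; period=1.0, ℓ=1.0) =
    exp(-2 * sum(sinpi((xi - yi) / period)^2 for (xi, yi) in zip(x, y)) / ℓ^2)

x, y = [0.3, 1.7], [0.9, 0.2]
periodic_prod(x, y) ≈ periodic_sum(x, y)  # true, since exp distributes over the sum
```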
So it seems the main motivation is to introduce a different parametrization of `PeriodicTransform`? Then we could just define
function PeriodicTransform(; period=nothing, frequency=period === nothing ? 1.0 : inv(period))
    return PeriodicTransform{typeof(frequency)}(frequency)
end
I am mainly questioning whether it is worth adding a new function `with_period` just to avoid having to write `inv` explicitly. Maybe in general we could add more keyword-argument-based constructors of transforms, similar to the kernels, to make it clearer what the individual arguments mean and to allow different parameterizations.
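For concreteness, a self-contained version of that keyword-based constructor, with a toy struct standing in for the real `PeriodicTransform` (the field name `freq` is an assumption here):

```julia
# Toy stand-in: the real PeriodicTransform lives in KernelFunctions.jl.
struct PeriodicTransform{T}
    freq::T
end

# Keyword-based constructor allowing either parameterization:
function PeriodicTransform(; period=nothing, frequency=period === nothing ? 1.0 : inv(period))
    return PeriodicTransform{typeof(frequency)}(frequency)
end

PeriodicTransform(period=4.0).freq     # 0.25
PeriodicTransform(frequency=2.0).freq  # 2.0
PeriodicTransform().freq               # 1.0 (default)
```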
Hmm, having looked at the code again, I'm wondering what the right way to define periodic kernels actually is. `PeriodicTransform` only works on 1D inputs and maps each `Real` to two values. (So the base kernel needs to handle 2D inputs correctly.) For distance-based kernels, the combination of Euclidean metric & `PeriodicTransform` can be done in closed form; this is effectively what the `Sinus` "distance" does, so in principle we could achieve the same by constructing e.g. `SqExponentialKernel(; metric=Sinus())`, except the `Sinus` metric doesn't do the `sqrt()`. Which is why `PeriodicKernel` is basically the same as `SqExponentialKernel`, but without the square. And then it's kind of confusing that the period is set with the lengthscale transform, and the lengthscale is set as an argument to the period...
That was quite rambly. So in bullet points:
- it'd be great if we could construct all sorts of periodic kernels (based on any kernel supporting a Euclidean metric?) - `PeriodicTransform` does that (I think?)
- it'd be great if we could avoid the intermediate ℝ → ℝ² conversion required by `PeriodicTransform` - `PeriodicKernel` does this (but only for the SqExponential base kernel)
- it'd be great if we could directly specify the period and lengthscale in ways that make sense when just reading code (`PeriodicTransform(; period=...)` is good, `with_lengthscale(PeriodicKernel(), period)` not so good)
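The closed-form reduction described above can be checked numerically: assuming the transform maps x ↦ (sin(2πfx), cos(2πfx)), the squared Euclidean distance after the transform is exactly a sin² distance. Plain Julia, no KernelFunctions.jl types:

```julia
# Check the closed form: with u(x) = (sin(2πf x), cos(2πf x)),
# ‖u(x) - u(y)‖² = 2 - 2cos(2πf(x - y)) = 4 sin²(πf(x - y)),
# so a squared-exponential on the transformed inputs is a kernel in a
# sin² "distance" (up to the sqrt the Sinus metric skips).
f = 0.5  # frequency = 1 / period

u(x) = (sinpi(2f * x), cospi(2f * x))
sqdist(a, b) = sum(abs2, a .- b)

x, y = 0.7, 2.3
sqdist(u(x), u(y)) ≈ 4 * sinpi(f * (x - y))^2  # true
```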
> `PeriodicTransform` only works on 1D inputs, and maps each `Real` to two values.
This is the main reason why I still think that `PeriodicTransform` should map to complex numbers with `cis` or `cispi` (only available in more recent Julia versions). Then the "outer" dimensionality and structure would not be changed.
> This is the main reason why I still think that `PeriodicTransform` should map to complex numbers with `cis` or `cispi` (only available in more recent Julia versions). Then the "outer" dimensionality and structure would not be changed.
Then the base kernel would have to handle complex numbers; how would that affect speed, e.g. when trying to run it on the GPU? It seems like it'd still be more effort than just computing the "equivalent distance" for the base kernel, no?
> Then the base kernel would have to handle complex numbers, how would that affect speed e.g. when trying to run it on the GPU?
`CuArray`s of `Complex{Float32}` should be fine; in memory every element is just like two `Float32`s. Also, many kernels should be able to handle complex inputs, and some others could be fixed easily (e.g., instead of `dot` we would use `realdot` from RealDot.jl for the dot product in the exponentiated kernel).
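A small plain-Julia check of why the complex embedding carries the same geometry as the 2D one; `realdot` below is a one-line scalar stand-in for RealDot.jl's function, and `cispi` needs Julia ≥ 1.6:

```julia
# A cis-based transform x ↦ cis(2πf x) keeps the outer dimensionality: each
# real coordinate becomes one complex number, and |z(x) - z(y)|² equals the
# squared Euclidean distance of the current 2D embedding.
f = 0.5

z(x) = cispi(2f * x)                   # cispi(t) = exp(iπt), Julia ≥ 1.6
u(x) = (sinpi(2f * x), cospi(2f * x))  # current ℝ → ℝ² embedding

x, y = 0.7, 2.3
abs2(z(x) - z(y)) ≈ sum(abs2, u(x) .- u(y))  # true

# Scalar stand-in for RealDot.jl's realdot (realdot(x, y) = real(dot(x, y))):
realdot(a, b) = real(conj(a) * b)
realdot(z(x), z(y)) ≈ sum(u(x) .* u(y))  # true
```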
> It seems like it'd still be more extra effort than just computing the "equivalent distance" for the base kernel, no?
You don't necessarily have to, or want to, use the `PeriodicTransform` in all cases. But it would still make the transform more usable, in my opinion, in cases where you do want to use it, in particular when dealing with multi-dimensional inputs.