KernelFunctions.jl
Sum of independent kernels
Following this Discourse discussion: currently, there is no building block to sum independent kernels, analogous to KernelTensorProduct but with addition instead of multiplication:
For inputs $x = (x_1,\dots,x_n)$ and $x' = (x_1',\dots,x_n')$, the independent sum of kernels $k_1, \dots, k_n$:
$$ k(x, x'; k_1, \dots, k_n) = \sum_{i=1}^n k_i(x_i, x_i') $$
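For example, with $n = 2$, a (unit-lengthscale) squared exponential kernel on the first dimension and an exponential kernel on the second, this would give

$$ k(x, x') = \exp\left(-\frac{(x_1 - x_1')^2}{2}\right) + \exp\left(-\lvert x_2 - x_2' \rvert\right) $$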
It sounds like a reasonable composite to add, especially since the alternative using SelectTransform is pretty ugly (see the sketch below). Is there a standardized name for this kind of kernel? KernelTensorSum? KernelDimensionwiseSum?
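For reference, the SelectTransform workaround I mean looks roughly like this (a sketch; I am assuming the usual kernel-with-transform composition via ∘ and kernel addition):

```julia
using KernelFunctions

# Workaround: restrict each kernel to one input dimension with SelectTransform,
# then add the restricted kernels, which yields a KernelSum.
k1 = SqExponentialKernel() ∘ SelectTransform([1])
k2 = ExponentialKernel() ∘ SelectTransform([2])
k = k1 + k2

x = [0.0, 1.0]
y = [0.5, 2.0]
k(x, y)  # k1 only sees the first component, k2 only the second
```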
I am willing to do a PR, but I'll need some guidance since it is my first contribution. My naive approach would be to create a new kernel (I like the idea of KernelTensorSum) similar to KernelTensorProduct.
What are the requirements for a fully functional kernel that can be used in AbstractGPs? From the documentation I identify the following:
- define struct + constructors (would an abstract type KernelTensor for both KernelTensorProduct and KernelTensorSum make sense?)
- define the kernel evaluation, basically adapting the following method (see the sketch after this list): https://github.com/JuliaGaussianProcesses/KernelFunctions.jl/blob/ef6d4591b36194fca069d8bc7ae8c1e2ee288080/src/kernels/kerneltensorproduct.jl#L52C5-L58
- define (or reuse) a dim method
- define a kernelmatrix method
- pretty printing
- tests
- Am I missing something?
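To make the first two items concrete, here is a minimal sketch of what I have in mind, adapted from KernelTensorProduct (KernelTensorSum is only the proposed name, not existing API):

```julia
using KernelFunctions

# Proposed composite: one kernel per input dimension, combined by addition
# (KernelTensorProduct combines the per-dimension kernels by multiplication).
struct KernelTensorSum{K} <: Kernel
    kernels::K
end

KernelTensorSum(k::Kernel, ks::Kernel...) = KernelTensorSum((k, ks...))

Base.length(k::KernelTensorSum) = length(k.kernels)

# Evaluation: sum the component kernels over the matching input dimensions,
# mirroring the linked KernelTensorProduct method.
function (k::KernelTensorSum)(x, y)
    length(x) == length(y) == length(k) ||
        throw(DimensionMismatch("number of kernels and number of features are not consistent"))
    return sum(ki(xi, yi) for (ki, xi, yi) in zip(k.kernels, x, y))
end
```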
Thanks for contributing, I think you've got it all covered!
I would not build a KernelTensor abstraction as I don't think we would get much out of it. As for the name, we can still change it during the PR review if other arguments come up.
> What are the requirements for a fully functional kernel that can be used in AbstractGPs?
I guess you can mainly copy KernelTensorProduct and replace multiplication with addition.
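For instance, kernelmatrix could follow the same pattern, combining the per-dimension kernel matrices with + instead of an elementwise product. A sketch building on the struct above (it assumes ColVecs input; a real PR would also need to handle RowVecs and vectors of vectors):

```julia
# Sum the kernel matrices of the individual kernels, each evaluated on its own
# input dimension (columns of x.X are observations, rows are features).
function KernelFunctions.kernelmatrix(k::KernelTensorSum, x::ColVecs)
    return mapreduce(+, enumerate(k.kernels)) do (i, ki)
        kernelmatrix(ki, x.X[i, :])
    end
end
```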
> would an abstract type KernelTensor for both KernelTensorProduct and KernelTensorSum make sense?
Not in an initial version IMO (and maybe not at all). I would add a separate type, similar to how we distinguish between KernelSum and KernelProduct.
One technical question: is it always the case that when you evaluate a kernel at the same input, the correlation should be 1?
```julia
julia> k = SqExponentialKernel();

julia> x = 0;

julia> k(x,x)
1.0
```
Independently adding the kernels results in the following behavior:
```julia
julia> k1 = SqExponentialKernel();

julia> k2 = ExponentialKernel();

julia> k = KernelTensorSum(k1, k2)
Tensor sum of 2 kernels:
	Squared Exponential Kernel (metric = Distances.Euclidean(0.0))
	Exponential Kernel (metric = Distances.Euclidean(0.0))

julia> x = zeros(2);

julia> k(x,x)
2.0
```
So, should the kernel take the mean instead of the sum so that the correlation is normalized?
For inputs $x = (x_1,\dots,x_n)$ and $x' = (x_1',\dots,x_n')$, the independent sum of kernels $k_1, \dots, k_n$ would then become
$$ k(x, x'; k_1, \dots, k_n) = \frac{1}{n} \sum_{i=1}^n k_i(x_i, x_i') $$
No, it does not have to be! I would not "normalize" because that might be unexpected from the user's side. The scaling should be dealt with by each kernel individually.
Of course... Simply scaling a kernel would also mean k(x,x) != 1.0
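To illustrate both points with existing API (ScaledKernel via * already exists; KernelTensorSum is the type proposed above): a single scaled kernel already has k(x,x) != 1.0, and a user who wants a unit diagonal can scale each component kernel by 1/n themselves:

```julia
julia> (0.5 * SqExponentialKernel())(0, 0)  # a scaled kernel alone is not 1 on the diagonal
0.5

julia> k = KernelTensorSum(0.5 * SqExponentialKernel(), 0.5 * ExponentialKernel());

julia> k(zeros(2), zeros(2))                # per-kernel scaling recovers a unit diagonal
1.0
```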