
Unnecessary computation for `FiniteGP`

Open cscherrer opened this issue 4 years ago • 15 comments

Say you're using GPs in a PPL. The setup might be something like

julia> using AbstractGPs

julia> X = randn(20, 1);

julia> ydist = GP(Matern32Kernel())(RowVecs(X));

Then in Soss or Turing, you might have y ~ ydist.
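For concreteness, a Turing version might look like this (just a sketch; the scalar mean parameter and the 0.1 noise level are illustrative):

using Turing, AbstractGPs

@model function gp_model(X, y)
    α ~ Normal()  # scalar mean offset, so there's something to infer
    # the covariance never changes across iterations, yet logpdf
    # refactorizes it on every model evaluation
    y ~ GP(α, Matern32Kernel())(RowVecs(X), 0.1)
end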

Digging into the AbstractGPs code, it looks like both rand and logpdf end up doing

m, C_mat = mean_and_cov(ydist)  # `ydist` is called `f` in the AbstractGPs source
C = cholesky(_symmetric(C_mat))

Since none of this depends on the actual data, it could be computed once and reused many times during inference. As it is, this adds lots of overhead at each iteration. Is there a reason not to do it this way?
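For comparison, here's the kind of reuse I have in mind (a rough sketch, building the MvNormal by hand):

julia> using Distributions, PDMats, LinearAlgebra

julia> m, C_mat = mean_and_cov(ydist);

julia> cached = MvNormal(m, PDMat(cholesky(Symmetric(C_mat))));  # factorize once

julia> y = rand(cached);

julia> logpdf(cached, y)  # every later call reuses the cached Cholesky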

cscherrer avatar Oct 20 '21 12:10 cscherrer

It's not necessary to compute it if you don't call rand or logpdf :shrug:

One approach that doesn't touch the structure of FiniteGP would be to add a method MvNormal(::FiniteGP) that returns the corresponding MvNormal distribution, which could then be used instead of FiniteGP if you don't want to recompute the Cholesky decomposition, e.g., if you call rand or logpdf multiple times.
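Something along these lines (an untested sketch; PDMat from PDMats.jl can be built directly from a Cholesky):

using AbstractGPs, Distributions, PDMats, LinearAlgebra

function Distributions.MvNormal(fx::AbstractGPs.FiniteGP)
    m, C_mat = mean_and_cov(fx)
    # factorize once; the resulting MvNormal carries the PDMat around
    return MvNormal(m, PDMat(cholesky(Symmetric(C_mat))))
end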

devmotion avatar Oct 20 '21 12:10 devmotion

> It's not necessary to compute it if you don't call rand or logpdf :shrug:
>
> One approach that doesn't touch the structure of FiniteGP would be to add a method MvNormal(::FiniteGP) that returns the corresponding MvNormal distribution, which could then be used instead of FiniteGP if you don't want to recompute the Cholesky decomposition, e.g., if you call rand or logpdf multiple times.

Ooh, I like that. Maybe we have a @require in MeasureTheory that adds the MvNormal method. Then we can use the faster representation too, should work out great!

cscherrer avatar Oct 20 '21 12:10 cscherrer

> It's not necessary to compute it if you don't call rand or logpdf :shrug:

Just to be sure I understand, what else would you do with a FiniteGP? I can think of kernel composition, but doesn't that come before this? Maybe there are other operations?

cscherrer avatar Oct 20 '21 12:10 cscherrer

E.g., it is not required if you compute the marginals, or more generally just the mean and covariance matrix.
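For example, with the ydist from above:

julia> marginals(ydist)  # Vector of Normal; only means and variances, no Cholesky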

> Maybe we have a @require in MeasureTheory that adds the MvNormal method

I don't think this should be done; it would be type piracy (and would use Requires...). It belongs in AbstractGPs if we think it is a good idea.

devmotion avatar Oct 20 '21 12:10 devmotion

Is it clear that I'm talking about MeasureTheory.MvNormal? Not seeing the type piracy here. Though having AbstractGPs use MeasureTheory would be great if you're open to that

cscherrer avatar Oct 20 '21 12:10 cscherrer

Haha, my suggestion was Distributions.MvNormal since FiniteGP is already a subtype of Distributions.AbstractMvNormal :smile:

devmotion avatar Oct 20 '21 13:10 devmotion

> Just to be sure I understand, what else would you do with a FiniteGP? I can think of kernel composition, but doesn't that come before this? Maybe there are other operations?

There are a number of scenarios in which logpdf, rand, posterior, etc. can be computed without using the covariance matrix directly. For example, TemporalGPs.jl never computes any covariance matrices, but implements logpdf, rand, posterior, etc.

willtebbutt avatar Oct 20 '21 13:10 willtebbutt

> Haha, my suggestion was Distributions.MvNormal since FiniteGP is already a subtype of Distributions.AbstractMvNormal :smile:

Right, I'd want to get to the fast implementation in MultivariateMeasures.jl, and avoid type piracy :)

But come to think of it, maybe it should just be MeasureTheory.MvNormal(::Distributions.AbstractMvNormal). Then I wouldn't need Requires.

BTW, is there a way to avoid using PDMats? I'd rather allow positive semidefinite covariance matrices.

> There are a number of scenarios in which logpdf, rand, posterior, etc. can be computed without using the covariance matrix directly. For example, TemporalGPs.jl never computes any covariance matrices, but implements logpdf, rand, posterior, etc.

Interesting, does it need to avoid using FiniteGP to do that?

cscherrer avatar Oct 20 '21 13:10 cscherrer

> BTW, is there a way to avoid using PDMats? I'd rather allow positive semidefinite covariance matrices.

You can use PSDMat in https://github.com/invenia/PDMatsExtras.jl.
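E.g. (if I remember the constructor correctly; it takes the matrix directly and keeps a pivoted Cholesky internally):

julia> using PDMatsExtras

julia> P = PSDMat(C_mat);  # C_mat may be merely positive semidefinite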

devmotion avatar Oct 20 '21 13:10 devmotion

> Interesting, does it need to avoid using FiniteGP to do that?

Nah, it has precisely the same API as usual (a big motivation for the design). We just write specialisations like

logpdf(::FiniteGP{<:MySpecialGPType}, ::AbstractVector{<:Real})
rand(::AbstractRNG, ::FiniteGP{<:MySpecialGPType})
posterior(::FiniteGP{<:MySpecialGPType}, ::AbstractVector{<:Real})

etc.
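As a purely illustrative skeleton (WrappedGP is made up and just delegates; a real special type would exploit structure instead of forming the dense covariance, and I'm assuming the FiniteGP field names f/x/Σy):

using AbstractGPs, Distributions

struct WrappedGP{T<:AbstractGPs.AbstractGP} <: AbstractGPs.AbstractGP
    inner::T
end

function Distributions.logpdf(fx::AbstractGPs.FiniteGP{<:WrappedGP}, y::AbstractVector{<:Real})
    # a structure-exploiting computation would go here
    return logpdf(AbstractGPs.FiniteGP(fx.f.inner, fx.x, fx.Σy), y)
end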

willtebbutt avatar Oct 20 '21 15:10 willtebbutt

That makes sense, thanks. The "abstract" part of AbstractGPs wasn't really clear to me - how to overload, etc :)

cscherrer avatar Oct 20 '21 15:10 cscherrer

@willtebbutt wrote a really good introduction to the design in the docs, but we should probably point to it much more in the README.

theogf avatar Oct 20 '21 15:10 theogf

I think I had seen that before, but it's been a while. I should definitely have another look

cscherrer avatar Oct 20 '21 15:10 cscherrer

In any case, should we add a convert(Distributions.MvNormal, ::FiniteGP)? I'd be happy with that :)
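Usage would then be (once such a method exists):

julia> dist = convert(Distributions.MvNormal, ydist);  # factorizes once

julia> logpdf(dist, y)  # later calls reuse the factorization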

st-- avatar Oct 27 '21 13:10 st--

Oh, I completely forgot that I had made a draft locally during the discussion last week. I'll open a PR.

devmotion avatar Oct 27 '21 22:10 devmotion

I believe that this is basically sorted, so am going to close. Please re-open if you feel otherwise.

willtebbutt avatar Sep 15 '23 19:09 willtebbutt