Stheno.jl icon indicating copy to clipboard operation
Stheno.jl copied to clipboard

Numerical issues

Open MikeInnes opened this issue 5 years ago • 3 comments

Stheno is great and I'm enjoying it a lot so far.

In some cases I've found that it's easy to run into numerical issues when getting the logpdf, when the distance between x values is smaller than the length scale:

julia> using Stheno, LinearAlgebra

julia> f = GP(EQ(), GPC())
GP{Stheno.ZeroMean{Float64},EQ}(Stheno.ZeroMean{Float64}(), EQ(), 1, GPC(1))

julia> x = range(0, 1, length = 15)
0.0:0.02040816326530612:1.0

julia> cholesky(cov(f(x)))
ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed.

AFAIU the covariance matrix should be positive definite by construction since the GP kernel is. It seems like the problem may be that in this case the eigenvalues are too small – not actually zero, but small enough to confuse the cholesky routine. (Does that sound right?)

Is there any more direct or numerically stable way to get the GP's logpdf that could avoid this issue?

MikeInnes avatar Nov 20 '20 16:11 MikeInnes

Glad to hear that you're enjoying it Mike :)

It seems like the problem may be that in this case the eigenvalues are too small – not actually zero, but small enough to confuse the cholesky routine. (Does that sound right?)

This diagnosis is exactly right -- you wind up with numerically negative eigenvalues, and the Cholesky factorisation breaks down.

The standard thing to do is add a small amount of "jitter" to the covariance matrix C. So instead of computing

cholesky(C)

you compute

cholesky(C + 1e-9I)

or something like that.

This is sufficiently common that I made it possible to do this really easily in Stheno. You should interpret

f(x, 1e-9)

as "the multivariate Normal given by the sum of f at x and iid Gaussian noise with variance 1e-9". In my experience, it's fine to do this. So

cholesky(cov(f(x, 1e-9)))

should work just fine (hopefully).

willtebbutt avatar Nov 20 '20 16:11 willtebbutt

Aaaah, ok. I've actually used the additional noise term in a bunch of places, but hadn't at all connected this numerical issue to the lack of a noise term. You've saved me from a real rabbit hole with your quick response, thank you!

Feel free to close this, or alternatively we can consider it a docs issue (I might have some time to add a "troubleshooting" section at some point, which might be useful for beginners like myself).

MikeInnes avatar Nov 20 '20 17:11 MikeInnes

A docs issue would be an excellent idea, particularly if you have time to write it! Let's keep it open.

willtebbutt avatar Nov 20 '20 17:11 willtebbutt