AbstractGPs.jl
Test VFE with the naive implementation
Summary
We currently lack any test confirming that the predictive distribution matches what is prescribed by the original papers:
- VFE: M. K. Titsias. "Variational learning of inducing variables in sparse Gaussian processes". In: Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics. 2009.
- DTC: M. Seeger, C. K. I. Williams and N. D. Lawrence. "Fast Forward Selection to Speed Up Sparse Gaussian Process Regression". In: Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics. 2003.
This is an attempt at fixing that.
The predictive distribution for both DTC and VFE should be the same -- the projected process (PP) one? This PR currently checks it against the DTC predictive distribution defined in Quinonero-Candela's Eq. 20b.
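For reference, my transcription of that predictive distribution (Quinonero-Candela & Rasmussen 2005, Eq. 20b; please double-check against the paper):

$$
q(\mathbf{f}_* \mid \mathbf{y}) = \mathcal{N}\!\left(\sigma^{-2} K_{*u} \Sigma K_{uf}\, \mathbf{y},\; K_{**} - Q_{**} + K_{*u} \Sigma K_{u*}\right),
$$

where $\Sigma = \left(\sigma^{-2} K_{uf} K_{fu} + K_{uu}\right)^{-1}$ and $Q_{**} = K_{*u} K_{uu}^{-1} K_{u*}$.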
With these tests, we can identify a subjectively small but consistent discrepancy between my computationally naive test implementation and the package's implementation. I'm not sure whether it is a bug in the test.
```julia
julia> inv(LinearAlgebra.Diagonal(1e-12 * ones(5))) *
           kernelmatrix(k, x_test, u) *
           Σ(x, u) *
           kernelmatrix(k, u, x) *
           y
5-element Vector{Float64}:
  2.2822711905989306
  1.619419315942234
  2.4886061448883043
  0.13716866592873223
 -1.4776904851124237
```
```julia
julia> mean(f_approx_post, x_test)
5-element Vector{Float64}:
  2.2632581110918366
  1.620565927921467
  2.540609342124296
  0.16377806765055059
 -1.6899264788845625
```
```julia
julia> kernelmatrix(k, x_test, x_test) - q(x_test, x_test) +
           kernelmatrix(k, x_test, u) * Σ(x, u) * transpose(kernelmatrix(k, x_test, u))
5×5 Matrix{Float64}:
  0.148276     0.0509091   -0.00913296  -0.045995    0.0149131
  0.0506096    0.0490361   -0.00308661  -0.0382136   0.0114484
 -0.00913856  -0.00307898   0.0389262    0.0023531  -0.000731468
 -0.046091    -0.038105     0.00236565   0.131079   -0.058772
  0.0148782    0.0114382   -0.00071384  -0.0587383   0.308568
```
```julia
julia> cov(f_approx_post, x_test)
5×5 Matrix{Float64}:
  0.150214     0.0511708   -0.00932485   -0.0470627   0.0133652
  0.0511708    0.0493259   -0.00309447   -0.0388128   0.0106505
 -0.00932485  -0.00309447   0.0389497     0.00241128 -0.000594857
 -0.0470627   -0.0388128    0.00241128    0.132985   -0.0552457
  0.0133652    0.0106505   -0.000594857  -0.0552457   0.300714
```
I'm pretty sure that your mean calculation doesn't take into account that the prior has a non-zero mean. I think you need something like
```julia
@test map(sin, x_test) + inv(LinearAlgebra.Diagonal(1e-12 * ones(5))) *
          kernelmatrix(k, x_test, u) *
          Σ(x, u) *
          kernelmatrix(k, u, x) *
          (y - map(sin, x)) ≈ mean(f_approx_post, x_test)
```
instead.
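In equation form, with prior mean $m(\cdot)$ (here $\sin$), the suggested predictive mean is

$$
m(\mathbf{x}_*) + \sigma^{-2} K_{*u} \Sigma K_{uf} \left(\mathbf{y} - m(\mathbf{x})\right)
$$

rather than $\sigma^{-2} K_{*u} \Sigma K_{uf}\, \mathbf{y}$.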
Regarding the covariance, note that the predictive covariances at locations other than the pseudo-inputs differ between the VFE and DTC.
Could you check that your expressions agree with what the package currently does when x_test = z?
I think they should also agree if you swap out VFE for DTC in the tests.
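One way to sanity-check the Eq. 20b algebra outside the package is the limiting case where the inducing inputs equal the training inputs, in which the naive DTC predictive should collapse to the exact GP posterior. A minimal NumPy sketch of that check (illustrative only — the kernel, data, and variable names here are my own, not the PR's Julia code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative only; names are not from the PR).
def k(a, b, ell=1.0):
    # Squared-exponential kernel on 1-D inputs.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

x = rng.uniform(-3.0, 3.0, size=6)        # training inputs
y = np.sin(x) + 0.1 * rng.normal(size=6)  # noisy observations
xs = np.linspace(-3.0, 3.0, 5)            # test inputs
sigma2 = 0.1**2                           # observation-noise variance

# Exact GP posterior (zero prior mean).
Kff, Ksf, Kss = k(x, x), k(xs, x), k(xs, xs)
A = np.linalg.inv(Kff + sigma2 * np.eye(len(x)))
mean_exact = Ksf @ A @ y
cov_exact = Kss - Ksf @ A @ Ksf.T

# Naive DTC predictive (Quinonero-Candela & Rasmussen 2005, Eq. 20b)
# with inducing inputs u set equal to the training inputs x.
u = x
Kuu, Kuf, Ksu = k(u, u), k(u, x), k(xs, u)
Sigma = np.linalg.inv(Kuu + Kuf @ Kuf.T / sigma2)
mean_dtc = Ksu @ Sigma @ Kuf @ y / sigma2
Qss = Ksu @ np.linalg.solve(Kuu, Ksu.T)
cov_dtc = Kss - Qss + Ksu @ Sigma @ Ksu.T

# With u = x the DTC posterior should reduce to the exact one.
print(np.allclose(mean_dtc, mean_exact, atol=1e-6))
print(np.allclose(cov_dtc, cov_exact, atol=1e-6))
```

The same limiting-case check could be added alongside the naive tests in the PR's Julia code.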
I would advise checking that what you've implemented for the covariance in your tests lines up with equation 6 in Titsias' 2009 paper -- a quick glance suggests that they're not quite the same, but I haven't checked it in detail.
@willtebbutt Also, are derivations for the current VFE/DTC predictive-distribution internals available somewhere? I still have the write-up you gave me a couple of years back while I was implementing these, but that document doesn't seem to have any derivations.
Hmmm I actually don't think that we do have that lying around. Could you open an issue about it so that we don't forget it?
Oh okay. Please let me know if you come across them. I was hoping to use those as a reference to implement other sparse techniques like FITC, etc.
In the new tests, could you separate out the terms and name them, e.g. `dtc_posterior_mean = ... # see (ref) (eq. X)`? That'd be really helpful to make them easier to follow :)
This appears to have gone stale. @sharanry please feel free to re-open if you wish to finish off.