Improve sampling from GP predictive posteriors.
In GaussianLikelihood#marginal, the covariance matrix is now a PsdSumLinearOperator
rather than an AddedDiagLinearOperator. This change improves the samples from GP predictive posteriors.
Rather than applying a low-rank approximation to K + \sigma^2 I, the PsdSumLinearOperator
now only applies a low-rank approximation to K for sampling, and then adds on i.i.d. N(0, \sigma^2 I)
noise.
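
To make the sampling scheme above concrete, here is a minimal NumPy sketch. It is not the PsdSumLinearOperator implementation: the function names `low_rank_root` and `sample_low_rank_plus_noise` are hypothetical, and the truncated eigendecomposition is only one assumed choice of low-rank approximation to K.

```python
import numpy as np


def low_rank_root(K, rank):
    """Return an (n, rank) matrix R with R @ R.T approximating the PSD matrix K.

    Assumption: a truncated eigendecomposition stands in for whatever
    low-rank root the library actually uses.
    """
    evals, evecs = np.linalg.eigh(K)  # eigenvalues in ascending order
    top_evals = np.clip(evals[-rank:], 0.0, None)  # guard against tiny negatives
    top_evecs = evecs[:, -rank:]
    return top_evecs * np.sqrt(top_evals)  # scale each eigenvector column


def sample_low_rank_plus_noise(K, noise_var, rank, num_samples, rng):
    """Draw samples with covariance approximately K + noise_var * I.

    Only K is approximated; the i.i.d. noise term is sampled exactly.
    """
    n = K.shape[0]
    root = low_rank_root(K, rank)
    # Low-rank sample of the latent function: covariance root @ root.T
    f = root @ rng.standard_normal((rank, num_samples))
    # Exact i.i.d. N(0, noise_var) observation noise
    eps = np.sqrt(noise_var) * rng.standard_normal((n, num_samples))
    return (f + eps).T  # shape (num_samples, n)
```

Because the noise is added separately, its full-rank diagonal structure is preserved exactly, which a single low-rank approximation of K + \sigma^2 I cannot do.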
Technically, the noise_covar here can be arbitrary, right? I.e., in the general case this would be K + Sigma, where Sigma is p.d. (either non-uniform noise levels or even a full covariance matrix if the observation noise is correlated), and things should still work?
@Balandat yes, noise_covar can be arbitrary!
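
For the general K + Sigma case raised here, the same idea can be sketched in NumPy as follows. This is an illustration only, not the library's code: `sample_with_general_noise` is a hypothetical name, `root` is assumed to be any matrix with `root @ root.T` approximating K, and a dense Cholesky factor of Sigma is an assumed way to sample the noise.

```python
import numpy as np


def sample_with_general_noise(root, Sigma, num_samples, rng):
    """Draw samples with covariance root @ root.T + Sigma.

    root:  (n, r) low-rank root of K (assumed to be given).
    Sigma: (n, n) positive-definite noise covariance; it may be diagonal
           with non-uniform entries or a full matrix for correlated noise.
    """
    n, r = root.shape
    L = np.linalg.cholesky(Sigma)  # Sigma = L @ L.T, requires Sigma p.d.
    f = root @ rng.standard_normal((r, num_samples))  # latent function part
    eps = L @ rng.standard_normal((n, num_samples))  # noise with covariance Sigma
    return (f + eps).T  # shape (num_samples, n)
```

The two terms are sampled independently, so their covariances add, and nothing in the sketch relies on Sigma being a multiple of the identity.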
Unfortunately, this PR is going to be slightly more challenging than I thought... (due to the special behavior we need for the RFF kernel, etc.). It'll become easier once we merge #2342, so maybe it's time to revive that thread.