
Variational Gaussian process loss - possible math error

Open willtownes opened this issue 3 years ago • 1 comment

In the documentation for the variational Gaussian process applied to minibatches (https://github.com/tensorflow/probability/blob/v0.12.1/tensorflow_probability/python/distributions/variational_gaussian_process.py#L572), the KL term is rescaled by `batch_size / num_training_points_`. I assume the reconstruction error term (the expected log-likelihood) is not scaled and is a sum over the data points in the minibatch. My understanding is that an unbiased estimator of the full-data variational loss should instead be `-(num_training_points_ / batch_size) * reconstruction_error + KL_term` (or, on a per-observation basis, dividing through by the total number of observations: `-(1 / batch_size) * reconstruction_error + (1 / num_training_points_) * KL_term`). Otherwise, if the batch size is not constant across minibatches, the estimator will be biased. If this is right, perhaps the weight should be placed on the reconstruction error term rather than on the KL term in `variational_loss`. Please let me know if I am missing something.
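
To make the comparison concrete, here is a small self-contained sketch of what I mean. The numbers are toy placeholders (not actual TFP outputs), and I am assuming the documented loss has the form `-reconstruction_error + (batch_size / num_training_points_) * KL_term` as described above:

```python
import numpy as np

# Toy stand-ins (not TFP outputs): recon_batch is the sum over the minibatch of
# E_q[log p(y_i | f_i)]; kl is the KL term, computed once per step and
# independent of the batch.
recon_batch = -37.2
kl = 5.8
num_training_points_ = 1000   # N: total training observations
batch_size = 64               # B: may differ from minibatch to minibatch

# Form described in the docstring: reconstruction summed over the batch,
# KL term down-weighted by B / N.  Its scale depends on B.
loss_documented = -recon_batch + (batch_size / num_training_points_) * kl

# Proposed full-data-scale estimator: weight the reconstruction term by N / B
# and leave the KL term alone.  This equals (N / B) times the documented loss.
loss_full_scale = -(num_training_points_ / batch_size) * recon_batch + kl

# Equivalent per-observation form: divide everything through by N.
loss_per_obs = -recon_batch / batch_size + kl / num_training_points_

assert np.isclose(loss_full_scale,
                  (num_training_points_ / batch_size) * loss_documented)
assert np.isclose(loss_per_obs, loss_full_scale / num_training_points_)
```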

willtownes · Apr 13 '21 20:04

I came across this post while trying to figure out minibatch reweighting for the general-purpose tfp.vi routines. I think you are correct: if batch sizes are unequal and you divide by the actual batch size, you get an incorrect result. The contribution per observation that you want in the ELBO objective should be

$$ \left( \frac{1}{N} D_{\mathrm{KL}}\bigl(q(\theta \mid \xi) \,\|\, P(\theta)\bigr) - \mathbb{E}_q \log P(D_n \mid \theta) \right) $$

so your batch loss should be a partial sum of these terms. If you divide by a fixed batch size, regardless of the actual batch size, then you are just rescaling the objective, and that should be fine.
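
A minimal sketch of what I mean (the KL value and the per-point expected log-likelihoods below are made-up placeholders, not output from tfp.vi):

```python
import numpy as np

rng = np.random.default_rng(0)

N = 1000     # total number of training observations
kl = 5.8     # placeholder value for D_KL(q(theta | xi) || P(theta))
# placeholder per-observation expected log-likelihoods E_q[log P(D_n | theta)]
loglik = rng.normal(-1.0, 0.3, size=N)

# Per-observation contribution to the loss: (1/N) * KL - E_q[log P(D_n | theta)]
per_obs = kl / N - loglik

# A minibatch loss is a partial sum of these terms over the batch indices,
# whatever the actual batch size turns out to be.
batch_idx = rng.choice(N, size=48, replace=False)
batch_loss = per_obs[batch_idx].sum()

# Dividing by a *fixed* nominal batch size only rescales the objective and is
# harmless; dividing by the *actual* (varying) batch size re-weights steps
# relative to one another.
scaled_loss = batch_loss / 64.0
```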

jcalifornia · May 26 '22 17:05