gpytorch
[Docs] Document that the Variational ELBO is normalized by num_data
📚 Documentation/Examples
Hi, I am new to GPyTorch, having just switched to torch for my projects, and I am enjoying the package. I am working on a project that includes a variational treatment of kernel hyperparameters, in a similar spirit to "Improving the Gaussian Process Sparse Spectrum Approximation by Representing Uncertainty in Frequency Inputs".
**Is there documentation missing?**
I think that the `VariationalELBO` mll computes the ELBO divided by the number of data points (`num_data`), and this could be mentioned in the documentation. This can introduce scaling issues when using the `AddedLossTerm` class with other losses that are not normalized in the same way by default.
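
To make the scaling concern concrete, here is a minimal plain-Python sketch of the arithmetic. It makes no GPyTorch calls; it simply assumes the ELBO is divided by `num_data` as described above. The function names and all numbers are hypothetical and chosen only for illustration.

```python
# Plain-Python sketch of the scaling concern; no GPyTorch calls are made.
# All names and numbers below are hypothetical, for illustration only.

def per_datum_elbo(expected_log_lik_sum, kl, num_data):
    """ELBO divided by the number of data points (the assumed normalization)."""
    return (expected_log_lik_sum - kl) / num_data


def objective(expected_log_lik_sum, kl, extra_loss, num_data, rescale_extra):
    """Per-datum ELBO minus an extra loss term, optionally rescaled to match."""
    if rescale_extra:
        extra_loss = extra_loss / num_data
    return per_datum_elbo(expected_log_lik_sum, kl, num_data) - extra_loss


num_data = 1000
expected_log_lik_sum = -1500.0  # made-up sum of expected log likelihoods
kl = 20.0                       # made-up KL term for the inducing variables
extra_loss = 20.0               # made-up extra term on the same scale as the KL

# Extra term added as-is: it dominates the per-datum ELBO.
mismatched = objective(expected_log_lik_sum, kl, extra_loss, num_data, rescale_extra=False)

# Extra term divided by num_data: it keeps the same weight relative to the KL.
matched = objective(expected_log_lik_sum, kl, extra_loss, num_data, rescale_extra=True)

print(f"mismatched objective: {mismatched:.2f}")
print(f"matched objective:    {matched:.2f}")
```

In this sketch the unscaled extra term outweighs the KL term by a factor of `num_data`, whereas dividing it by `num_data` recovers the unnormalized objective up to a constant factor. If the normalization is as I describe, a custom added loss would need the same division to keep its intended weight.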