bayesian-torch
Inconsistent use of mean & sum when calculating KL divergence?
A mean is taken inside BaseVariationalLayer_.kl_div(), but a sum is used later inside get_kl_loss() and when combining the KL losses of a layer's weights and bias (e.g. inside Conv2dReparameterization.kl_loss()).
I'm wondering whether there is a mathematical justification for this. Why take the mean of the individual weight KL divergences only to later sum across layers?
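To make the scale difference concrete, here is a minimal standalone sketch (not the library's actual code; `gaussian_kl` is a hypothetical helper using the standard closed-form KL between diagonal Gaussians). It shows that reducing the per-weight KL with `mean()` versus `sum()` differs by a factor equal to the number of weights, which is why mixing the two reductions changes how the KL term is weighted against the likelihood and across layers of different sizes:

```python
import torch

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    # Elementwise KL(q || p) between two diagonal Gaussians
    # (closed form for a Gaussian posterior vs. Gaussian prior).
    return (torch.log(sigma_p) - torch.log(sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2)
            - 0.5)

# Toy "layer" with 1000 weights: posterior N(0.1, 0.5^2), prior N(0, 1).
n = 1000
mu_q = torch.full((n,), 0.1)
sigma_q = torch.full((n,), 0.5)
mu_p = torch.zeros(n)
sigma_p = torch.ones(n)

kl_elementwise = gaussian_kl(mu_q, sigma_q, mu_p, sigma_p)

# mean() divides the layer's total KL by the number of weights,
# so sum() is exactly n times larger here.
print(kl_elementwise.mean())  # per-weight average KL
print(kl_elementwise.sum())   # total KL for the layer (= mean * n)
```

If I understand the ELBO correctly, the KL term should be the sum over all weights in all layers, so averaging within a layer but summing across layers effectively down-weights large layers relative to small ones. Is that intended?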