profPlum

Results: 13 issues by profPlum

Python version: 3.7.6; Pip version: 20.0.2; System: Linux; `uname -v` output: `#53~18.04.1-Ubuntu SMP Thu Jul 15 11:32:10 UTC 2021` (this is either a Debian or Ubuntu system, probably Ubuntu)...

A mean is taken inside [BaseVariationalLayer_.kl_div()](https://github.com/IntelLabs/bayesian-torch/blob/main/bayesian_torch/layers/base_variational_layer.py), but a sum is used later inside [get_kl_loss()](https://github.com/IntelLabs/bayesian-torch/blob/main/bayesian_torch/models/dnn_to_bnn.py) and when reducing the KL loss of a layer's bias and weights (e.g. inside [Conv2dReparameterization.kl_loss()](https://github.com/IntelLabs/bayesian-torch/blob/main/bayesian_torch/layers/variational_layers/conv_variational.py))...
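To make the mean-versus-sum concern concrete, here is a minimal pure-Python sketch. It does not call bayesian-torch; the function names, the closed-form Gaussian KL, the standard-normal prior, and the layer sizes (1000 weights, 10 biases) are all assumptions chosen for illustration. It compares "mean within each tensor, then sum across tensors" against "sum everywhere".

```python
import math

def gaussian_kl_elementwise(mu_q, sigma_q, mu_p=0.0, sigma_p=1.0):
    """Closed-form KL(q || p) between two univariate Gaussians (illustrative helper)."""
    return (math.log(sigma_p) - math.log(sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2)
            - 0.5)

def kl_mean(mus, sigmas):
    """Reduce a tensor's per-element KL with a mean (the pattern the issue describes in kl_div)."""
    vals = [gaussian_kl_elementwise(m, s) for m, s in zip(mus, sigmas)]
    return sum(vals) / len(vals)

def kl_sum(mus, sigmas):
    """Reduce a tensor's per-element KL with a sum (the full KL of a factorized posterior)."""
    return sum(gaussian_kl_elementwise(m, s) for m, s in zip(mus, sigmas))

# Hypothetical layer: every parameter has the same posterior N(0.5, 0.1^2).
n_weights, n_biases = 1000, 10
weight_mu, weight_sigma = [0.5] * n_weights, [0.1] * n_weights
bias_mu, bias_sigma = [0.5] * n_biases, [0.1] * n_biases

per_element = gaussian_kl_elementwise(0.5, 0.1)

# Mean inside each tensor, then sum over the two tensors.
mean_then_sum = kl_mean(weight_mu, weight_sigma) + kl_mean(bias_mu, bias_sigma)

# Sum at every level.
sum_everywhere = kl_sum(weight_mu, weight_sigma) + kl_sum(bias_mu, bias_sigma)

print(f"per-element KL : {per_element:.6f}")
print(f"mean then sum  : {mean_then_sum:.6f}")   # 2 x per-element
print(f"sum everywhere : {sum_everywhere:.6f}")  # 1010 x per-element
```

Under these assumptions, the mixed reduction gives each bias element 100 times the influence of each weight element (1/10 versus 1/1000 of the per-element KL), whereas the pure sum treats all 1010 parameters equally.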

Bug description: I'm using FSDP and model checkpointing (default settings for both). My model has 254 million parameters. I'm not sure why, but when I run `Trainer.fit()` it will...

bug
needs triage
ver: 2.3.x