Results: 3 issues of andrewdipper
Hi, I think the use of `running_mean` and `running_var` during training time in BatchNorm causes training instability and increased learning-rate sensitivity. With a low momentum (say 0.5) the layer works fine...
This is in reference to issue [659](https://github.com/patrick-kidger/equinox/issues/659). I modified BatchNorm to support two approaches, `"batch"` and `"ema"`. `"batch"` uses only the batch statistics during training time. If approach is not...
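The distinction between the two approaches can be sketched as below. This is a minimal stand-in, not equinox's actual `BatchNorm` implementation; the function name, signature, and the 0.99 momentum default are assumptions for illustration:

```python
import jax.numpy as jnp


def batchnorm_forward(x, running_mean, running_var, approach="batch",
                      momentum=0.99, eps=1e-5):
    """Sketch of the two training-time normalization approaches.

    approach="batch": normalize with the current batch statistics; the
    running (EMA) statistics are updated but only used at inference time.
    approach="ema":   normalize with the running statistics even during
    training, which is what can destabilize training at high momentum.
    """
    batch_mean = x.mean(axis=0)
    batch_var = x.var(axis=0)
    # exponential moving average of the statistics
    new_mean = momentum * running_mean + (1 - momentum) * batch_mean
    new_var = momentum * running_var + (1 - momentum) * batch_var
    if approach == "batch":
        out = (x - batch_mean) / jnp.sqrt(batch_var + eps)
    else:  # "ema"
        out = (x - new_mean) / jnp.sqrt(new_var + eps)
    return out, new_mean, new_var
```

With `"batch"`, the normalized output always has (approximately) zero mean and unit variance regardless of how stale the running statistics are, which is why it is less sensitive to the momentum and learning-rate settings.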
- Change blackjax sampling to only retain the relevant sampling info - reduces sampling memory requirements
- Change `_postprocess_samples` to reuse the input arrays - reduces postprocessing memory requirements

## Description

By...
Labels: maintenance, jax
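The "reuse the input arrays" idea from the memory-reduction PR above can be sketched with JAX buffer donation. This is not the actual `_postprocess_samples` change; the function body is a hypothetical stand-in, and only `donate_argnums` is real JAX API:

```python
from functools import partial

import jax
import jax.numpy as jnp


# donate_argnums=0 tells XLA it may overwrite the first argument's buffer
# with the output, so postprocessing need not hold the raw draws and the
# transformed draws in memory at the same time.
@partial(jax.jit, donate_argnums=0)
def postprocess(samples):
    # hypothetical postprocessing transform (stand-in for the real one)
    return jnp.log1p(samples)
```

On CPU the donation may be ignored (JAX emits a warning and copies instead), but on GPU/TPU it avoids allocating a second array of the same size, which is where the memory savings come from. The donated array must not be used again after the call.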