PyTorch collect model statistics
We should collect some statistics. This should probably be optional and configurable, and perhaps run only every N steps if collecting on every step is too costly.
Statistics of:
- weights
- activations
- gradients of weights
- gradients of activations
- gradients of inputs
- loss (per step; or final on train, dev, devtrain)
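As a rough sketch of how the tensors listed above could be gathered in plain PyTorch: forward hooks capture activations, tensor hooks capture gradients of activations, and weights, weight gradients and the input gradient are read after `backward()`. The function name `collect_tensors`, the key naming scheme, and the MSE loss are assumptions made for illustration only, not existing RETURNN API.

```python
import torch
from torch import nn


def collect_tensors(model: nn.Module, inputs: torch.Tensor, targets: torch.Tensor) -> dict:
    """Run one forward/backward pass and return the raw tensors to summarize.

    Hypothetical helper: name, keys and the MSE loss are illustrative assumptions.
    """
    collected = {}
    handles = []

    def make_forward_hook(name):
        def hook(module, args, output):
            if not isinstance(output, torch.Tensor):
                return  # skip modules that return tuples etc. in this sketch
            collected[f"act/{name}"] = output.detach()
            if output.requires_grad:
                # Tensor hook: called with the gradient w.r.t. this activation.
                def grad_hook(grad):
                    collected[f"act_grad/{name}"] = grad.detach()

                output.register_hook(grad_hook)

        return hook

    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            handles.append(module.register_forward_hook(make_forward_hook(name)))

    # Track gradients w.r.t. the inputs as well.
    inputs = inputs.detach().clone().requires_grad_(True)
    model.zero_grad()
    loss = nn.functional.mse_loss(model(inputs), targets)  # placeholder loss
    loss.backward()
    for handle in handles:
        handle.remove()  # do not leave hooks installed after collection

    collected["loss"] = loss.detach()
    collected["input_grad"] = inputs.grad.detach()
    for name, param in model.named_parameters():
        collected[f"weight/{name}"] = param.detach()
        if param.grad is not None:
            collected[f"weight_grad/{name}"] = param.grad.detach()
    return collected
```

In a training loop, a call like this could be gated by something like `if step % N == 0` to keep the overhead low. Note that modules using in-place operations may need extra care with the activation hooks.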
Kinds of statistics:
- mean, stddev (var)
- min, max, median, other percentiles
- all of the above, also computed on the absolute values
- all of the above, also smoothed with momentum (e.g. exponential moving average, with different decay values)
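The kinds of statistics listed above could be computed per tensor roughly as follows. This is only a sketch: `tensor_stats` and `EmaStats` are hypothetical names, and the chosen percentiles and decay values are arbitrary examples.

```python
import torch


def tensor_stats(x: torch.Tensor, percentiles=(0.05, 0.95)) -> dict:
    """Scalar summary statistics of a tensor, for the raw and the absolute values."""
    x = x.detach().float().flatten()
    stats = {}
    for prefix, v in (("", x), ("abs_", x.abs())):
        stats[prefix + "mean"] = v.mean().item()
        stats[prefix + "std"] = v.std(unbiased=False).item()
        stats[prefix + "min"] = v.min().item()
        stats[prefix + "max"] = v.max().item()
        stats[prefix + "median"] = v.median().item()
        for q in percentiles:
            # torch.quantile limits the input size; very large tensors
            # may need subsampling first.
            stats[f"{prefix}p{round(q * 100)}"] = torch.quantile(v, q).item()
    return stats


class EmaStats:
    """Exponential moving averages of scalar statistics, one per decay value."""

    def __init__(self, decays=(0.9, 0.99)):
        self.decays = decays
        self.values = {}  # (stat name, decay) -> smoothed value

    def update(self, stats: dict) -> dict:
        for decay in self.decays:
            for key, value in stats.items():
                # The first observed value initializes the average.
                prev = self.values.get((key, decay), value)
                self.values[(key, decay)] = decay * prev + (1.0 - decay) * value
        return self.values
```

For example, `EmaStats().update(tensor_stats(param.grad))` could be called once per collected step for each parameter. Initializing the average with the first value avoids the bias toward zero that a zero-initialized moving average would have.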
Also see #1446.