[RLlib] - Fix numerical overflow in gradient clipping for (many) large gradients
Why are these changes needed?
Many large gradients can cause numerical overflow when their l2-norm is computed in `torch_utils.clip_gradients` (using the "global_norm"). This is counterproductive: a user wants to clip exactly such gradients, but instead runs into numerical overflow caused by the clipping itself.

This PR proposes small changes that turn `inf` and `neginf` values returned from norms into `10e8` and `-10e8`, respectively. This does not harm the gradients themselves (for example, if they were already `inf`/`neginf`), because we clip gradients by multiplication and not by overriding values.
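To make the mechanism concrete, here is a minimal plain-Python sketch of global-norm clipping with the non-finite norm replaced by a large finite value. It is not the code in `torch_utils.clip_gradients`: the helper name `clip_by_global_norm`, its parameters, the `1e-6` epsilon, and the use of nested lists instead of tensors are all assumptions made for illustration.

```python
import math


# Hypothetical helper for illustration only; not RLlib's implementation.
def clip_by_global_norm(grads, grad_clip, inf_replacement=10e8):
    """Scale all gradients so their global l2-norm is at most `grad_clip`."""
    # Sum of squares over every gradient entry; this can overflow to inf.
    squared_sum = sum(g * g for grad in grads for g in grad)
    total_norm = math.sqrt(squared_sum)
    # Replace a non-finite norm with a finite value so that the
    # clipping coefficient below stays well defined.
    if math.isinf(total_norm):
        total_norm = inf_replacement
    elif math.isnan(total_norm):
        total_norm = 0.0
    # Coefficient in (0, 1]; 1.0 means no clipping is needed.
    clip_coef = grad_clip / max(grad_clip, total_norm + 1e-6)
    # Clip by multiplication: values are scaled, never overwritten.
    clipped = [[g * clip_coef for g in grad] for grad in grads]
    return clipped, total_norm


# Squaring 1e200 overflows to inf in float64, so the raw norm is inf.
large_grads = [[1e200, -1e200], [3.0, 4.0]]
clipped_large, norm_large = clip_by_global_norm(large_grads, grad_clip=40.0)

# Ordinary case: the norm is 5.0, so the gradients are scaled down.
clipped_small, norm_small = clip_by_global_norm([[3.0, 4.0]], grad_clip=1.0)
```

In this sketch, an unreplaced `inf` norm would give a coefficient of `40.0 / inf == 0.0`, which zeroes every finite gradient and turns an `inf` gradient into `nan`. With the replacement, the coefficient is small but nonzero, so an `inf` gradient stays `inf` after multiplication.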
Related issue number
Checks
- [x] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
    - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
    - [x] Unit tests
    - [x] Release tests
    - [ ] This PR is not tested :(