[RLlib] - Fix numerical overflow in gradient clipping for (many) large gradients

Open simonsays1980 opened this issue 9 months ago • 0 comments

Why are these changes needed?

Many large gradients can cause numerical overflow when their L2 norm is computed in torch_utils.clip_gradients (using "global_norm"). This is counterproductive: the user wants to clip exactly such gradients, but the clipping step itself runs into numerical overflow.

This PR proposes small changes that convert inf and -inf values returned from the norm computation to 10e8 and -10e8, respectively. This does not harm the gradients themselves (for example, if they were already inf or -inf), because we clip gradients by multiplication rather than by overriding values.
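To illustrate the idea, here is a minimal pure-Python sketch. It is not the code in torch_utils.clip_gradients; the helper names (`global_l2_norm`, `clamp_inf`, `clip_by_global_norm`), the epsilon, and the example values are assumptions made for illustration. It uses 64-bit Python floats, so the overflow only appears at much larger magnitudes than it would with 32-bit tensors. In torch, a call such as `torch.nan_to_num` with `posinf`/`neginf` arguments could play the role of `clamp_inf`, but that mapping is also an assumption here.

```python
import math

# Cap taken from the PR description: 10e8 (equal to 1e9).
NORM_CAP = 10e8


def global_l2_norm(grads):
    """L2 norm over all gradient values; overflows to inf for huge values."""
    total = 0.0
    for g in grads:
        for v in g:
            total += v * v  # float multiplication overflows to inf silently
    return math.sqrt(total)


def clamp_inf(x, cap=NORM_CAP):
    """Map +inf/-inf to +cap/-cap; leave all other values unchanged."""
    if math.isinf(x):
        return math.copysign(cap, x)
    return x


def clip_by_global_norm(grads, clip_value):
    """Scale all gradients by one shared coefficient (hypothetical helper)."""
    norm = clamp_inf(global_l2_norm(grads))
    # Clipping by multiplication: values are rescaled, never overwritten.
    coef = min(1.0, clip_value / (norm + 1e-6))
    return [[v * coef for v in g] for g in grads], norm


# Gradients large enough that squaring them overflows a 64-bit float.
grads = [[1e200, -2e200], [3e200]]
raw_norm = global_l2_norm(grads)  # overflows to inf
clipped, used_norm = clip_by_global_norm(grads, clip_value=40.0)

# A gradient that is already inf stays inf instead of turning into nan.
inf_clipped, _ = clip_by_global_norm([[math.inf, 1.0]], clip_value=40.0)
```

Without the clamp, the coefficient would be `clip_value / inf = 0.0`, which zeroes every finite gradient and turns an already infinite gradient into nan (`inf * 0.0`). With the clamp, the coefficient stays small but nonzero. Note that in this sketch the rescaled gradients are not guaranteed to fall below `clip_value`, since the cap underestimates the true norm.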

Related issue number

Checks

  • [x] I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • [x] I've run scripts/format.sh to lint the changes in this PR.
  • [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
    • [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
  • [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [x] Unit tests
    • [x] Release tests
    • [ ] This PR is not tested :(

simonsays1980, Apr 30 '24 13:04