[doc] Add documentation for division by zero behavior in autograd
Fixes #128796
This PR adds documentation about the behavior of division by zero operations in PyTorch's autograd system. The documentation explains:
- How division by zero produces `inf` values, following IEEE-754 floating-point arithmetic
- How autograd handles these cases, and why masking after division can lead to `nan` gradients (see the sketch after this list)
- Concrete examples showing the issue
- Recommends two solutions:
- Masking before division
- Using MaskedTensor (experimental API)
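For reference, a minimal sketch of the pitfall and the mask-before-division fix (the tensor values are illustrative, not copied from the PR's rendered docs):

```python
import torch

# Pitfall: masking AFTER the division. The division by zero still
# happens inside the graph, and backward computes 0 * inf = nan.
x = torch.tensor([1.0, 1.0], requires_grad=True)
denom = torch.tensor([0.0, 1.0])
y = x / denom                                        # [inf, 1.]
out = torch.where(denom == 0, torch.zeros_like(y), y)
out.sum().backward()
print(x.grad)                                        # tensor([nan, 1.])

# Fix: mask BEFORE dividing, so inf never enters the graph.
x = torch.tensor([1.0, 1.0], requires_grad=True)
safe = torch.where(denom == 0, torch.ones_like(denom), denom)
y = torch.where(denom == 0, torch.zeros_like(x), x / safe)
y.sum().backward()
print(x.grad)                                        # tensor([0., 1.])
```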
The documentation is added to the autograd notes section, making it easily discoverable for users who encounter this common issue.
This addresses the original issue #128796, which requested better documentation of this behavior to help users avoid common pitfalls when dealing with division by zero in their models.
Additional changes:
- Fixed formatting consistency by replacing curly apostrophes with straight apostrophes in the existing documentation
cc @svekars @sekyondaMeta @AlannaBurke @ezyang @albanD @gqchen @nikitaved @soulitzer @Varal7 @xmfan
:link: Helpful Links
:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/155987
- :page_facing_up: Preview Python docs built from this PR
- :page_facing_up: Preview C++ docs built from this PR
- :question: Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours
Note: Links to docs will display an error until the docs builds have been completed.
:white_check_mark: No Failures
As of commit 6dd35a9f991e04efbeca144f01c2e4f80e969561 with merge base 670dab6c630552b32189911f22896ec453e55ab7:
:green_heart: Looks good so far! There are no failures yet. :green_heart:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Didn't find following labels among repository labels: release notes: documentation
@pytorchbot label "topic: not user facing"
All required labels are set and checks are passing. CI is now waiting for maintainer approval.
Let me know if anything else is needed!
cc @svekars @sekyondaMeta @AlannaBurke
A bit related:
- https://github.com/pytorch/pytorch/issues/50122
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours).
Learn more about merging in the wiki.
Questions? Feedback? Please reach out to the PyTorch DevX Team.
Advanced Debugging: check the merge workflow status here.
Maybe the same mitigations / workarounds here would help for the notorious torch.where use case:
- https://github.com/pytorch/pytorch/issues/156212
I think people have also used custom autograd functions or hooks to patch up the gradient or output...
I think the docs should at least provide a copy-pasteable workaround.
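For illustration, a minimal sketch of that kind of custom autograd function (a hypothetical helper, not an API from this PR): it masks zero denominators in both forward and backward, so `inf`/`nan` never enters the graph.

```python
import torch

class MaskedDiv(torch.autograd.Function):
    """Hypothetical sketch: a / b that returns 0 (with zero gradients) where b == 0."""

    @staticmethod
    def forward(ctx, a, b):
        ctx.save_for_backward(a, b)
        safe_b = b.masked_fill(b == 0, 1.0)          # avoid the 0-division entirely
        return torch.where(b == 0, torch.zeros_like(a), a / safe_b)

    @staticmethod
    def backward(ctx, grad_out):
        a, b = ctx.saved_tensors
        safe_b = b.masked_fill(b == 0, 1.0)
        zero = torch.zeros_like(grad_out)
        grad_a = torch.where(b == 0, zero, grad_out / safe_b)           # d(a/b)/da = 1/b
        grad_b = torch.where(b == 0, zero, -grad_out * a / safe_b**2)   # d(a/b)/db = -a/b^2
        return grad_a, grad_b

a = torch.tensor([1.0, 2.0], requires_grad=True)
b = torch.tensor([0.0, 4.0], requires_grad=True)
MaskedDiv.apply(a, b).sum().backward()
print(a.grad, b.grad)  # tensor([0.0000, 0.2500]) tensor([ 0.0000, -0.1250])
```

The hook variant people mention would look something like `x.register_hook(lambda g: torch.nan_to_num(g, nan=0.0))`, which patches `nan` gradients after the fact rather than keeping them out of the graph.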