Improvement: add a "division by zero" check in chunked loss handling in kd_losses.py
Suggested improvement:
It would be good to add a "division by zero" check in chunked loss handling in kd_losses.py.
Context: This is based on the discussion in PR #2094.
Potential issue: ForwardKLWithChunkedOutputLoss does not check for division by zero when it normalizes the loss, while the non-chunked version (ForwardKLLoss) does.
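To illustrate the kind of guard being requested, here is a minimal, framework-free sketch in plain Python. It is not the code in kd_losses.py: the function name `normalized_chunked_loss`, the `IGNORE_INDEX` value of -100, and the list-based inputs are all assumptions made for illustration. It only shows the idea of accumulating a loss over chunks and returning zero when no tokens are left to normalize by.

```python
IGNORE_INDEX = -100  # assumed "ignore this token" label; hypothetical for this sketch


def normalized_chunked_loss(per_token_losses, labels, num_chunks=2,
                            ignore_index=IGNORE_INDEX):
    """Sum per-token losses chunk by chunk, then average over unmasked tokens.

    Returns 0.0 when every label is ignored, instead of dividing by zero.
    """
    if len(per_token_losses) != len(labels):
        raise ValueError("per_token_losses and labels must have the same length")

    # Ceiling division so that at most `num_chunks` chunks are produced.
    chunk_size = max(1, -(-len(labels) // num_chunks))

    total_loss = 0.0
    unmasked = 0
    for start in range(0, len(labels), chunk_size):
        chunk_losses = per_token_losses[start:start + chunk_size]
        chunk_labels = labels[start:start + chunk_size]
        for loss, label in zip(chunk_losses, chunk_labels):
            if label != ignore_index:
                total_loss += loss
                unmasked += 1

    # The guard under discussion: with no unmasked tokens the denominator is zero.
    if unmasked == 0:
        return 0.0
    return total_loss / unmasked
```

Without the `unmasked == 0` branch, a batch in which every label equals the ignore index would raise `ZeroDivisionError` here; the analogous tensor division would typically yield NaN rather than an exception.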
Great catch! We'd definitely welcome a small PR for this if you want to do it; otherwise, we can try to get to it soon.
I can make the change. @felipemello1, let me know if you would prefer to do it or already have a change in progress.
That would be awesome. Thanks @insop :)
Closing this now that #2239 has landed.