Unnecessary FP64 cast for `padding_idx` in `EmbeddingDenseBackward`
`torch_xla/csrc/tensor_ops.cpp#L232` converts an `int64_t` `padding_idx` to `double`.
This introduces FP64 ops even though `indices_rank1` is always `int32` or `int64`, whereas the upstream ATen implementation keeps everything in integer space.
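To make the pattern concrete, here is a minimal standalone sketch. The function name and the use of plain ATen calls are my own illustration, not the actual torch_xla lowering code:

```cpp
#include <ATen/ATen.h>

// Hypothetical illustration of the pattern described above: padding_idx is
// cast to double before the comparison against the integer indices, which is
// what reportedly pulls FP64 values into the lowered XLA graph.
at::Tensor padding_mask_with_fp64_cast(const at::Tensor& indices_rank1,
                                       int64_t padding_idx) {
  return indices_rank1.ne(static_cast<double>(padding_idx));
}
```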
Questions
- Was the double cast introduced for a specific historical or hardware reason?
- How is this handled on devices without native FP64 support (e.g. TPU)?
Proposed fix
- Keep `padding_idx` as `int64_t`, or
- Cast it to the same dtype as `indices_rank1` before comparison.
Either option would avoid unnecessary FP64 operations and align with ATen’s behavior.
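For comparison, a sketch of the two proposed integer-space alternatives, written against the public ATen tensor API rather than torch_xla's internal helpers; the function names are hypothetical:

```cpp
#include <ATen/ATen.h>

// Option 1: keep padding_idx as an int64_t scalar. The comparison produces a
// bool mask and no FP64 value ever enters the computation.
at::Tensor padding_mask_int(const at::Tensor& indices_rank1,
                            int64_t padding_idx) {
  return indices_rank1.ne(padding_idx);
}

// Option 2: materialize padding_idx as a 0-d tensor with the same dtype as
// indices_rank1 before comparing, which likewise stays in integer space.
at::Tensor padding_mask_same_dtype(const at::Tensor& indices_rank1,
                                   int64_t padding_idx) {
  at::Tensor pad = at::scalar_tensor(padding_idx, indices_rank1.options());
  return indices_rank1.ne(pad);
}
```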
Thanks for taking a look!
Thank you for filing this issue.
Unfortunately, I have no idea why we cast padding_idx.
@bhavya01 @tengyifei @asuhan Any thoughts on this?
I don't know the reason for casting `padding_idx` either. On the surface it looks unnecessary, but I am not sure whether it was done to avoid some XLA compiler error. @unterumarmung Do you want to create a PR to test removing the cast?
@bhavya01 I've opened PR #9406. Since the fix is trivial, I didn’t test it locally — as a first-time contributor, I don’t yet have a local development environment set up. I’m relying on CI to validate the change and hope that’s acceptable. If additional tests are needed, I’m happy to add them. Also, I believe CI needs to be explicitly enabled for first-time contributors — could you help with that?