Yukio Siraichi
After further investigation, I found out the issue is due to a combination of 2 factors:

- The model, as well as the example inputs, are converted to `float16`
-...
Apparently, after doing (1), I am getting another error:

```python
  File "/lib/python3.8/site-packages/detectron2/modeling/proposal_generator/proposal_utils.py", line 121, in find_top_rpn_proposals
    keep = batched_nms(boxes.tensor, scores_per_img, lvl, nms_thresh)
  File "/lib/python3.8/site-packages/detectron2/layers/nms.py", line 20, in batched_nms
    return box_ops.batched_nms(boxes.float(),...
```
I see. So, maybe a solution is to pass `--precision fp32` when instantiating the benchmark, while having `XLA_USE_FP16` set. What do you think?
This issue was temporarily fixed by #6389. #6404 details a better fix for this upcasting problem, and the actual problem description is in #6403.
Apparently, this issue was not due to conversion issues (https://github.com/pytorch/pytorch/issues/115792) as we once thought, but it's a real problem (more details [in this comment](https://github.com/pytorch/xla/issues/6336#issuecomment-1902677834)).
@miladm @JackCaoG Here's what I found when looking into this issue (`nms` falling back to the CPU kernel): even though there's an [implementation of `nms` inside PyTorch/XLA](https://github.com/pytorch/xla/blob/92bb381af277140a2d5fe8af4cd371a3c9c5c2d1/torch_xla/csrc/init_python_bindings.cpp#L649-L669), it appears that the...
@JackCaoG While the solution in [this comment](https://github.com/pytorch/xla/issues/6336#issuecomment-1989716090) works, I thought it would make more sense to implement a `CompositeExplicitAutograd` version directly in TorchVision. What do you think?
The difference is that it would be decomposable into operations that are, hopefully, already supported. That said, I'm thinking of the following plan:

- Make an XLA kernel for `nms` on PyTorch/XLA...
@JackCaoG Question: how important is it to keep the old behavior?

- Current `nms` signature:
  ```python
  nms(boxes, scores, score_threshold, iou_threshold, output_size)
  ```
- TorchVision `nms` signature:
  ```python
  nms(boxes, scores, iou_threshold)
  ```

...
So, can we kill it, in favor of the TorchVision variant?
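To make the comparison between the two signatures concrete, here is a minimal pure-Python sketch, not the PyTorch/XLA or TorchVision implementation. `nms` follows the three-argument TorchVision shape (it returns the indices of kept boxes in decreasing score order and drops boxes whose IoU with a kept box exceeds `iou_threshold`). `nms_legacy` is a hypothetical adapter name. It assumes that `score_threshold` filters out low-scoring boxes before suppression and that `output_size` caps the number of returned indices. Neither assumption is confirmed by this thread.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float) -> List[int]:
    """TorchVision-style signature: indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if its IoU with every already-kept box is within the threshold.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


def nms_legacy(boxes, scores, score_threshold, iou_threshold, output_size) -> List[int]:
    """Hypothetical adapter: the five-argument call expressed through `nms`.

    Assumptions (not confirmed by the thread): `score_threshold` drops boxes
    scoring at or below it before suppression, and `output_size` truncates
    the list of kept indices.
    """
    candidates = [i for i, s in enumerate(scores) if s > score_threshold]
    kept = nms([boxes[i] for i in candidates], [scores[i] for i in candidates], iou_threshold)
    # Map positions in the filtered list back to indices into the original input.
    return [candidates[j] for j in kept][:output_size]
```

Under those assumptions, the extra arguments of the old signature reduce to a filter before and a truncation after the three-argument call, which is one way to read the proposal to drop the old variant.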