[CUDA][Suggestion] ROI Pool half-precision build error due to ambiguous comparison
🐛 Describe the bug
When building the ROI Pool CUDA kernel on Windows with MSVC and CUDA, compilation fails when AT_DISPATCH_FLOATING_TYPES_AND_HALF instantiates the kernel with T = at::Half. The `>` comparison between two half values becomes ambiguous (presumably because at::Half is implicitly convertible to more than one type with a viable operator>), and MSVC reports a build error.
Minimal Repro Example
```cpp
// roi_pool_forward_kernel_impl excerpt
if (offset_input[input_index] > maxval) {
  maxval = offset_input[input_index];
  maxidx = input_index;
}
```
- With T = at::Half, this comparison fails to compile on MSVC.
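As a sanity check, the error can be sidestepped locally by forcing the comparison into float. This is only a hedged workaround sketch, not the proposed fix (it would demote double inputs if applied blindly), but it suggests the ambiguity comes from the half-vs-half comparison itself:

```cpp
// Local workaround sketch: compare through an explicit float promotion so the
// half-vs-half operator> is never selected. Not suitable as-is for T = double.
if (static_cast<float>(offset_input[input_index]) > static_cast<float>(maxval)) {
  maxval = offset_input[input_index];
  maxidx = input_index;
}
```

The accumulation-type approach proposed below generalizes this idea without affecting the float/double paths.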
Environment
- OS: Windows 10/11
- Compiler: MSVC 19.x (Visual Studio 2022)
- CUDA: 12.x
- PyTorch / torchvision: built from source (latest main)
Proposed Fix
Following PyTorch conventions, it may be preferable to compare in an accumulation type:
```cpp
using acc_t = at::acc_type<T, /*is_cuda=*/true>;
acc_t v = static_cast<acc_t>(offset_input[input_index]);
acc_t mv = static_cast<acc_t>(maxval);
if (v > mv) {
  maxval = offset_input[input_index];
  maxidx = input_index;
}
```
Also, initialize maxval in a type-consistent way:
```cpp
T maxval = is_empty ? T(0) : std::numeric_limits<T>::lowest();
```
This avoids the MSVC ambiguity and preserves precision across float/double/half.
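For concreteness, here is a rough sketch of how the two changes could fit together. It is written as a hypothetical helper (pool_one_cell is not an existing torchvision function), and variable names only mirror the existing roi_pool_forward_kernel_impl loop; this illustrates the suggested direction rather than a tested patch. Note that at::acc_type<at::Half, /*is_cuda=*/true> is float, so half inputs are compared in float while maxval stays in T:

```cpp
#include <ATen/AccumulateType.h>  // at::acc_type
#include <c10/util/Half.h>        // std::numeric_limits specialization for at::Half
#include <limits>

// Sketch of the suggested argmax loop, pulled out of the kernel for illustration.
template <typename T>
__device__ void pool_one_cell(
    const T* offset_input,
    int hstart, int hend, int wstart, int wend,
    int width, bool is_empty,
    T& maxval, int& maxidx) {
  // Accumulation type: float when T is at::Half, otherwise T itself.
  using acc_t = at::acc_type<T, /*is_cuda=*/true>;

  maxval = is_empty ? T(0) : std::numeric_limits<T>::lowest();
  maxidx = -1;

  for (int h = hstart; h < hend; ++h) {
    for (int w = wstart; w < wend; ++w) {
      int input_index = h * width + w;
      // Compare in acc_t so T = at::Half never hits the ambiguous operator>.
      if (static_cast<acc_t>(offset_input[input_index]) > static_cast<acc_t>(maxval)) {
        maxval = offset_input[input_index];
        maxidx = input_index;
      }
    }
  }
}
```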
### Suggestion
Would it make sense to update the ROI Pool kernel accordingly? I am happy to prepare a PR if maintainers agree with this direction.
Note
I'm a junior-level software engineer and this is my first time submitting a report here. If I missed any guidelines or phrased things imperfectly, please kindly let me know. Thank you for your understanding.
Versions
https://github.com/pytorch/vision/blob/main/torchvision/csrc/ops/cuda/roi_pool_kernel.cu
Hey @kimchioverfit, thanks for posting. Sorry for the late reply. Do you still have the issue? Can you use one of the already compiled wheels from PyPI (https://pypi.org/project/torchvision/#files) to unblock you?
Hi @AntoineSimoulin, thanks for the follow-up!
I tested the model with the official PyPI wheels:
- torch 2.8.0+cu129
- torchvision 0.23.0
With the wheel setup, everything works fine: the TorchScript model loads and runs without any issues.
In my C++ setup I also used the matching libtorch 2.8.0+cu129 build (the same version as the Python wheels).
So the issue does not reproduce with the official wheels; it only appears in the libtorch + manually-built torchvision csrc configuration on Windows.
Happy to share the exact snippet or patch if that helps.