Fix CUDA GatherND batch dimension validation regression

Open Copilot opened this issue 8 months ago • 0 comments

Fixes a regression where GatherND fails with CUDAExecutionProvider but works correctly with CPUExecutionProvider. The CUDA run fails with the error:

gather_nd.cc:30 CheckBatchDimensionsMatch Batch dimensions differ at index 0: 1 != 3, tensor indices: 0, 1

Root Cause

The CUDA implementation had an additional CheckBatchDimensionsMatch validation that enforced strict matching of batch dimensions between input and indices tensors. This validation was not present in the CPU implementation, creating inconsistent behavior between execution providers.
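To make the failure mode concrete, here is a minimal Python sketch of a strict leading-dimension check. It is not the ONNX Runtime code: `check_leading_dims_match`, `strict_check_fails`, and the `num_leading_dims` parameter are hypothetical names used only for illustration, and how many dimensions the real `CheckBatchDimensionsMatch` compares is not stated in this description.

```python
def check_leading_dims_match(shape_a, shape_b, num_leading_dims):
    """Raise ValueError if the first num_leading_dims dimensions differ.

    Hypothetical stand-in for a strict batch-dimension check; the message
    format only loosely mirrors the error quoted above.
    """
    for i in range(num_leading_dims):
        if shape_a[i] != shape_b[i]:
            raise ValueError(
                f"Batch dimensions differ at index {i}: "
                f"{shape_a[i]} != {shape_b[i]}"
            )


def strict_check_fails(shape_a, shape_b, num_leading_dims):
    """Return True if the strict check rejects the given pair of shapes."""
    try:
        check_leading_dims_match(shape_a, shape_b, num_leading_dims)
    except ValueError:
        return True
    return False


# Input with leading dimension 1, indices with leading dimension 3.
rejected = strict_check_fails([1, 4], [3, 2], num_leading_dims=1)
```

Under this sketch, any pair of shapes whose compared leading dimensions differ is rejected before the gather itself runs, which matches the shape of the reported error (1 != 3 at index 0).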

Solution

Removed the overly restrictive batch dimension validation from the CUDA implementation to align with CPU behavior. The CPU implementation works correctly without this validation, which indicates that the check is safe to remove.

Changes

onnxruntime/core/providers/cuda/tensor/gather_nd.cc: Removed CheckBatchDimensionsMatch call that was causing the regression
onnxruntime/test/providers/cpu/tensor/gather_nd_op_test.cc: Added regression test GatherND_flexible_input_shapes_regression to prevent this issue from recurring

Testing

The added test case validates that GatherND works correctly with flexible input shapes when using the default batch_dims=0, ensuring this regression doesn't happen again.
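For reference, the behavior being protected can be sketched in plain Python. With batch_dims=0, the ONNX GatherND definition places no requirement on the leading dimensions of the input and indices tensors; each innermost index tuple simply selects an element or slice of the input. The function `gather_nd` and the shapes below are illustrative assumptions, not the contents of the actual regression test.

```python
def gather_nd(data, indices):
    """Reference GatherND for batch_dims=0 on nested Python lists."""
    # An innermost list of integers is one index tuple into data.
    if indices and isinstance(indices[0], int):
        result = data
        for i in indices:
            result = result[i]
        return result
    # Otherwise recurse over the outer dimensions of indices.
    return [gather_nd(data, sub) for sub in indices]


data = [[10, 11, 12, 13]]            # shape (1, 4)
indices = [[0, 0], [0, 2], [0, 3]]   # shape (3, 2)
output = gather_nd(data, indices)    # shape (3,)
print(output)  # [10, 12, 13]
```

Here the input has leading dimension 1 and the indices have leading dimension 3, the same mismatch reported in the error message, yet the gather is well defined and yields one value per index tuple.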

Fixes #25053.


Jun 14 '25 22:06 Copilot