[CPU EP] GatherND crashes with division by zero when batch dimensions mismatch between input and indices
Issue description
Passing incompatible batch dimensions between the input (data) and indices tensors (3 vs 2 in this example, given batch_dims=1) should fail rather than crash.
{
"op_type": "GatherND",
"version": 12,
"batch_dims": 1,
"data": [[0,1,2],[10,11,12],[20,21,22]],
"indices": [[1],[2]],
"output": [1,7],
"T": "float32",
}
Expected: Status failure
Actual: Fatal division by zero.
Note that passing an input whose batch dimension is 2 works (since it then matches the indices), and a batch dimension of 1 works too (ORT appears to either broadcast or clamp the input).
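For reference, the ONNX GatherND spec requires the first batch_dims dimensions of data and indices to match exactly. A minimal standalone sketch of that constraint check (not ORT code; shapes hard-coded from the repro above):

#include <cstdint>
#include <iostream>
#include <vector>

// True when the leading batch_dims dimensions of data and indices agree,
// which is what the ONNX GatherND spec requires.
static bool BatchDimsMatch(const std::vector<int64_t>& data_shape,
                           const std::vector<int64_t>& indices_shape,
                           int64_t batch_dims) {
  if (batch_dims < 0 ||
      batch_dims > static_cast<int64_t>(data_shape.size()) ||
      batch_dims > static_cast<int64_t>(indices_shape.size())) {
    return false;
  }
  for (int64_t i = 0; i < batch_dims; ++i) {
    if (data_shape[i] != indices_shape[i]) return false;
  }
  return true;
}

int main() {
  // Repro shapes: data is [3,3], indices is [2,1], batch_dims is 1.
  std::cout << std::boolalpha << BatchDimsMatch({3, 3}, {2, 1}, 1) << "\n";  // prints false
}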
Stack:
> onnxruntime.dll!onnxruntime::GatherNDBase::PrepareForCompute::__l2::<lambda>(__int64 slice_idx) Line 85 C++
onnxruntime.dll!onnxruntime::GatherNDBase::PrepareForCompute::__l2::<lambda>(__int64 first, __int64 last) Line 111 C++
onnxruntime.dll!std::invoke<void <lambda>(__int64, __int64) &,__int64,__int64>(onnxruntime::GatherNDBase::PrepareForCompute::__l2::void <lambda>(__int64, __int64) & _Obj, __int64 && _Arg1, __int64 && <_Args2_0>) Line 1601 C++
onnxruntime.dll!std::_Invoker_ret<void>::_Call<void <lambda>(__int64, __int64) &,__int64,__int64>(onnxruntime::GatherNDBase::PrepareForCompute::__l2::void <lambda>(__int64, __int64) & _Func, __int64 && <_Vals_0>, __int64 && <_Vals_1>) Line 661 C++
onnxruntime.dll!std::_Func_impl_no_alloc<void <lambda>(__int64, __int64),void,__int64,__int64>::_Do_call(__int64 && <_Args_0>, __int64 && <_Args_1>) Line 821 C++
onnxruntime.dll!std::_Func_class<void,__int64,__int64>::operator()(__int64 <_Args_0>, __int64 <_Args_1>) Line 862 C++
onnxruntime.dll!onnxruntime::concurrency::ThreadPool::ParallelFor(__int64 n, const onnxruntime::TensorOpCost & c, const std::function<void __cdecl(__int64,__int64)> & f) Line 622 C++
onnxruntime.dll!onnxruntime::concurrency::ThreadPool::TryParallelFor(onnxruntime::concurrency::ThreadPool * tp, __int64 total, const onnxruntime::TensorOpCost & cost_per_unit, const std::function<void __cdecl(__int64,__int64)> & fn) Line 704 C++
onnxruntime.dll!onnxruntime::concurrency::ThreadPool::TryParallelFor(onnxruntime::concurrency::ThreadPool * tp, __int64 total, double cost_per_unit, const std::function<void __cdecl(__int64,__int64)> & fn) Line 252 C++
onnxruntime.dll!onnxruntime::GatherNDBase::PrepareForCompute<__int64>(const onnxruntime::TensorShape & input_shape, const onnxruntime::Tensor * indices_tensor, const __int64 bytes_per_value, onnxruntime::GatherNDBase::Prepare & p, onnxruntime::concurrency::ThreadPool * tp) Line 106 C++
onnxruntime.dll!onnxruntime::GatherND::Compute(onnxruntime::OpKernelContext * context) Line 171 C++
To reproduce
onnxruntime_perf_test.exe -I -r 1 -e cpu gatherNdCrash.onnx
Urgency
Not blocking, but this should be added to ORT's fuzzing test cases, since embedding an ONNX model in another document could crash the user process. As a mitigation, untrusted models can be validated before being passed to the ORT backend (see the sketch below).
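A rough sketch of such a pre-validation step, assuming the application links against the ONNX protobuf definitions and that the data/indices shapes are declared on the graph inputs; the helper names (CollectInputShapes, ValidateGatherNdBatchDims) are hypothetical:

#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>
#include <unordered_map>
#include <vector>
#include "onnx/onnx_pb.h"  // generated ONNX protobuf header; exact path depends on the build

// Collect static shapes of graph inputs (-1 marks a dynamic dimension).
static std::unordered_map<std::string, std::vector<int64_t>>
CollectInputShapes(const onnx::GraphProto& graph) {
  std::unordered_map<std::string, std::vector<int64_t>> shapes;
  for (const auto& vi : graph.input()) {
    std::vector<int64_t> dims;
    for (const auto& d : vi.type().tensor_type().shape().dim()) {
      dims.push_back(d.has_dim_value() ? d.dim_value() : -1);
    }
    shapes[vi.name()] = std::move(dims);
  }
  return shapes;
}

// Reject models whose GatherND nodes have mismatched batch dimensions
// between data (input 0) and indices (input 1).
static bool ValidateGatherNdBatchDims(const onnx::ModelProto& model) {
  const auto shapes = CollectInputShapes(model.graph());
  for (const auto& node : model.graph().node()) {
    if (node.op_type() != "GatherND" || node.input_size() < 2) continue;
    int64_t batch_dims = 0;
    for (const auto& attr : node.attribute()) {
      if (attr.name() == "batch_dims") batch_dims = attr.i();
    }
    const auto data_it = shapes.find(node.input(0));
    const auto idx_it = shapes.find(node.input(1));
    if (data_it == shapes.end() || idx_it == shapes.end()) continue;  // shape not declared
    const auto& data = data_it->second;
    const auto& idx = idx_it->second;
    if (batch_dims > static_cast<int64_t>(data.size()) ||
        batch_dims > static_cast<int64_t>(idx.size())) return false;
    for (int64_t i = 0; i < batch_dims; ++i) {
      if (data[i] > 0 && idx[i] > 0 && data[i] != idx[i]) return false;
    }
  }
  return true;
}

int main() {
  std::ifstream in("gatherNdCrash.onnx", std::ios::binary);
  onnx::ModelProto model;
  if (!model.ParseFromIstream(&in) || !ValidateGatherNdBatchDims(model)) {
    std::fprintf(stderr, "rejecting model before handing it to ORT\n");
    return 1;
  }
  return 0;
}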
Platform
Windows
OS Version
Windows 11
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
e76bd2f5e98dda71b96e93d23ca275ca8a3eec47
ONNX Runtime API
C++
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.
If I have time ⏳, I'll do the minimum fix myself to at least return a bad Status.
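For reference, a sketch of the kind of guard that could go near the top of GatherNDBase::PrepareForCompute (parameter names taken from the stack above); ORT_RETURN_IF_NOT, batch_dims_, and the Status return are assumptions about the current internals, not a tested patch:

// Placed before the slice-size division so a mismatched model yields a
// Status failure instead of a divide-by-zero crash.
const auto& indices_shape = indices_tensor->Shape();
ORT_RETURN_IF_NOT(static_cast<size_t>(batch_dims_) <= input_shape.NumDimensions() &&
                      static_cast<size_t>(batch_dims_) <= indices_shape.NumDimensions(),
                  "batch_dims must not exceed the rank of data or indices");
for (int64_t i = 0; i < batch_dims_; ++i) {
  ORT_RETURN_IF_NOT(input_shape[i] == indices_shape[i],
                    "Batch dimension mismatch between data and indices at axis ", i,
                    ": ", input_shape[i], " vs ", indices_shape[i]);
}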