[numeric][cpu]: numeric error for ONNX Gather operator element at index 200 (0.420379) does not match the expected (0.642927);
What happened?
For the given IR
module {
func.func @main(%arg0: !torch.vtensor<[10,200],f32>, %arg1: !torch.vtensor<[35,1],si64>) -> !torch.vtensor<[35,1,200],f32> attributes {torch.onnx_meta.ir_version = 10 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.producer_name = "", torch.onnx_meta.producer_version = ""} {
%none = torch.constant.none
%0 = torch.operator "onnx.Gather"(%arg0, %arg1) : (!torch.vtensor<[10,200],f32>, !torch.vtensor<[35,1],si64>) -> !torch.vtensor<[35,1,200],f32>
return %0 : !torch.vtensor<[35,1,200],f32>
}
}
We are seeing a numeric mismatch.
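For reference, a minimal pure-Python sketch of what onnx.Gather computes here with its default axis of 0, and of where flat element 200 of the 35x1x200 output lands. The helper names `gather_axis0` and `unflatten` are made up for illustration and are not part of IREE or ONNX.

```python
def gather_axis0(data, indices):
    """Gather rows of `data` (shape [R][C]) using `indices` (shape [N][1]).

    The result has shape [N][1][C], matching indices.shape + data.shape[1:].
    """
    return [[list(data[i]) for i in row] for row in indices]


def unflatten(flat_index, shape):
    """Convert a row-major flat index into per-dimension coordinates."""
    coords = []
    for dim in reversed(shape):
        coords.append(flat_index % dim)
        flat_index //= dim
    return tuple(reversed(coords))


# Flat element 200 of the 35x1x200 output is the first value of the
# second gathered row, so it depends only on the second index.
print(unflatten(200, (35, 1, 200)))  # -> (1, 0, 0)
```

So the first 200 output values (the row selected by the first index) match, and the first reported mismatch is in the row selected by the second index.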
IREE version: IREE compiler version 20240819.990 @ aeda14995f16ed1302db616adf0c03acf80f27ee LLVM version 20.0.0git
Steps to reproduce your issue
Command to reproduce the issue:
iree-compile model.torch_onnx.mlir --iree-hal-target-backends=llvm-cpu -o out.vmfb --iree-input-demote-i64-to-i32
iree-run-module --module=out.vmfb --device="local-task" --input="[email protected]" --input="[email protected]" --expected_output="35x1x200xf32=@golden_output.0.bin"
This issue is caused by the presence of --iree-input-demote-i64-to-i32. If I remove this flag, the output matches the golden output 100%.
golden_output.0.bin.txt, input.0.bin.txt, input.1.bin.txt
What component(s) does this issue relate to?
Runtime
Version information
No response
Additional context
No response
There might be a codegen issue here, but the fact that not dropping to i32 makes the error go away suggests this is not a codegen issue. In general it is not "safe" to truncate fully like this, but it can be done if we know the inputs are within the 32-bit range.
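As a rough sketch of what "within the 32-bit range" means, the check below tests whether a list of 64-bit index values survives truncation to i32 unchanged. The helper names are hypothetical; this is plain Python and not an IREE API.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def fits_in_int32(values):
    """Return True if every value is representable as a signed 32-bit integer."""
    return all(INT32_MIN <= v <= INT32_MAX for v in values)


def truncate_to_int32(value):
    """Keep only the low 32 bits, interpreted as two's-complement i32."""
    low = value & 0xFFFFFFFF
    return low - 2**32 if low >= 2**31 else low


# Small Gather indices are unchanged by truncation.
print(fits_in_int32([0, 9, -1]))     # -> True
# A value outside the i32 range silently changes when truncated.
print(truncate_to_int32(2**32 + 5))  # -> 5
```

For this model the indices address a tensor with only 10 rows, so valid index values are far inside the i32 range.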
@lialan start by seeing if there is any IR difference between the 32-bit and 64-bit compilation paths for this example.
Both paths generate LLVM IR with the same structure, with one difference. In the demote path, the generated IR contains a load of i32:
%60 = llvm.getelementptr %31[%59] : (!llvm.ptr, i64) -> !llvm.ptr, i32
while in the normal path:
%59 = llvm.getelementptr %30[%58] : (!llvm.ptr, i64) -> !llvm.ptr, i64
In both paths, %31/%30 come directly from the ABI:
%29 = llvm.extractvalue %28[10] : !llvm.struct<"iree_hal_executable_dispatch_state_v0_t", (i32, i32, i16, i16, i32, i32, i16, i8, i8, ptr, ptr, ptr)>
%30 = llvm.getelementptr %29[1] : (!llvm.ptr) -> !llvm.ptr, !llvm.ptr
%31 = llvm.load %30 : !llvm.ptr -> !llvm.ptr
vs
%28 = llvm.extractvalue %27[10] : !llvm.struct<"iree_hal_executable_dispatch_state_v0_t", (i32, i32, i16, i16, i32, i32, i16, i8, i8, ptr, ptr, ptr)>
%29 = llvm.getelementptr %28[1] : (!llvm.ptr) -> !llvm.ptr, !llvm.ptr
%30 = llvm.load %29 : !llvm.ptr -> !llvm.ptr
I suspect this is what causes the issue.
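To illustrate one way an i32 load could go wrong, here is a sketch that assumes the index buffer handed to the dispatch still contains little-endian 64-bit values while the generated code reads 32-bit elements. That assumption is not confirmed in this thread, and `reinterpret_i64_as_i32` is a made-up helper.

```python
import struct


def reinterpret_i64_as_i32(indices, count):
    """Pack `indices` as little-endian i64, then read back `count` i32 values."""
    raw = struct.pack(f"<{len(indices)}q", *indices)
    return list(struct.unpack_from(f"<{count}i", raw))


# Each i64 occupies two i32 slots: the low half, then the high half.
# For small non-negative indices the high half is 0.
print(reinterpret_i64_as_i32([7, 3, 9], 3))  # -> [7, 0, 3]
```

Under that assumption the first index would still be read correctly and the second would be read as 0. That would be consistent with the first mismatch appearing at flat element 200, the start of the second gathered row of the 35x1x200 output.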
Can you post the two IRs?
Attaching dumped IR files.
This one enables --iree-input-demote-i64-to-i32, which produces the mismatched values:
demote_to_i32.mlir.txt
This one does not have the option and works fine: no_demote_to_i32.mlir.txt
Looking at the IR, it doesn't look like a compilation failure. I don't know what input values you are sending in here. You should make sure that it is safe to demote them from i64 to i32. At this point, I don't really see a codegen issue.
@pdhirajkumarprasad up signalling. This doesn't look like a compiler error. Could we close it?
The model works fine without the flag.