iree [numeric][cpu]: numeric error for ONNX Gather operator element at index 200 (0.420379) does not match the expected (0.642927);

What happened?

For the given IR

module {
  func.func @main(%arg0: !torch.vtensor<[10,200],f32>, %arg1: !torch.vtensor<[35,1],si64>) -> !torch.vtensor<[35,1,200],f32> attributes {torch.onnx_meta.ir_version = 10 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.producer_name = "", torch.onnx_meta.producer_version = ""} {
    %none = torch.constant.none
    %0 = torch.operator "onnx.Gather"(%arg0, %arg1) : (!torch.vtensor<[10,200],f32>, !torch.vtensor<[35,1],si64>) -> !torch.vtensor<[35,1,200],f32> 
    return %0 : !torch.vtensor<[35,1,200],f32>
  }
}

we are seeing a numeric mismatch.
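For reference, the op in the IR is a Gather with no axis attribute, which in ONNX means axis 0: each index selects one row of the [10,200] input, so [35,1] indices produce a [35,1,200] result. The following is a minimal pure-Python sketch of those semantics, not IREE code. The function name `gather_axis0` and the small stand-in data are hypothetical and only illustrate the shape rule.

```python
def gather_axis0(data, indices):
    """Reference Gather along axis 0 for 2-D `data` and 2-D `indices`.

    The output shape is indices.shape + data.shape[1:], so [35,1] indices
    into [10,200] data give [35,1,200]. Negative indices count from the end.
    """
    n = len(data)
    out = []
    for row in indices:
        out_row = []
        for idx in row:
            if not -n <= idx < n:
                raise IndexError(f"index {idx} out of range for axis of size {n}")
            # Copy the selected row so the result does not alias the input.
            out_row.append(list(data[idx % n]))
        out.append(out_row)
    return out


# Hypothetical stand-in for the 10x200 input: 4 rows of 3 values.
data = [[float(r * 10 + c) for c in range(3)] for r in range(4)]
indices = [[2], [0], [-1]]  # shape [3,1]
result = gather_axis0(data, indices)
print(result)
```

With 200 values per gathered row, flat output index 200 (the one named in the title) is the first element of the second gathered row, i.e. the row chosen by the second index.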

IREE version: IREE compiler version 20240819.990 @ aeda14995f16ed1302db616adf0c03acf80f27ee LLVM version 20.0.0git

Steps to reproduce your issue

Command to reproduce the issue:

iree-compile model.torch_onnx.mlir --iree-hal-target-backends=llvm-cpu -o out.vmfb --iree-input-demote-i64-to-i32
iree-run-module --module=out.vmfb --device="local-task" --input="[email protected]" --input="[email protected]"  --expected_output="35x1x200xf32=@golden_output.0.bin"

This issue is caused by the presence of --iree-input-demote-i64-to-i32. If I remove the flag, I get a 100% match.

golden_output.0.bin.txt input.0.bin.txt input.1.bin.txt

What component(s) does this issue relate to?

Runtime

Version information

No response

Additional context

No response

Aug 19 '24 06:08 pdhirajkumarprasad

There might be a codegen issue here, but the fact that not dropping to i32 makes the error go away seems to suggest this is not a codegen issue. In general it is not "safe" to truncate fully like this, but it can be done if we know the inputs are within the 32-bit range.
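One way to check the "within the 32-bit range" condition is to decode the index buffer and test every value against the signed 32-bit limits. This is a rough sketch and rests on assumptions: the indices are stored as raw little-endian signed 64-bit integers, and the helper names `read_i64_values` and `fits_in_i32` are hypothetical. The packed bytes below are a stand-in; for the real case the bytes would come from the index input file.

```python
import struct

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def read_i64_values(raw: bytes):
    """Decode a raw little-endian buffer of signed 64-bit integers."""
    count, remainder = divmod(len(raw), 8)
    if remainder:
        raise ValueError("buffer length is not a multiple of 8 bytes")
    return list(struct.unpack(f"<{count}q", raw))


def fits_in_i32(values):
    """Return True if every value survives truncation to signed 32-bit."""
    return all(INT32_MIN <= v <= INT32_MAX for v in values)


# Stand-in data (assumption): four small indices, as Gather would use here.
raw = struct.pack("<4q", 0, 9, -1, 3)
values = read_i64_values(raw)
print(values, fits_in_i32(values))
```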

@lialan start by seeing if there is any IR difference between the 32-bit and 64-bit compilation paths for this example.

Aug 20 '24 17:08 MaheshRavishankar

Both paths generate LLVM IR with the same structure, with one exception. In the demote path, the generated IR contains a load of i32:

%60 = llvm.getelementptr %31[%59] : (!llvm.ptr, i64) -> !llvm.ptr, i32

while in the normal path:

%59 = llvm.getelementptr %30[%58] : (!llvm.ptr, i64) -> !llvm.ptr, i64

In both paths, %31/%30 come directly from the ABI:

    %29 = llvm.extractvalue %28[10] : !llvm.struct<"iree_hal_executable_dispatch_state_v0_t", (i32, i32, i16, i16, i32, i32, i16, i8, i8, ptr, ptr, ptr)>
    %30 = llvm.getelementptr %29[1] : (!llvm.ptr) -> !llvm.ptr, !llvm.ptr
    %31 = llvm.load %30 : !llvm.ptr -> !llvm.ptr

vs

    %28 = llvm.extractvalue %27[10] : !llvm.struct<"iree_hal_executable_dispatch_state_v0_t", (i32, i32, i16, i16, i32, i32, i16, i8, i8, ptr, ptr, ptr)>
    %29 = llvm.getelementptr %28[1] : (!llvm.ptr) -> !llvm.ptr, !llvm.ptr
    %30 = llvm.load %29 : !llvm.ptr -> !llvm.ptr

I suspect this is what causes the issue.
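To make the suspicion concrete: if the buffer handed over through the ABI still holds 64-bit indices while the compiled code reads it with a 32-bit element type, every second value read is the high word of the previous index. The sketch below only illustrates that reinterpretation. It assumes little-endian storage and that the input buffer really is still i64, neither of which is confirmed in this thread. The name `reinterpret_i64_as_i32` is hypothetical.

```python
import struct


def reinterpret_i64_as_i32(values, count):
    """Pack `values` as little-endian i64, then read the first `count` i32s."""
    raw = struct.pack(f"<{len(values)}q", *values)
    return list(struct.unpack_from(f"<{count}i", raw))


# Stand-in indices (assumption); small non-negative values as Gather would use.
indices_i64 = [7, 3, 9, 1]
seen = reinterpret_i64_as_i32(indices_i64, len(indices_i64))
print(seen)  # [7, 0, 3, 0]
```

Under these assumptions the first index read is still correct and the second one becomes 0, which would be consistent with the first gathered row matching and the mismatch starting at flat output index 200.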

Aug 21 '24 01:08 lialan

Can you post the two IRs?

Aug 21 '24 03:08 MaheshRavishankar

Attaching dumped IR files.

This one enables --iree-input-demote-i64-to-i32, which causes the value mismatch: demote_to_i32.mlir.txt

This one does not have the option and works fine: no_demote_to_i32.mlir.txt

Aug 22 '24 03:08 lialan

Looking at the IR, it doesn't look like a compilation failure. I don't know what input values you are passing in here. You should make sure that it is safe to demote from i64 to i32. At this point, I don't really see a codegen issue.

Aug 22 '24 16:08 MaheshRavishankar

@pdhirajkumarprasad up signalling. This doesn't look like a compiler error. Could we close it?

Aug 28 '24 06:08 MaheshRavishankar

Model works fine without the flag

Sep 04 '24 04:09 pdhirajkumarprasad