[compile][rocm]: error: 'llvm.fptrunc' op result #0 must be floating point LLVM type or LLVM dialect-compatible vector of floating point LLVM type, but got 'i8'
What happened?
For the given IR:
module @module {
  util.global private @"__auto.model.layers.31.self_attn.v_proj.qdq_output:rscale" = #stream.parameter.named<"model"::"model.layers.31.self_attn.v_proj.qdq_output:rscale"> : tensor<f16>
  func.func @prefill_bs2(%arg0: !torch.vtensor<[2,?],si64>, %arg1: !torch.vtensor<[2],si64>, %arg2: !torch.vtensor<[2,?],si64>, %arg3: !torch.tensor<[?,4194304],f16>, %arg4: !torch.vtensor<[?,4096],f16>, %arg5: !torch.vtensor<[4096,32000],f16>, %arg6: !torch.vtensor<[2,?],si64>, %arg7: !torch.vtensor<[?,4194304],f16>, %arg8: !torch.vtensor<[2,?],si64>, %arg9: !torch.vtensor<[2,?,4096],f16>) -> !torch.vtensor<[2,?,4096],f16> attributes {torch.assume_strict_symbolic_shapes} {
    %__auto.model.layers.31.self_attn.v_proj.qdq_output3Arscale = util.global.load @"__auto.model.layers.31.self_attn.v_proj.qdq_output:rscale" : tensor<f16>
    %944 = torch_c.from_builtin_tensor %__auto.model.layers.31.self_attn.v_proj.qdq_output3Arscale : tensor<f16> -> !torch.vtensor<[],f16>
    %int24_8273 = torch.constant.int 24
    %9307 = torch.prims.convert_element_type %arg9, %int24_8273 : !torch.vtensor<[2,?,4096],f16>, !torch.int -> !torch.vtensor<[2,?,4096],f8E4M3FN>
    %int5_8274 = torch.constant.int 5
    %9308 = torch.prims.convert_element_type %9307, %int5_8274 : !torch.vtensor<[2,?,4096],f8E4M3FN>, !torch.int -> !torch.vtensor<[2,?,4096],f16>
    %9309 = torch.aten.mul.Tensor %9308, %944 : !torch.vtensor<[2,?,4096],f16>, !torch.vtensor<[],f16> -> !torch.vtensor<[2,?,4096],f16>
    return %9309 : !torch.vtensor<[2,?,4096],f16>
  }
}
I get the following error:
<unknown>:0: error: 'llvm.fptrunc' op result #0 must be floating point LLVM type or LLVM dialect-compatible vector of floating point LLVM type, but got 'i8'
Steps to reproduce your issue
Command to reproduce:
iree-compile --iree-hal-target-backends=rocm --iree-input-demote-i64-to-i32 --iree-hip-target=gfx942 small.mlir
dump.log was captured with the flags "--mlir-print-ir-after-all --mlir-print-ir-before-all --mlir-disable-threading --mlir-elide-elementsattrs-if-larger=4" added to the same command.
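For reference, the full dump invocation would look roughly like this (an assumption on my part: the IR-printing flags are simply appended to the same command, with the diagnostic output redirected to dump.log; the -o /dev/null and the redirection are my additions, not part of the original report):
iree-compile --iree-hal-target-backends=rocm --iree-input-demote-i64-to-i32 --iree-hip-target=gfx942 small.mlir --mlir-print-ir-after-all --mlir-print-ir-before-all --mlir-disable-threading --mlir-elide-elementsattrs-if-larger=4 -o /dev/null 2> dump.log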
What component(s) does this issue relate to?
Compiler
Version information
No response
Additional context
No response
@andfau-amd, could you please take a look?
Okay, I'll try to find time to look at it today. FWIW I doubt this has any relation to my recent fptrunc-related changes, but I'm still happy to look at it.
Oh I am not saying it's related, but as I mentioned, these are bugs being filed as part of general compiler testing that is being ramped up.
@pdhirajkumarprasad there's no support for f8E4M3FN on this backend, but f8E4M3FNUZ is supported. You can just change the type to that and it will compile successfully.
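For example, the offending ops from the IR above would become something like this (an untested sketch; it assumes torch dtype code 26 corresponds to float8_e4m3fnuz, mirroring how 24 corresponds to float8_e4m3fn, so the dtype constant feeding the first convert_element_type changes as well):
    %int26_8273 = torch.constant.int 26
    %9307 = torch.prims.convert_element_type %arg9, %int26_8273 : !torch.vtensor<[2,?,4096],f16>, !torch.int -> !torch.vtensor<[2,?,4096],f8E4M3FNUZ>
    %int5_8274 = torch.constant.int 5
    %9308 = torch.prims.convert_element_type %9307, %int5_8274 : !torch.vtensor<[2,?,4096],f8E4M3FNUZ>, !torch.int -> !torch.vtensor<[2,?,4096],f16>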
@nithinsubbiah this small IR was created from a big model. Do we have no plan to support this? If so, does the user need to modify it manually?
If a compiler target (https://github.com/iree-org/iree/tree/main/compiler/plugins/target) can't handle a data type, it should signal that with a clear error message early in the compilation pipeline.
> @nithinsubbiah this small IR was created from a big model. Do we have no plan to support this? If so, does the user need to modify it manually?
There might be a long-winded path to supporting it, but it isn't easy, and it's probably not what users want to begin with. So it would be better to change the input anyway.
> If a compiler target (https://github.com/iree-org/iree/tree/main/compiler/plugins/target) can't handle a data type, it should signal that with a clear error message early in the compilation pipeline.
Good point. @nithinsubbiah can you add a check before ConvertToROCDL that verifies the data types are supported and emits an "unsupported element type" error otherwise?
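For illustration only, here is a rough sketch of what such an early check could look like using generic upstream MLIR APIs (the function name, the set of rejected types, and how it would be wired into the ROCm target plugin are all assumptions, not IREE's actual implementation):

#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/Operation.h"
#include "mlir/IR/TypeUtilities.h"
#include "mlir/Support/LogicalResult.h"

using namespace mlir;

// Hypothetical pre-ConvertToROCDL check: walk the module and reject any
// op result whose element type the target cannot represent (gfx942 has
// f8E4M3FNUZ but no f8E4M3FN).
static LogicalResult checkSupportedElementTypes(Operation *root) {
  WalkResult status = root->walk([](Operation *op) {
    for (Type resultType : op->getResultTypes()) {
      Type elementType = getElementTypeOrSelf(resultType);
      if (isa<Float8E4M3FNType>(elementType)) {
        op->emitOpError("unsupported element type for this ROCm target: ")
            << elementType;
        return WalkResult::interrupt();
      }
    }
    return WalkResult::advance();
  });
  return failure(status.wasInterrupted());
}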
@nithinsubbiah, is the plan to modify the MLIR with a script for now as a workaround and keep this issue open for a real fix?
> @nithinsubbiah, is the plan to modify the MLIR with a script for now as a workaround and keep this issue open for a real fix?
There is no real fix. We can't support the data type because it is not supported on the hardware. We can error out, but the fix is to fix the frontend.