iree icon indicating copy to clipboard operation
iree copied to clipboard

rng_uniform lowering to vmvx has undefined behavior

Open GMNGeoffrey opened this issue 2 years ago • 6 comments

When running tests/e2e/xla_ops:check_vmvx_local-task_rng_uniform.mlir with ubsan internally, I'm hitting:

UndefinedBehaviorSanitizer: signed-integer-overflow runtime/src/iree/vm/ops.h:47

@kuhar, who I believe is looking into ubsan errors. That's the VM mul:

static inline int32_t vm_mul_i32(int32_t lhs, int32_t rhs) { return lhs * rhs; }

Seems like the lowering to vmvx for RNG uniform is not taking into account potential overflow

GMNGeoffrey avatar Aug 22 '22 18:08 GMNGeoffrey

I guess the question is whether we consider the behavior defined on the level of ops.h. If yes, we'd have to update almost all functions in ops.h; if not, the lowering needs to operate on wider types or insert some checks .

When running tests/e2e/xla_ops:check_vmvx_local-task_rng_uniform.mlir with ubsan internally, I'm hitting:

Can we reproduce this with CMake? I mean just the generated code that ends up calling vm_mul_i32, not the runtime ubsan warning.

kuhar avatar Aug 22 '22 19:08 kuhar

yeah, this is something that'd have to be fixed in the user program - we're seeing the warning at runtime pointing at the vm ops but the issue is in the user code we are running and not the op itself (kind of like how in a normal C program ubsan would point at the line of C and the user would fix that, not the x86 instruction produced by their compiler). aka garbage-in garbage-out :P

benvanik avatar Aug 22 '22 19:08 benvanik

The "user program" here is just a test calling mhlo.rng: https://github.com/iree-org/iree/blob/main/tests/e2e/xla_ops/rng_uniform.mlir. I don't see anything there that I would expect to cause an overflow.

Can we reproduce this with CMake? I mean just the generated code that ends up calling vm_mul_i32, not the runtime ubsan warning.

Yes this test runs with CMake in OSS also

GMNGeoffrey avatar Aug 22 '22 20:08 GMNGeoffrey

Unless you mean by "user program" you mean the generated code, in which case, yes, that's why I tagged it as being a vmvx codegen bug :-)

GMNGeoffrey avatar Aug 22 '22 20:08 GMNGeoffrey

I'm not sure it's vmvx-specific, just that vmvx is the only place we can see this happen with ubsan - if it's a multiply that needs a full 64-bit result and not being lowered to that then I'd suspect that's the case on all backends.

benvanik avatar Aug 22 '22 20:08 benvanik

LLVM ops have specified behavior with signless integer multiplication, MHLO doesn't specify and neither does arith (today). But MHLO ops get lowered to arith and then VMVX where UBSAN flags it.

We see this too in tosa_ops:check_vmvx_local-sync_microkernels_arithmetic_right_shift.mlir

We could have VMVX behave same as LLVM for these signless ops, look at changing the lowering, flag these functions to not check UBSAN.

jpienaar avatar Sep 19 '22 20:09 jpienaar

I think I hit this again: https://github.com/iree-org/iree/issues/11014. I'm a bit confused because it looks like the issue in the arithmetic shift is just that we're casting uint32_t to int32_t and expecting the result to be positive: https://github.com/iree-org/iree/blob/ab6ba22d2d530a989c1b2c0ee1a1437fd5053148/runtime/src/iree/builtins/ukernel/elementwise_impl.c.inc#L61

GMNGeoffrey avatar Nov 02 '22 21:11 GMNGeoffrey