alive2 icon indicating copy to clipboard operation
alive2 copied to clipboard

Non-byte-sized (e.g. i1) global constants are broken

Open nikic opened this issue 3 years ago • 8 comments

https://alive2.llvm.org/ce/z/J7_Uvt

This says:

Mismatch in pointer(non-local, block_id=0, offset=0)
Source value: #x01
Target value: #x01

Which is the same value ...

I guess what it tries to tell us is that the high bits are unspecified, but it's not really clear from the error.

nikic avatar Apr 08 '22 14:04 nikic

Thanks! The bug is more profound, it's not the error message. It's the non-byte-sized global constant: @g_i1 = constant i1 true. Since padding is non-deterministic, the result of the 2 stores to initialize the global is different. It's probably a bug in the formula quantifiers.

nunoplopes avatar Apr 08 '22 14:04 nunoplopes

see https://github.com/llvm/llvm-project/commit/659871cede9e3475c5de986ba4cace58e70f4801

nunoplopes avatar Apr 14 '22 17:04 nunoplopes

I'm struggling with what's the right semantics @nikic.

The load and the stores have different bitwidth. Langref says:

When loading a value of a type like i20 with a size that is not an integral number of bytes, the result is undefined if the value was not originally written using a store of the same type.

What does exactly mean? Are padding bits undef or poison? We certainly don't want it to be UB.

What's happening right now is that we force the padding of source to be == -1 so the return value check works. But then we don't have the same constraint for target, so padding can be e.g. 0 and the whole thing fails.

Is it ok to define padding bits as poison, like we do for aggregates? That way, a load i8 of a store i1 would be poison.

nunoplopes avatar Apr 16 '22 13:04 nunoplopes

@nunoplopes Yeah, I'm also not sure what the intended semantics here are. We probably need a LangRef update to clarify.

I think the practical constraints here are that we want to make "store i1" padding with zero bits legal, and "store of loaded value elision" legal (meaning that bits remain unchanged). For the latter, I think it's necessary to define the padding bits as poison.

If the bits are poison, then https://github.com/llvm/llvm-project/commit/930a68765dff96927d706d258ef0c2ad9c7ec2ab could also be reverted.

nikic avatar Apr 16 '22 13:04 nikic

I've implemented the padding-as-poison semantics, and at least one the Alive2 unit tests, we have only 1 failure:

; tests/alive-tv/bugs/pr41949.srctgt.ll

define i32 @src(* %p) {
  %u = alloca i64 4, align 4
  %a = bitcast * %u to *
  %b = bitcast * %u to *
  store i32 4294967295, * %a, align 4
  store i12 20, * %b, align 2
  %v = load i32, * %u, align 4
  ret i32 %v
}
=>
define i32 @tgt(* %p) {
  %u = alloca i64 4, align 4
  %a = bitcast * %u to *
  store i32 22020095, * %a, align 4
  %v = load i32, * %u, align 4
  ret i32 %v
}

Removing a smaller store before a wider load becomes valid as it would make the load poison anyway. Seems harmless. Moreover, since this stuff has been broken multiple times, I guess it's easier for us to define whatever semantics we want.

I will test the patch over LLVM's test suite next. Unfortunately Microsoft already realized I don't work there anymore 😅 and they scrapped my VM. I need to find a new server..

nunoplopes avatar Apr 16 '22 14:04 nunoplopes

Tests finished. No regressions in LLVM's test suite. The only difference is your test that gets fixed.

I think we should be good to go. Any padding will now be defined as poison, which makes things consistent with aggregates. @nikic can you lead this change on LLVM's side? I'm a bit short on time till the end of the semester. Thank you!

nunoplopes avatar Apr 16 '22 18:04 nunoplopes

Thanks for testing! And yeah, I can put up a LangRef patch sometime next week.

nikic avatar Apr 16 '22 19:04 nikic

for reference: https://reviews.llvm.org/D123991

nunoplopes avatar Sep 03 '22 13:09 nunoplopes