c3c icon indicating copy to clipboard operation
c3c copied to clipboard

Wrong array offset for structs containing overaligned structs

Open cbuttner opened this issue 1 year ago • 7 comments

This is bad...

module test;
import std::io;

struct Test @align(16) {
  void* foo;
}

struct Test2 {
  Test test;
  uint a;
}

fn int main(String[] args) {
  Test2* array;
  io::printfn("Size of Test2 is %s???", (uptr)&array[1] - (uptr)array);
  assert((uptr)&array[1] == Test2.sizeof);
  return 0;
}

cbuttner avatar Jul 02 '24 10:07 cbuttner

I found another issue with unions as well, not quite the same but related. @align is a bit under-tested. I added some test cases, but I feel there might be some surprises still lurked in the intersection between @packed and @align. Inspecting the LLVM output here to see the actual layout given to LLVM is essential to ensure that the behaviour is correct.

lerno avatar Jul 02 '24 13:07 lerno

The size and layout of the struct was not wrong in my example. It was the array offset specifically.

cbuttner avatar Jul 02 '24 18:07 cbuttner

Well, yes that can happen too.

lerno avatar Jul 02 '24 21:07 lerno

Why can that happen, why isn't the struct size used directly?

Also your test doesn't check for the array case.

cbuttner avatar Jul 03 '24 11:07 cbuttner

So the reason this happens is that these are the two types generated:

%Test = type { ptr, [8 x i8] }
%Test2 = type { %Test, i32 }

In this case, the natural alignment of %Test is 8. The way align works here is not to change the alignment of %Test – that's not possible with LLVM, but rather the lowering has to ensure that %Test is always aligned to 16. In addition to this, there is the additional filler [8 x i8] to ensure the correct size (this could have been i64 too, but it's just easier to do arbitrary padding this way).

Now next up, %Test is embedded in %Test2. On the frontend, this makes %Test2 16 byte aligned. However, it was looking at the alignment of Test (16!) and assumed that the natural LLVM alignment of %Test is 16, so no padding would be needed to fill out the end. However, as previously mentioned the LLVM %Test only has its natural alignment (8) and so the implicit padding of LLVM is only to add 4 bytes extra (to ensure 8 alignment), and not 12 bytes. So basically this caused the frontend to miss adding necessary padding to the LLVM type, and the ACTUAL size of the LLVM type was smaller than expected, which caused the issue with the offset when using LLVM pointer offset calculations. In some places, the LLVM size of the type doesn't even enter the equation, so things are correct by accident.

Hope that explains things.

lerno avatar Jul 03 '24 13:07 lerno

Oh, and I added an additional unit test for it to test the runtime size. If you want to add more tests (those that pass are fine, they are important for regression) in test/unit/regression/struct_alignment.c3 you're very welcome to do so.

lerno avatar Jul 03 '24 13:07 lerno

Oh, and interestingly LLVM types pointer offset calculation might get phased out of LLVM, in which case the missing padding at the end might have gone undetected! 😄

lerno avatar Jul 03 '24 13:07 lerno

Thanks for the explanation, that makes sense to me.

The issue looks fixed in my codebase too.

cbuttner avatar Jul 05 '24 13:07 cbuttner