Odin icon indicating copy to clipboard operation
Odin copied to clipboard

Alignment problem on @static_local variable

Open semarie opened this issue 3 years ago • 2 comments

Context

$ odin report
Where to find more information and get into contact when you encounter a bug:

        Website: https://odin-lang.org
        GitHub:  https://github.com/odin-lang/Odin/issues


Useful information to add to a bug report:

        Odin: dev-2022-03:a70dde34
        OS:   OpenBSD 7.1 GENERIC.MP#386 amd64
        CPU:  AMD Athlon 3000G with Radeon Vega Graphics     
        RAM:  30619 MiB

odin is linked with LLVM 13.0.0

Considering the following program, which only allocate some bytes from temp_allocator before exiting:

package test

main :: proc ()
{
    data := make([]byte, 10, context.temp_allocator)
}

When compiled with optimization enabled (-opt:2), the program segfault on exit (inside __$cleanup_runtime).

$ odin build test.odin -out:test -opt:2 -debug
$ gdb ./test
(gdb) run

Program received signal SIGSEGV, Segmentation fault.
runtime.default_temp_allocator_destroy (s=<optimized out>) at /home/semarie/repos/odin/odin/core/runtime/default_temporary_allocator.odin:45
45                      delete(s.leaked_allocations)
(gdb) bt
#0  runtime.default_temp_allocator_destroy (s=<optimized out>) at /home/semarie/repos/odin/odin/core/runtime/default_temporary_allocator.odin:45
#1  __$cleanup_runtime () at /home/semarie/repos/odin/odin/core/runtime/core.odin:459
#2  0x0000078e4a7ffe64 in main (argc=<optimized out>, argv=<optimized out>) at /home/semarie/repos/odin/odin/core/runtime/entry_unix.odin:30

the generated code (with -opt:2) is the following:

   0x0000078e4a7fc975 <+277>:     mov    QWORD PTR [rsp+0x88],rax
   0x0000078e4a7fc97d <+285>:     mov    QWORD PTR [rsp+0x78],r12
   0x0000078e4a7fc982 <+290>:     mov    QWORD PTR [rsp+0x98],0x1e
   0x0000078e4a7fc98e <+302>:     mov    QWORD PTR [rsp+0x80],0x4b
=> 0x0000078e4a7fc99a <+314>:     movaps xmm0,XMMWORD PTR [r15+0x30]
   0x0000078e4a7fc99f <+319>:     movaps XMMWORD PTR [rsp+0x20],xmm0
   0x0000078e4a7fc9a4 <+324>:     mov    r9,QWORD PTR [rsp+0x20]
   0x0000078e4a7fc9a9 <+329>:     movups xmm0,XMMWORD PTR [r15+0x48]
   0x0000078e4a7fc9ae <+334>:     movaps XMMWORD PTR [rsp],xmm0

the segfault occurs on movaps xmm0,XMMWORD PTR [r15+0x30].

The requirements for movaps on XMM are:

  • size : it will read/write 128-bit on memory (16 bytes)
  • alignment: it will require 16-bytes alignment

The values I've seen are:

(gdb) print /x $r15+0x30
$1 = 0xff5f0001b8
(gdb) print /x $r15
$2 = 0xff5f000188

The memory block holding 0xff5f000188 was allocated using malloc(size=111), and returned 0xff5f000180: the valid memory range is [0xff5f000180 - 0xff5f0001ef[.

So $r15+0x30 fits: there is at least 16 bytes to be read (55 bytes exactly). but it is 8-bytes aligned, so movaps requirement isn't meet. It explains the SEGFAULT.

At this point of the code, r15 hold the global_default_temp_allocator_data address, which is @static_local. On OpenBSD, TLS variable are implemented using emutls.

After instrumenting a bit the code, I saw that emutls was explicitly asked to have global_default_temp_allocator_data stored on 8-byte alignment: the code for allocating the memory for holding the variable used emutls_memalign_alloc (align=8, size=96). The alignment and size values are generated by LLVM backend.

At this point, I am unsure how to properly resolv the issue.

From odin code, it seems alignment isn't explicitly set (in the following code, LLVMGetAlignment() is 0 before setting the alignment). If I sets it to type_align_of(e->type), it sets the value to 8, which seems still wrong (emutls_memalign_alloc is still done with align=8), but the program works (I am unsure why).

--- src/llvm_backend.cpp
+++ src/llvm_backend.cpp
@@ -1522,10 +1522,12 @@ void lb_generate_code(lbGenerator *gen) {
 		lbValue g = {};
 		g.value = LLVMAddGlobal(m->mod, lb_type(m, e->type), alloc_cstring(permanent_allocator(), name));
 		g.type = alloc_type_pointer(e->type);
 		if (e->Variable.thread_local_model != "") {
 			LLVMSetThreadLocal(g.value, true);
+			LLVMSetAlignment(g.value, type_align_of(e->type));
+			gb_printf("alignment = %d\n", LLVMGetAlignment(g.value));
 
 			String m = e->Variable.thread_local_model;
 			LLVMThreadLocalMode mode = LLVMGeneralDynamicTLSModel;
 			if (m == "default") {
 				mode = LLVMGeneralDynamicTLSModel;

The diff should also be incomplete as the same code construction could be found at several places (src/llvm_backend_general.cpp and src/llvm_backend_stmt.cpp too).

semarie avatar Mar 01 '22 15:03 semarie

Setting an alignment (using LLVMSetAlignment()) makes llvm to generate a different opcode to access the memory.

Before, it used movaps (Move Aligned Packed Single-Precision Floating-Point Values) and failed because the memory wasn't aligned as expected, and after it is using movups (Move Unaligned Packed Single-Precision Floating-Point Values).

semarie avatar Mar 02 '22 08:03 semarie

Hello!

I am marking this issue as stale as it has not received any engagement from the community or maintainers 120 days. That does not imply that the issue has no merit! If you feel strongly about this issue

  • open a PR referencing and resolving the issue;
  • leave a comment on it and discuss ideas how you could contribute towards resolving it;
  • leave a comment and describe in detail why this issue is critical for your use case;
  • open a new issue with updated details and a plan on resolving the issue.

The motivation for this automation is to help prioritize issues in the backlog and not ignore, reject, or belittle anyone..

github-actions[bot] avatar Jul 24 '22 22:07 github-actions[bot]