wasmtime icon indicating copy to clipboard operation
wasmtime copied to clipboard

cranelift: Add COFF TLS Support

Open afonso360 opened this issue 3 years ago • 5 comments

👋 Hey,

This PR is an initial draft at adding support for resolving TLS variables in COFF file formats (Windows). This is the current main blocker on having cg_clif working on windows.

There are still some pending issues in this PR:

  • Do we need to mark the gs register as used?
  • Declaring the _tls_index symbol. No idea if it is correct or not.

It is my understanding (mostly based on #1885) that we still need to add COFF TLS support to the object crate for this to become functional.

I'm going to look into modifying the object crate to see how easy/hard its going to be to add support for this, but would really appreciate any help.


Based on my tests compiling cranelift/filetests/filetests/isa/x64/tls_coff.clif we emit pretty much the equivalent machine code and relocations as rustc does.

Disassembly of rustc output:
0000000000000000 <getaddr>:
     0: 8b 05 00 00 00 00            	movl	(%rip), %eax  # 6 <getaddr+0x6>
		0000000000000002:  IMAGE_REL_AMD64_REL32	_tls_index
     6: 65 48 8b 0c 25 58 00 00 00   	movq	%gs:88, %rcx
     f: 48 8b 04 c1                  	movq	(%rcx,%rax,8), %rax
    13: 48 8d 80 00 00 00 00         	leaq	(%rax), %rax
		0000000000000016:  IMAGE_REL_AMD64_SECREL	tls
    1a: c3                           	retq
Disassembly of tls_coff.clif output:
0000000000000000 <u0:0>:
     0: 55                            pushq   %rbp
     1: 48 89 e5                      movq    %rsp, %rbp
     4: 8b 05 00 00 00 00             movl    (%rip), %eax  # a <u0:0+0xa>
              0000000000000006:  IMAGE_REL_AMD64_REL32        _tls_index
     a: 65 48 8b 0c 25 58 00 00 00    movq    %gs:88, %rcx
    13: 48 8b 04 c1                   movq    (%rcx,%rax,8), %rax
    17: 48 8d 80 00 00 00 00          leaq    (%rax), %rax
              000000000000001a:  IMAGE_REL_AMD64_SECREL       .Ldata0
    1e: 48 89 ec                      movq    %rbp, %rsp
    21: 5d                            popq    %rbp
    22: c3                            retq

I think the only difference there is the target of the relocation, because in the tls_coff example I didn't name the symbol.

cc: @bjorn3 @cfallin cc: #1885

afonso360 avatar Jul 28 '22 14:07 afonso360

Do we need to mark the gs register as used?

The gs register is not allocatable by regalloc, so it doesn't need to be marked as usable.

bjorn3 avatar Jul 28 '22 14:07 bjorn3

Thanks a lot for working on this!

bjorn3 avatar Jul 28 '22 14:07 bjorn3

#![feature(thread_local)]                                                                                                                          1 ↵ bjorn@laptopbjorn-lenovo

use std::cell::Cell;

#[thread_local]
pub static FOO: Cell<u8> = Cell::new(42);
pub static mut BAR: u8 = 43;  
pub fn get_foo() -> u8 {
    FOO.get()
}

pub fn set_foo(foo: u8) {
    FOO.set(foo)
}' | rustc +nightly --target x86_64-pc-windows-msvc --emit obj - --crate-type lib -O

produces an object file for which objdump -dr gives:

rust_out.o:     file format pe-x86-64


Disassembly of section .text:

0000000000000000 <_ZN8rust_out7get_foo17h9e4ed845564f4123E>:
   0:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 6 <_ZN8rust_out7get_foo17h9e4ed845564f4123E+0x6>
                        2: R_X86_64_PC32        _tls_index
   6:   65 48 8b 0c 25 58 00    mov    %gs:0x58,%rcx
   d:   00 00 
   f:   48 8b 04 c1             mov    (%rcx,%rax,8),%rax
  13:   8a 80 00 00 00 00       mov    0x0(%rax),%al
                        15: secrel32    _ZN8rust_out3FOO17he424d0489633cb80E
  19:   c3                      retq   

Disassembly of section .text:

0000000000000000 <_ZN8rust_out7set_foo17hb5fb9735547d2b58E>:
   0:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 6 <_ZN8rust_out7set_foo17hb5fb9735547d2b58E+0x6>
                        2: R_X86_64_PC32        _tls_index
   6:   65 48 8b 14 25 58 00    mov    %gs:0x58,%rdx
   d:   00 00 
   f:   48 8b 04 c2             mov    (%rdx,%rax,8),%rax
  13:   88 88 00 00 00 00       mov    %cl,0x0(%rax)
                        15: secrel32    _ZN8rust_out3FOO17he424d0489633cb80E
  19:   c3                      retq   

and objdump --syms:

rust_out.o:     file format pe-x86-64

SYMBOL TABLE:
[  0](sec  1)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .text
AUX scnlen 0x0 nreloc 0 nlnno 0 checksum 0x0 assoc 1 comdat 0
[  2](sec  2)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .data
AUX scnlen 0x0 nreloc 0 nlnno 0 checksum 0x0 assoc 2 comdat 0
[  4](sec  3)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .bss
AUX scnlen 0x0 nreloc 0 nlnno 0 checksum 0x0 assoc 3 comdat 0
[  6](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .text
AUX scnlen 0x1a nreloc 2 nlnno 0 checksum 0xa84d5eee assoc 4 comdat 1
[  8](sec  4)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x0000000000000000 _ZN8rust_out7get_foo17h9e4ed845564f4123E
[  9](sec  5)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .text
AUX scnlen 0x1a nreloc 2 nlnno 0 checksum 0xac4f1caa assoc 5 comdat 1
[ 11](sec  5)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x0000000000000000 _ZN8rust_out7set_foo17hb5fb9735547d2b58E
[ 12](sec  6)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .tls$
AUX scnlen 0x1 nreloc 0 nlnno 0 checksum 0xdbbbc9d6 assoc 6 comdat 1
[ 14](sec  6)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 _ZN8rust_out3FOO17he424d0489633cb80E
[ 15](sec  7)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .data
AUX scnlen 0x1 nreloc 0 nlnno 0 checksum 0xacbcf940 assoc 7 comdat 1
[ 17](sec  7)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 _ZN8rust_out3BAR17hab28e85eef212a9aE
[ 18](sec  8)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .data
AUX scnlen 0x8 nreloc 1 nlnno 0 checksum 0x0 assoc 8 comdat 1
[ 20](sec  8)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 __imp__ZN8rust_out3FOO17he424d0489633cb80E
[ 21](sec  9)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x0000000000000000 .data
AUX scnlen 0x8 nreloc 1 nlnno 0 checksum 0x0 assoc 9 comdat 1
[ 23](sec  9)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 __imp__ZN8rust_out3BAR17hab28e85eef212a9aE
[ 24](sec -1)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x0000000000000000 @feat.00
[ 25](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x0000000000000000 _tls_index
[ 26](sec -2)(fl 0x00)(ty   0)(scl 103) (nx 2) 0x0000000000000000 rust_out.cb515bd7-
File 
File 

bjorn3 avatar Jul 28 '22 14:07 bjorn3

I managed to run a small tls example on windows! :tada:

#![feature(thread_local)]
use std::cell::Cell;
#[thread_local]
static FOO: Cell<u8> = Cell::new(42);
fn main() {
    assert_eq!(FOO.get(), 42);
    std::thread::spawn(|| {
        FOO.set(43);
        assert_eq!(FOO.get(), 43);
    }).join();
    assert_eq!(FOO.get(), 42);
}

bjorn3 avatar Jul 28 '22 17:07 bjorn3

I managed to run a small tls example on windows! 🎉

That's really cool! I'm going to try and compile cg_clif with this, to see where it breaks next. I suspect we don't have a very complete TLS implementation.

afonso360 avatar Jul 28 '22 17:07 afonso360

Rebased this on top of main. I've also renamed the TlsIndex symbol to CoffTlsIndex.

With this rebase we've also lost the lookup function. But we can add it later when we need it in the JIT, is this okay @bjorn3 ?

afonso360 avatar Aug 10 '22 20:08 afonso360

Fine with me.

bjorn3 avatar Aug 10 '22 20:08 bjorn3

Do you both feel this is ready for merge, then? I've glanced over it and it looks reasonable to me, but I'm not at all familiar with how thread-local storage works on any platform.

jameysharp avatar Aug 11 '22 00:08 jameysharp

Yes, this should be ready to merge!

afonso360 avatar Aug 11 '22 05:08 afonso360