libtock-rs

Fix relocation

Open Woyten opened this issue 7 years ago • 18 comments

In order to make global variables and dynamic dispatch work, we need to compile binaries conforming to the R_ARM_SBREL32 relocation model.

As far as I understand we need to perform two steps:

  • [x] Migrate the code of tock/userland/libtock/crt0.c to Rust
  • [ ] Pass compiler flags to the LLVM/LLD toolchain equivalent to:
    -msingle-pic-base
    -mpic-register=r9
    -mno-pic-data-is-text-relative
    

Woyten avatar Mar 23 '18 19:03 Woyten

@alevy I played around with the relocation problem the whole weekend but I am completely lost now.

My findings:

Relocations are not emitted by default. They can be emitted via -C link-args=--emit-relocs.

I am not sure whether -C relocation-model=ropi-rwpi represents the correct relocation model due to the following problems I encountered:

  • vtables point to addresses above 0x80000000. Accessing them, obviously, crashes the program. I tested the vtable value using the following code:
    let my_int = 5usize;
    let vtable_location = mem::transmute::<_, (usize, usize)>(&my_int as &MyTrait).1; // 0x00020b64 (ACCESSIBLE)
    let first_vtable_entry = ptr::read_volatile(vtable_location as *const usize);     // 0x80000737 (NOT ACCESSIBLE)
    
  • static muts crash during link time. The following example results in an unrecognized reloc error:
    static mut STATIC_MUT: usize = 0;
    debug::print_as_hex(STATIC_MUT);
    STATIC_MUT = 1;
    debug::print_as_hex(STATIC_MUT);
    
  • There are relocations of different types for the trait objects but none of them is of type R_ARM_SBREL32. I queried the relocations using:
    readelf --relocs -W cortex-m4.elf|rg MyTrait
    
    The printed relocation types are R_ARM_THM_MOVW_PREL_NC, R_ARM_THM_MOVT_PREL and R_ARM_ABS32.
  • The data segment is empty.
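
For anyone who wants to repeat the vtable check above on a host machine, here is a minimal self-contained sketch. It assumes the fat-pointer layout (data pointer first, vtable pointer second), which Rust does not formally guarantee. The `value` method and the `split_trait_object` helper are illustrative names that are not part of libtock-rs, and `[usize; 2]` is used instead of the tuple because array layout is specified.

```rust
use std::mem;

trait MyTrait {
    fn value(&self) -> usize;
}

impl MyTrait for usize {
    fn value(&self) -> usize {
        *self
    }
}

/// Splits a trait-object reference into [data pointer, vtable pointer].
/// Hypothetical helper: it relies on the unspecified fat-pointer layout.
fn split_trait_object(obj: &dyn MyTrait) -> [usize; 2] {
    // SAFETY: both types are exactly two pointer-sized words. The order of
    // the two words is an assumption, not a language guarantee.
    unsafe { mem::transmute::<&dyn MyTrait, [usize; 2]>(obj) }
}

fn main() {
    let my_int = 5usize;
    let [data, vtable] = split_trait_object(&my_int);
    println!("data pointer:   {:#x}", data);
    println!("vtable pointer: {:#x}", vtable);
    // Dynamic dispatch goes through the vtable printed above.
    println!("value via dyn:  {}", (&my_int as &dyn MyTrait).value());
}
```

On a target where relocation is broken, the vtable pointer (or the entries it points to) would land in an inaccessible range such as the 0x8000_0000 addresses reported above.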

If, on the other hand, I build the code using -C relocation-model=pic, I observe the following:

  • vtables still don't work. In fact, they crash a little earlier:
    let my_int = 5usize;
    let vtable_location = mem::transmute::<_, (usize, usize)>(&my_int as &MyTrait).1; // 0x8002002c (NOT ACCESSIBLE)
    
  • static muts can be linked but they point to garbage:
    static mut STATIC_MUT: usize = 99;
    let dereferenced = &STATIC_MUT as *const _ as usize; // 0x8002002c (NOT ACCESSIBLE)
    
  • Relocations are of type R_ARM_ABS32 and R_ARM_REL32. This seems closer to what we want but it's still not R_ARM_SBREL32.
  • There is a data section but I cannot tell whether the content makes sense.

In any case, no matter which relocation model I choose:

  • The GOT is empty. Do we expect elements in it? I guess not as we compile a static binary from scratch.
  • The value of r9 has no effect. I would expect that some relocatable references depend on r9 according to https://github.com/llvm-mirror/lld/commit/29241e38d2d2258badad0226afded382525c1aa4.
  • The reldata part of the _start header is located at 0x80000000 which, again, is not accessible.

Do you think I am on the right track?

Woyten avatar Aug 22 '18 21:08 Woyten

@torfmaster and I were finally able to prove that trait objects can work in Tock OS. See #56 for more details.

Unfortunately, I cannot recommend applying the strategy mentioned in the PR. It has too many drawbacks (like no real position independence) and relies on hacks or details that might cease to be valid in a newer version of rustc.

In order to get libtock-rs binaries running properly we need to fix some external tools. The following strategy should enable the remaining Rust features:

  • [x] Fix trait objects: We think that rustc contains a bug in the ropi relocation model leading to corrupt vtable lookups. Our compiled binaries try to find vtable functions at absolute addresses. This, however, conflicts with the idea of position independent code execution. The most probable reason for the bug is that an offset based on the program counter has been omitted.

  • [x] Fix static muts: static muts can be compiled but not linked. According to https://github.com/llvm-mirror/lld/commit/29241e38d2d2258badad0226afded382525c1aa4, LLD supports R9 based relocation. In practice, it refuses to process the emitted relocation types. This could be a problem with the LLD version used by rustc.

  • [ ] Improve string literal ergonomics: If we want to print a string literal to the console, we need to manually copy it from flash to RAM first (e.g. by using String::from). Otherwise, the allow operation of the kernel will crash because we are not allowed to allow memory on the flash. Possible solutions to the problem:

    1. Elegant: Add a new tock syscall (e.g. allow_ro) with read-only access to the flash. There are other reasons why an allow_ro syscall is a good idea like better borrow checker support.
    2. Difficult: Relocate the string literals from flash to RAM during startup. The copy step is easy. The difficult part is to adapt rustc so that string literals are no longer accessed in flash but in RAM. I think that's what libtock-c is doing. It also requires that the linker problem mentioned above is solved.
    3. Poor: Ignore the problem and enforce owned Strings (needs allocation, we want to opt-out for it) and/or write! (slow).
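
As an illustration of the manual copy step described above, here is a minimal sketch that copies a string literal into a fixed-size stack buffer instead of allocating a String. `RamBuf` and its methods are hypothetical names, not libtock-rs API, and the step of handing the buffer to the kernel's allow operation is only indicated in a comment. The type itself needs no allocator; `std` is used only for printing in `main`.

```rust
/// Fixed-size buffer that lives in RAM (here: on the stack).
/// Hypothetical helper type, not part of libtock-rs.
struct RamBuf<const N: usize> {
    bytes: [u8; N],
    len: usize,
}

impl<const N: usize> RamBuf<N> {
    /// Copies `s` into the buffer. Returns None if `s` does not fit.
    fn copy_from(s: &str) -> Option<Self> {
        if s.len() > N {
            return None;
        }
        let mut bytes = [0u8; N];
        bytes[..s.len()].copy_from_slice(s.as_bytes());
        Some(RamBuf { bytes, len: s.len() })
    }

    /// The copied bytes as a mutable slice that is backed by RAM.
    fn as_mut_bytes(&mut self) -> &mut [u8] {
        &mut self.bytes[..self.len]
    }
}

fn main() {
    // The literal itself would be placed in flash on a Tock board.
    let mut buf = RamBuf::<32>::copy_from("Hello, Tock!").expect("literal fits");
    let shared: &mut [u8] = buf.as_mut_bytes();
    // Assumption: this slice is what would be shared with the console
    // driver through the allow operation.
    println!("{}", std::str::from_utf8(shared).unwrap());
}
```

This avoids the allocator but still costs a copy per literal, which is why a read-only allow (option 1) looks more attractive.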

@alevy What do you think about those problems? We would be happy if someone else could help fixing the rustc and LLD problems. The rustc problem might be interesting for @japaric and the embedded working group as well.

Woyten avatar Oct 06 '18 16:10 Woyten

There are at least two problems getting in the way of rustc support for ROPI-RWPI:

  1. LLVM's ROPI-RWPI implementation does not move .rodata values that are relocated into .data, which prevents the relocations from being implemented on microcontrollers (.rodata is truly RO on flash).
  2. For some reason, inter-crate references that should be using ROPI relocation use RWPI relocation. This issue is rustc specific; I was unable to reproduce it using clang.

In the meantime, static linking of Rust apps appears to be possible (avoiding relocation entirely). I'm putting together a PR to implement static linking. Fortunately, static linking doesn't require any code changes that are incompatible with ROPI-RWPI, although it requires linker script changes.

jrvanwhy avatar Jan 19 '19 00:01 jrvanwhy

Static linking works as of #64; making ROPI-RWPI relocation work correctly is a larger problem that'll take longer to solve.

jrvanwhy avatar Feb 08 '19 22:02 jrvanwhy

Will this be achievable on platforms other than ARM? We may wish to execute embassy on more architectures.

luojia65 avatar May 09 '21 08:05 luojia65

From the perspective of libtock-rs, I think the hope is for this to be achieved on both ARM and RISC-V eventually. Unfortunately we are blocked on upstream support in LLVM for PIC in both cases. Thus, currently it does not work on either architecture -- though libtock-c does support relocatable apps on ARM only, thanks to using gcc rather than LLVM.

I think the latest status of RISC-V ROPI/RWPI can be followed here: https://github.com/riscv/riscv-elf-psabi-doc/issues/128

And the latest status for the issues with ARM thumb targets can be followed here: https://github.com/rust-lang/rust/issues/54431

hudson-ayers avatar May 10 '21 14:05 hudson-ayers

I'm also rather interested in working relocations. Having read the Rust-lang thread, LLVM exchange, and the rust-embedded IRC log, I noted two statements that stand out.

In the LLVM emails, "I don't think such transformation belongs into clang.", regarding initializers, and "for apps you could roll your own in-kernel dynamic linker", from the IRC discussion.

Was it considered to ignore the ROPI/RWPI approach, and instead rely on relocations and fix them up using a linker? That step could even take place in tockloader, while flashing (assuming that applications once flashed aren't going to be moved again).

If there are still problems with relocations not being emitted enough, the actual step of linking object files could be moved to flash-time, with the relevant offsets (or linker files) calculated based on where the app is going to land.
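
To make the flash-time fix-up idea concrete, here is a rough sketch of the simplest case: patching plain absolute 32-bit words (the kind of slot an R_ARM_ABS32 relocation describes) by adding the difference between the link address and the actual load address. The function name, the offset list, and the addresses are all made up for illustration. A real implementation would have to parse the ELF relocation tables and handle other relocation types, such as Thumb MOVW/MOVT pairs, which cannot be patched by simple addition.

```rust
/// Adds `delta` to every little-endian 32-bit word at the given byte offsets.
/// Returns Err(offset) for the first offset that does not fit in the image;
/// words patched before the failing offset stay patched.
fn apply_abs32_relocs(image: &mut [u8], offsets: &[usize], delta: u32) -> Result<(), usize> {
    for &off in offsets {
        let end = off.checked_add(4).ok_or(off)?;
        let word = image.get_mut(off..end).ok_or(off)?;
        let old = u32::from_le_bytes([word[0], word[1], word[2], word[3]]);
        word.copy_from_slice(&old.wrapping_add(delta).to_le_bytes());
    }
    Ok(())
}

fn main() {
    // Toy image linked at base 0 with one pointer (to 0x100) stored at offset 4.
    let mut image = vec![0u8; 8];
    image[4..8].copy_from_slice(&0x0000_0100u32.to_le_bytes());

    // Assumed load address for this example: 0x0004_0000.
    apply_abs32_relocs(&mut image, &[4], 0x0004_0000).expect("offset in range");

    let patched = u32::from_le_bytes([image[4], image[5], image[6], image[7]]);
    println!("patched pointer: {:#010x}", patched);
}
```

Because the patch is applied once before the image is written, it only works if the app is never moved after flashing, as noted above.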

If any of those approaches is not totally crazy, I'm willing to try implementing it - loadable applications are a must for me.

dcz-self avatar Mar 05 '22 14:03 dcz-self

lowRISC is working on an ePIC implementation for RISC-V and hopes to upstream it to LLVM: https://github.com/lowRISC/epic-c-example / https://github.com/lowRISC/llvm-project/commits/epic. Our current hope is that this work will at least make loadable applications possible for RISC-V, though support for ARM may take longer as I do not believe lowRISC currently plans to port this work to other architectures.

hudson-ayers avatar Mar 05 '22 17:03 hudson-ayers

> Was it considered to ignore the ROPI/RWPI approach, and instead rely on relocations and fix them up using a linker? That step could even take place in tockloader, while flashing (assuming that applications once flashed aren't going to be moved again).
>
> If there are still problems with relocations not being emitted enough, the actual step of linking object files could be moved to flash-time, with the relevant offsets (or linker files) calculated based on where the app is going to land.

I'm pretty sure that your idea is workable. It is not a solution that works for every user of libtock-rs, which is why lowRISC is working on ePIC (but as Hudson mentioned, they're primarily focused on RISC-V).

One other solution that the Tock project has looked at (which libtock-c uses) is to compile each process multiple times for different locations, and have tockloader choose which TBF file to deploy on a system based on the addresses it is compiled for.
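
A sketch of what that selection could look like, assuming each build is tagged with the flash and RAM start addresses it was linked for. The `FixedBuild` type, the `pick_build` function, and the addresses are hypothetical. This is not how tockloader is actually implemented, only an illustration of the idea.

```rust
/// One statically linked build of an app (hypothetical description).
#[derive(Debug, PartialEq)]
struct FixedBuild {
    name: &'static str,
    flash_start: u32,
    ram_start: u32,
}

/// Returns the build linked for exactly the given flash and RAM addresses.
fn pick_build<'a>(
    builds: &'a [FixedBuild],
    flash_start: u32,
    ram_start: u32,
) -> Option<&'a FixedBuild> {
    builds
        .iter()
        .find(|b| b.flash_start == flash_start && b.ram_start == ram_start)
}

fn main() {
    // Example addresses are invented for this sketch.
    let builds = [
        FixedBuild { name: "app.0x00030000.0x20004000", flash_start: 0x0003_0000, ram_start: 0x2000_4000 },
        FixedBuild { name: "app.0x00040000.0x20008000", flash_start: 0x0004_0000, ram_start: 0x2000_8000 },
    ];

    match pick_build(&builds, 0x0004_0000, 0x2000_8000) {
        Some(b) => println!("deploying {}", b.name),
        None => println!("no build matches the free slot"),
    }
}
```

The cost of this approach is one compiled binary per supported slot, and installation fails when no build matches the free flash and RAM ranges.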

jrvanwhy avatar Mar 05 '22 18:03 jrvanwhy

Didn't libtock-c use actually position-independent binaries? That's what I gathered from the discussion about libtock-rs.

dcz-self avatar Mar 06 '22 08:03 dcz-self

libtock-c uses actually position-independent binaries for ARM targets, but gcc does not support position-independent binaries for RISC-V.

hudson-ayers avatar Mar 06 '22 18:03 hudson-ayers

Thanks. I just realized that static linking also makes the RAM address fixed, which is rather suboptimal when applications are meant to be able to be loaded in any order. Perhaps some form of PIC with rwdata section relocations at runtime could solve that - if such relocations are supported by the compiler.

dcz-self avatar Mar 06 '22 19:03 dcz-self

> Thanks. I just realized that static linking also makes the RAM address fixed, which is rather suboptimal when applications are meant to be able to be loaded in any order.

Yes -- when I said "compile each process multiple times for different locations", each "location" is a combination of a flash address range and a RAM address range.

> Perhaps some form of PIC with rwdata section relocations at runtime could solve that - if such relocations are supported by the compiler.

I do not think that is possible with any relocation mode that LLVM supports, unfortunately.

jrvanwhy avatar Mar 07 '22 02:03 jrvanwhy

@dcz-self this (rather old) blog post explains a little bit of the complexity with PIC: https://www.tockos.org/blog/2016/dynamic-loading/

libtock-c works because GCC supports the particular kinds of variants of PIC we need, while LLVM doesn't (actually there was a reasonably complete patch from somebody at ARM, I believe, back in the day but it wasn't accepted).

> Perhaps some form of PIC with rwdata section relocations at runtime could solve that - if such relocations are supported by the compiler.

Proposals are very welcome! The main constraints are: (1) code lives in flash, not RAM, and we probably don't want to be rewriting flash on every process reboot (because of write degradation and performance) and (2) the binary size should be reasonably small---all the extra information retained for dynamic loading in, e.g., Linux ELFs results in executables that are typically way too big for the target platforms. But neither of these means there isn't some sweet spot design that is possible.

alevy avatar Mar 18 '22 15:03 alevy

With Rust merged into GCC 13, will that eventually make it possible to resolve this as the implementation matures?

potto216 avatar Dec 20 '22 22:12 potto216

> With Rust merged into GCC 13, will that eventually make it possible to resolve this as the implementation matures?

rustc_codegen_gcc is on track to be usable well before GCC's Rust frontend, so I don't think that changes anything. Either way, GCC only supports the necessary relocation mode on ARM, not RISC-V, so it's not a complete solution.

If ePIC ends up being RISC-V only, we may end up implementing relocation using rustc_codegen_gcc on ARM and ePIC on RISC-V.

jrvanwhy avatar Dec 20 '22 23:12 jrvanwhy