
Support linking .so with LLD

Open iximeow opened this issue 4 years ago • 0 comments

It turns out trying to use lld as an alternate linker causes some issues for Lucet. One is possibly an lld bug. The other is, I think, somewhere in our dependencies.

  • [ ] Link .eh_frame in a non-crashy way.
  • [ ] lucet-objdump should be able to process lucetc artifacts linked by lld.

.eh_frame

We do .eh_frame kind of strangely: typically function addresses would be 4-byte PC-relative relocations resolved at link time, so .eh_frame doesn't need further relocations at load time. We currently rely on the function relocation interface through cranelift-object, implemented through write_function_addr, which results in an address-size absolute relocation to the function instead. Since the binary is PIC, these addresses are shifted by whatever offset ASLR applies to the address space, and need to be relocated at load time.
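To make the difference concrete, here is a minimal sketch of the two encodings. The function names and offsets are hypothetical and this is not cranelift-object's API; it only models the arithmetic. An absolute field depends on the load base, so the loader has to patch it, while a PC-relative field is the distance between two locations that move together, so it stays the same wherever the module lands.

```rust
/// Value an address-sized absolute field must hold at run time:
/// the function's link-time offset plus whatever base the loader picked.
/// (Hypothetical helper, for illustration only.)
fn absolute_field(load_base: u64, func_offset: u64) -> u64 {
    load_base.wrapping_add(func_offset)
}

/// Value of a 4-byte PC-relative field: the signed distance from the
/// field itself to the function. Both ends shift by the same load base,
/// so the base cancels out and no load-time relocation is needed.
/// (Hypothetical helper; assumes the distance fits in 32 bits.)
fn pc_relative_field(load_base: u64, field_offset: u64, func_offset: u64) -> i32 {
    let func = load_base.wrapping_add(func_offset);
    let field = load_base.wrapping_add(field_offset);
    func.wrapping_sub(field) as i64 as i32
}
```

For example, `pc_relative_field(base, 0x2000, 0x1000)` gives the same value for any `base`, while `absolute_field(base, 0x1000)` changes with it.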

Consequently, our .eh_frame has relocations. GNU ld doesn't mind, and our .eh_frame section ends up in a read/write segment. Not ideal, relro would be better, but it works. lld ends up putting .eh_frame in a read-only segment, and eventually a dlopen will segfault from trying to write to read-only memory backing .eh_frame.

Our options here are:

  • Fix lld? I assume ignoring lucetc's declaration of .eh_frame as writable is a bug.
  • Use PC-relative relocations in .eh_frame? This should let us simply not relocate .eh_frame when loading Lucet modules.

lucet-objdump

It looks like object skips the relocation section that our relocations end up in. It also looks like lld prefers to write relative relocations as 0 in the section data, putting the offset in the relocation's addend? Lucet modules linked through lld therefore have nulls for the serialized module pointers, so the trick we rely on (just following those pointers out of the binary to the right address) doesn't work.
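A rough sketch of what a tool would have to do to cope with this, assuming the relocations in question are relative relocations with an explicit addend (as with R_X86_64_RELATIVE in a RELA section). The `RelativeReloc` type and `read_pointer` function are hypothetical, not part of object or lucet-objdump: before trusting the bytes stored in the image, check whether a relocation targets that field and, if so, take the addend as the link-time address.

```rust
/// Hypothetical, simplified view of a relative relocation with an
/// explicit addend. For a shared object linked at base 0, the addend
/// is the link-time address the field will point to.
struct RelativeReloc {
    offset: u64, // location of the pointer field being relocated
    addend: i64, // target address relative to the load base
}

/// Read an 8-byte little-endian pointer field from `image`.
/// If a relocation targets the field, the in-place bytes may be zero
/// (as observed with lld), so the addend is used instead.
/// Returns None if the field lies outside the image.
fn read_pointer(image: &[u8], field_offset: u64, relocs: &[RelativeReloc]) -> Option<u64> {
    if let Some(r) = relocs.iter().find(|r| r.offset == field_offset) {
        return Some(r.addend as u64);
    }
    let start = field_offset as usize;
    let bytes = image.get(start..start.checked_add(8)?)?;
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    Some(u64::from_le_bytes(buf))
}
```

This treats the image as a flat byte slice indexed by offset; a real implementation would also need to map virtual addresses to file offsets through the section or segment headers.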

A great followup question might be: why are there relocations in SerializedModule? The entire thing ought to be static; SerializedModule, ModuleData, the table list, and the function manifest never move relative to one another. I think what's happening is lucet_module_data ends up in .data, but everything else ends up in .data.rel.ro because they have references to functions (so, relocations), and the linker can no longer collapse these all together. It's not hard to imagine a world where function addresses through ModuleData are all also PC-relative, so hopefully we could remove the need for relocations here too.

But why lld?

We've seen it link larger programs more quickly than GNU ld, shaving a few minutes from CI times. It'd be nice to use faster tools :grin:

iximeow, Jun 30 '20 00:06