Support linking .so with LLD
It turns out that trying to use `lld` as an alternate linker causes some issues for Lucet. One is possibly an `lld` bug. The other is, I think, somewhere in our dependencies.
- [ ] Link `.eh_frame` in a non-crashy way.
- [ ] `lucet-objdump` should be able to process `lucetc` artifacts linked by `lld`.
.eh_frame
We do `.eh_frame` kind of strangely: typically function addresses would be 4-byte PC-relative relocations resolved at link-time, so `.eh_frame` doesn't need further relocations at load-time. We currently rely on the function relocation interface through `cranelift-object`, implemented through `write_function_addr`, which results in an address-sized absolute relocation to the function instead. Since the binary is PIC, these addresses are shifted by whatever offset ASLR applies to the address space, so they need to be relocated at load-time.
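To make the difference concrete, here is a minimal sketch of the arithmetic, not Lucet or Cranelift code. All addresses and function names are hypothetical; the point is only that an absolute field is wrong until the loader adds the load bias, while a PC-relative field stays correct wherever the image lands.

```rust
/// Value stored for an address-sized absolute reference: the target's
/// link-time address. The loader must add the load bias to it later.
fn absolute_field(target_vaddr: u64) -> u64 {
    target_vaddr
}

/// Value stored for a 4-byte PC-relative reference: the target minus the
/// address of the field itself. This is fully resolved at link-time.
fn pc_relative_field(target_vaddr: u64, field_vaddr: u64) -> i32 {
    (target_vaddr as i64 - field_vaddr as i64) as i32
}

/// How a reader recovers the runtime target from a PC-relative field.
fn resolve_pc_relative(field_runtime_addr: u64, stored: i32) -> u64 {
    // `as i64 as u64` sign-extends so negative deltas wrap correctly.
    field_runtime_addr.wrapping_add(stored as i64 as u64)
}

fn main() {
    // Hypothetical values, chosen only for illustration.
    let func_vaddr = 0x1000u64; // link-time address of a function
    let field_vaddr = 0x5008u64; // link-time address of the field in .eh_frame
    let bias = 0x7f00_0000_0000u64; // load bias chosen by ASLR

    // Absolute: stale until a load-time relocation adds the bias.
    let abs = absolute_field(func_vaddr);
    assert_ne!(abs, func_vaddr + bias);
    assert_eq!(abs + bias, func_vaddr + bias);

    // PC-relative: both ends move together, so nothing needs patching.
    let rel = pc_relative_field(func_vaddr, field_vaddr);
    assert_eq!(resolve_pc_relative(field_vaddr + bias, rel), func_vaddr + bias);
}
```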
Consequently, our `.eh_frame` has relocations. GNU `ld` doesn't mind, and our `.eh_frame` section ends up in a read/write segment. Not ideal, relro would be better, but it works. `lld` ends up putting `.eh_frame` in a read-only segment, and eventually a `dlopen` will segfault from trying to write to read-only memory backing `.eh_frame`.
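One way to spot this before `dlopen` does is to check whether any dynamic relocation targets an address inside a non-writable `PT_LOAD` segment. The following is a rough sketch of that check over already-parsed data; the `Segment` struct and `relocs_in_readonly_segments` are hypothetical names, only `PT_LOAD` and the `PF_*` flag values come from the ELF specification, and it deliberately ignores `PT_GNU_RELRO`.

```rust
// ELF constants (from the ELF specification).
const PT_LOAD: u32 = 1;
const PF_W: u32 = 0x2;
const PF_R: u32 = 0x4;

/// Hypothetical, pared-down view of an ELF program header.
struct Segment {
    p_type: u32,
    p_flags: u32,
    p_vaddr: u64,
    p_memsz: u64,
}

/// Returns the relocation offsets that fall inside a loadable segment
/// which is not mapped writable. Applying those at load-time would fault.
fn relocs_in_readonly_segments(segments: &[Segment], reloc_offsets: &[u64]) -> Vec<u64> {
    reloc_offsets
        .iter()
        .copied()
        .filter(|&off| {
            segments.iter().any(|s| {
                s.p_type == PT_LOAD
                    && off >= s.p_vaddr
                    && off < s.p_vaddr + s.p_memsz
                    && s.p_flags & PF_W == 0
            })
        })
        .collect()
}

fn main() {
    // Hypothetical layout: one read-only segment, one read/write segment.
    let segments = [
        Segment { p_type: PT_LOAD, p_flags: PF_R, p_vaddr: 0x0, p_memsz: 0x1000 },
        Segment { p_type: PT_LOAD, p_flags: PF_R | PF_W, p_vaddr: 0x2000, p_memsz: 0x1000 },
    ];
    // One relocation lands in each segment; only the first is a problem.
    let bad = relocs_in_readonly_segments(&segments, &[0x800, 0x2010]);
    println!("relocations against read-only memory: {:x?}", bad);
}
```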
Our options here are:
- Fix `lld`? I assume ignoring `lucetc`'s declaration of `.eh_frame` as writable is a bug.
- Use PC-relative relocations in `.eh_frame`? This should let us simply not relocate `.eh_frame` when loading Lucet modules.
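For the second option, the `.eh_frame` format already has a pointer encoding for this: `DW_EH_PE_pcrel | DW_EH_PE_sdata4` (`0x1b`), as opposed to `DW_EH_PE_absptr` (`0x00`). Below is a small sketch of what each encoding writes for a function address. `encode_pointer` is a hypothetical helper, not something from `cranelift-object`, and the addresses are made up; only the `DW_EH_PE_*` values are taken from the format.

```rust
use std::convert::TryFrom;

// Pointer-encoding constants used by .eh_frame.
const DW_EH_PE_ABSPTR: u8 = 0x00;
const DW_EH_PE_SDATA4: u8 = 0x0b;
const DW_EH_PE_PCREL: u8 = 0x10;
const PCREL_SDATA4: u8 = DW_EH_PE_PCREL | DW_EH_PE_SDATA4; // 0x1b

/// Hypothetical helper: encode `target` as it would be stored at
/// `field_addr` under the given encoding. Returns `None` for encodings
/// this sketch does not handle, or if the delta does not fit in 4 bytes.
fn encode_pointer(encoding: u8, target: u64, field_addr: u64) -> Option<Vec<u8>> {
    match encoding {
        // Address-sized absolute value: needs a load-time relocation in PIC.
        DW_EH_PE_ABSPTR => Some(target.to_le_bytes().to_vec()),
        // Signed 4-byte offset from the field itself: position-independent.
        PCREL_SDATA4 => {
            let delta = (target as i64).wrapping_sub(field_addr as i64);
            let delta = i32::try_from(delta).ok()?;
            Some(delta.to_le_bytes().to_vec())
        }
        _ => None,
    }
}

fn main() {
    // Hypothetical link-time addresses.
    let (func, field) = (0x1000u64, 0x5008u64);
    let abs = encode_pointer(DW_EH_PE_ABSPTR, func, field).unwrap();
    let rel = encode_pointer(PCREL_SDATA4, func, field).unwrap();
    println!("absptr: {} bytes, pcrel|sdata4: {} bytes", abs.len(), rel.len());
}
```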
lucet-objdump
It looks like `object` skips the relocation section that our relocations end up in. It looks like `lld` prefers to write relative relocations as `0`, putting the offset in the relocation's addend? Lucet modules linked through `lld` have nulls for the serialized module pointers, so the trick we rely on, being able to just follow those out of the binary to the right address, doesn't work.
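If that reading is right, a tool inspecting the file has to consult the relocation entries instead of trusting the bytes in place. Here is a rough sketch of that lookup, assuming x86-64 `Elf64_Rela` entries and a load base of zero. The `Rela` struct mirrors the ELF layout, `R_X86_64_RELATIVE` is `8` per the x86-64 psABI, and `pointer_at` is a hypothetical helper, not part of `object` or `lucet-objdump`.

```rust
const R_X86_64_RELATIVE: u32 = 8;

/// Mirrors the fields of an Elf64_Rela entry.
#[derive(Clone, Copy)]
struct Rela {
    r_offset: u64,
    r_info: u64,
    r_addend: i64,
}

impl Rela {
    /// The relocation type is the low 32 bits of r_info on ELF64.
    fn r_type(&self) -> u32 {
        (self.r_info & 0xffff_ffff) as u32
    }
}

/// Reads a little-endian u64 from the image, or None if out of bounds.
fn read_u64_le(image: &[u8], offset: u64) -> Option<u64> {
    let start = offset as usize;
    let bytes = image.get(start..start.checked_add(8)?)?;
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    Some(u64::from_le_bytes(buf))
}

/// Hypothetical helper: the link-time value of the pointer stored at
/// `offset`. If a RELATIVE relocation covers that slot, its addend is the
/// value (base + addend, with base = 0); otherwise use the in-place bytes.
fn pointer_at(image: &[u8], relas: &[Rela], offset: u64) -> Option<u64> {
    for r in relas {
        if r.r_offset == offset && r.r_type() == R_X86_64_RELATIVE {
            return Some(r.r_addend as u64);
        }
    }
    read_u64_le(image, offset)
}

fn main() {
    // Hypothetical image: the pointer slot at offset 8 is all zeroes on
    // disk, and the real target lives in the relocation's addend.
    let image = [0u8; 16];
    let relas = [Rela { r_offset: 8, r_info: R_X86_64_RELATIVE as u64, r_addend: 0x1234 }];
    println!("{:x?}", pointer_at(&image, &relas, 8));
}
```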
A great followup question might be: why are there relocations in `SerializedModule`? The entire thing ought to be static; none of `SerializedModule`, `ModuleData`, the table list, or function manifest move relative to one another at any point. I think what's happening is `lucet_module_data` ends up in `.data`, but everything else ends up in `.data.rel.ro` because they have references to functions (so, relocations), and the linker can no longer collapse these all together. It's not hard to imagine a world where function addresses through `ModuleData` are all also PC-relative, so hopefully we could remove the need for relocations here too.
But why `lld`?
We've seen it link larger programs more quickly than GNU `ld`, shaving a few minutes from CI times. It'd be nice to use faster tools :grin: