liballocs
Dummyweaks library should go away
The no-op liballocs_dummyweaks.so library is no longer necessary, for at least the following reasons.
- The BFD linker provides -z dynamic-undefined-weak. So we can just make the symbols weak, if we don't mind testing for liballocs' presence at every use. We might mind, though.
- Thanks to allocsld, we can always preload liballocs transparently. Or, if we cared, we could preload a stripped-down no-op version. That would avoid the 'test at every use' problem, but wouldn't fully eliminate the library as a built artifact.
- A hybrid might be possible -- test for some symbols only.
- We could link dummy versions of the symbols into allocsld -- although not yet, because it is only a chain loader, so the real ld.so doesn't see it yet.
- We could always load the real liballocs but make it disable-able. This would mean stubbing out its own entry points, maybe by dynamic patching or perhaps by some IFUNC trickery. Indeed, making liballocs's entry points IFUNCs might be the cleanest way to allow run-time disabling.
What clients care about this? Anyone that links -lallocs presumably doesn't, though that's not entirely clear. It may be only libcrunch_stubs.so that cares.
If allocsld always emulates the 'requested' case of dynamic linking ($ ./my.binary), not the 'invoked' case ($ /path/to/ld.so my.binary), then we get to map the executable before the real ld.so runs. That might allow us some useful jiggery-pokery. E.g. we could fake out the program headers, if that's any use... ditto the DYNAMIC segment, potentially. So perhaps we could make it look like the executable contains the symbols we care about -- all symbols for the whole of liballocs, potentially, if we bundle it all into allocsld -- even though it's not actually part of the executable DSO. This appeals because it's a lot like the ultimate preloading: the executable comes first in the link order, so we're making this really-preloaded stuff appear to be in the executable.
It would be good to proof-of-concept this hack. E.g. maybe the ld.so barfs if it reads DYNAMIC data that's not within any of the executable's LOADs.
This all relates to how we package the liballocs implementation, which currently is still 99% in a preload DSO.
To get our hooks a little deeper into the dynamic linker, perhaps we could act as an LD_AUDIT client. That probably means building a special DSO, in addition to allocsld, that does the auditing. It will get loaded and called by the real ld.so.
Then, probably, the combination of chain loader and auditor can replace preloading entirely. The audit DSO would mostly call back to code in allocsld, I guess. Need to think about how sane this is/isn't. Perhaps the audit API lets us inject a preload object? We want our run-time code (i.e. after load-time / allocsld has done its thing) to live somewhere in the link map, for meta-completeness reasons.
When devising a solution it's important to think about the static-linking case, which we want to support (allocsld could load a static binary).
E.g. could we fake out a subset of the LD_AUDIT API that would make sense in the static case? I guess we're then doing no link-time interposition, so maybe the answer is: yes but it's trivial (empty subset!) and we have bigger problems (how do we catch malloc/dlopen/... calls? load the dlbind library? and so on).
Back to the 'undefined weak' situation: instead of testing at every use, we could lightly hack the binary so that the dummy weaks are defined, locally, and what we have is a 'defined but overridable weak' situation. To do this, we would have to put a copy of the dummies in every output binary, and hack the UND dynsym so that it instead is a weak definition, pointing to the relevant stub. Since the dynamic reloc record will still be present, this should still be overridable at load time by the 'real' liballocs. The important thing is that clients can use the dummies, harmlessly, without testing for their presence... and similarly no extraneous test is done when the real liballocs is loaded.
We could perhaps even do this generatively, i.e. scan for UND weaks that we recognise, and point them at some code that just clears %rax and returns (if that is appropriate in all dummies! might not be, especially as not all dummyweaks are code... but you get the idea).
Perhaps one way to go is to bundle the stubs into our ld.so? Hmm. This is a can of worms.
The thinking was that allocsld.so can be a library on the linker command line, just like the ordinary ld.so can be. It's an appealing place to shove a stub version of our functions, to be overridden by the 'real' preload library, hence avoiding the 'explicit testing for weaks' problem.
So what if we have both allocsld (or liballocs.so linker script pointing at allocsld) and the ordinary ld.so on the command line?
(It's a can of worms because normally allocsld.so is not involved in linking. It is a chain loader only. This way, it would be filling both that role and the role of the dummyweaks library. It's not clear that it's better than a separate dummyweaks library.)
Recall: we want to support the "instrumentation disabled" case. That's why we don't just have allocscc do -lallocs_preload and then (at init time) check we're loaded in preload position.
I've a feeling IFUNCs are the answer. They're exactly answering the use case: choose a relocation target based on run-time information. We use the 'real' version if liballocs is in preload position, the 'fast dummy' version if not. We can still preload liballocs on an oblivious process -- it just isn't binding to anything.
If we have a non-oblivious process, it is using allocsld. So we can't just have allocsld always enable preloading. However, many options exist for how to signal that it should use the dummy versions.
It feels like the idea is the same binary symlinked in three ways:
- liballocs.so: for clients to use by linking -lallocs, and what must be in preload position to have effect
- allocsld-noop.so: the default dynamic linker set in allocscc-built binaries; does not force liballocs.so into preload position
- allocsld.so: the dynamic linker to run explicitly, and forces liballocs.so into preload position
Perhaps we don't need the '-noop' symlink? It's confusing because its effect is not always 'noop'. To keep it simpler: running explicitly ('invoking') is the non-noop case, otherwise it depends on the preload position ('requesting'). So we have...
- liballocs.so: for clients to use by linking -lallocs, and what must be in preload position to have effect
- allocsld.so: always used for allocscc-built binaries; if invoked explicitly, forces liballocs.so into preload position (otherwise depends on LD_PRELOAD).