cheribsd

rtld: Always check LD_CHERI/LD_64 environment variables

Open arichardson opened this issue 4 years ago • 6 comments

With this change, a pure-capability RTLD will check LD_CHERI_* and, if those are not set, fall back to checking LD_*. This makes it possible to have scripts that set variables such as LD_64_LIBRARY_PATH and work on both a non-CHERI 64-bit system and a pure-capability world with a compat64 rtld. Without this change, we would have to check what the native ABI is first and then set either LD_64_LIBRARY_PATH (purecap) or LD_LIBRARY_PATH (hybrid/non-CHERI).
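The lookup order described above can be modelled in a few lines. This is only a sketch in Python, not the rtld code itself (which is C); the function name `lookup_ld_var` and the `abi_prefix` parameter are hypothetical, and "not set" is assumed to mean the variable is absent from the environment.

```python
def lookup_ld_var(env, abi_prefix, name):
    """Return LD_<abi_prefix><name> if set, otherwise LD_<name>, otherwise None.

    env        -- mapping of environment variable names to values
    abi_prefix -- hypothetical per-linker prefix, e.g. "CHERI_" or "64_"
    name       -- variable suffix, e.g. "LIBRARY_PATH" or "PRELOAD"
    """
    specific = env.get("LD_" + abi_prefix + name)
    if specific is not None:
        # The ABI-specific variable wins when it is present.
        return specific
    # Otherwise fall back to the plain LD_* variable (may also be absent).
    return env.get("LD_" + name)


if __name__ == "__main__":
    # A script that only sets LD_64_LIBRARY_PATH is picked up by any linker
    # whose prefix is "64_", whether that linker is native or compat64.
    script_env = {"LD_64_LIBRARY_PATH": "/opt/app/lib"}
    print(lookup_ld_var(script_env, "64_", "LIBRARY_PATH"))
    print(lookup_ld_var(script_env, "CHERI_", "LIBRARY_PATH"))
```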

If you agree this makes sense, I'll try to upstream this so that we can also use LD_64_LIBRARY_PATH for upstream FreeBSD.

arichardson Dec 23 '20 12:12

We had some crazy hacks at Y! though in a bit of a different direction (the 32-bit libc would rewrite attempts to setenv() LD_FOO to LD_32_FOO so that applications that set LD_FOO variables still worked when run under compat32).

I think that you want to only check the fallback for the "native" ABI. That is, if any COMPAT_FOO is set you want to sabotage LD_FALLBACK (so COMPAT_CHERI for a ld-elf-cheri on a hybrid world, or COMPAT_64BIT for ld-elf64 on a purecap world). Only ld-elf.so.1 should check LD_*.

I would also be tempted to make the ABI-specific variables (LD_CHERI_*, LD_64_*) the "fallback" that is checked second in the native linker, with LD_* having precedence, but that's probably a question for the upstream review. I can see arguments both ways.
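To make the alternative policy concrete, here is a small Python model of it: compat linkers consult only their ABI-specific variable, and only the native linker also consults plain LD_*. The names `candidate_names`, `lookup`, `is_native`, and `plain_first` are invented for illustration; `plain_first` models the open question of which variable the native linker should prefer.

```python
def candidate_names(abi_prefix, name, is_native, plain_first=False):
    """List the variable names a linker would consult, in order."""
    specific = "LD_" + abi_prefix + name
    plain = "LD_" + name
    if not is_native:
        # Compat linkers never look at plain LD_* under this policy.
        return [specific]
    return [plain, specific] if plain_first else [specific, plain]


def lookup(env, abi_prefix, name, is_native, plain_first=False):
    """Return the first candidate present in env, or None."""
    for candidate in candidate_names(abi_prefix, name, is_native, plain_first):
        if candidate in env:
            return env[candidate]
    return None
```

With this policy, a compat64 linker on a purecap world sees nothing when only LD_LIBRARY_PATH is set, which is exactly the trade-off debated in the replies.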

I originally wanted to use that approach. However, the problem I see with not doing it for COMPAT_* is that it can break third-party non-CHERI software on a purecap CHERI system: I've seen a few Linux programs that bundle specific versions of libraries and are launched using wrapper scripts that add the lib/ directory to LD_LIBRARY_PATH. This would no longer work if your world is purecap, since the rtld will only consider LD_64_* and no longer load the bundled libraries. My guess would be that this also exists on FreeBSD.

arichardson Dec 30 '20 19:12

Yeah, in the GNU world there is no LD_32 etc.; it's just LD. The current FreeBSD approach is more in keeping with how it does ABI in the kernel, but does require scripts to know what the native ABI is in order to correctly set environment variables. I don't think you want ld-elf32 and ld-elf64 to differ in semantics though, regardless of what approach is deemed best.

jrtc27 Dec 30 '20 19:12

> I don't think you want ld-elf32 and ld-elf64 to differ in semantics though, regardless of what approach is deemed best.

I'd be happy to also read LD_* variables for ld-elf32, I was just concerned that upstream wouldn't want a change in behaviour.

arichardson Dec 30 '20 20:12

I think LD_PRELOAD in particular is a bit dangerous if all the linkers honor it. If I run `env LD_PRELOAD=foo.so` with some program, where said program executes a purecap program (or vice versa), it's a bit of a mess, hence why upstream is explicit and does not honor "plain" LD_* for compat ABIs. Yes, it means that scripts might have to be patched, and/or the compat libc for native ABIs might need to adopt the Y! approach of rewriting the names of environment variables transparently, but it avoids crossing the streams in that way. LD_LIBRARY_PATH is probably kind of harmless since we would fail the dlopen() and try the next library search path, but LD_PRELOAD is different.
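The name rewriting mentioned here (LD_FOO becoming LD_32_FOO inside a compat libc's setenv()) could look roughly like the following. This is a Python model rather than the original C, the original implementation is not shown in this thread, and the helper names and the `KNOWN_ABI_PREFIXES` list are assumptions made for the sketch.

```python
# Prefixes that mark a variable as already ABI-specific (assumed list).
KNOWN_ABI_PREFIXES = ("32_", "64_", "CHERI_")


def rewrite_ld_name(name, abi_prefix="32_"):
    """Map LD_FOO to LD_<abi_prefix>FOO; leave everything else untouched."""
    if not name.startswith("LD_"):
        return name
    rest = name[len("LD_"):]
    if rest.startswith(KNOWN_ABI_PREFIXES):
        # Already ABI-specific, so do not add a second prefix.
        return name
    return "LD_" + abi_prefix + rest


def compat_setenv(env, name, value, abi_prefix="32_"):
    """Model of a compat libc setenv() that rewrites LD_* names on the way in."""
    env[rewrite_ld_name(name, abi_prefix)] = value
```

An application that sets LD_LIBRARY_PATH through this wrapper ends up setting LD_32_LIBRARY_PATH instead, so the compat linker sees it while linkers for other ABIs do not.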

bsdjhb Jan 04 '21 20:01

> I think LD_PRELOAD in particular is a bit dangerous if all the linkers honor it. If I run `env LD_PRELOAD=foo.so` with some program, where said program executes a purecap program (or vice versa), it's a bit of a mess, hence why upstream is explicit and does not honor "plain" LD_* for compat ABIs. Yes, it means that scripts might have to be patched, and/or the compat libc for native ABIs might need to adopt the Y! approach of rewriting the names of environment variables transparently, but it avoids crossing the streams in that way. LD_LIBRARY_PATH is probably kind of harmless since we would fail the dlopen() and try the next library search path, but LD_PRELOAD is different.

Well there are two cases:

  1. You give a file name, in which case it'll be as if the binary had an extra DT_NEEDED and everything just works.
  2. You give a path, in which case if the ABIs aren't compatible you'll just get an error.

In both cases you get what you ask for. With glibc LD_PRELOAD is always honoured and it works just fine.
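The two cases can be sketched as a toy resolver. It relies on the usual ELF convention that a name containing a slash is treated as a path, while a bare name is searched for in the linker's library directories. Everything else is a stand-in for the sketch: the `files` dictionary plays the role of the filesystem, and the ABI labels and directory names in the usage note are illustrative only.

```python
def resolve_preload(entry, wanted_abi, search_dirs, files):
    """Resolve one LD_PRELOAD entry for a linker of ABI `wanted_abi`.

    files maps a path to the ABI of the object stored there (toy filesystem).
    """
    if "/" in entry:
        # Case 2: an explicit path is used as given; a mismatch is an error.
        abi = files.get(entry)
        if abi is None:
            raise FileNotFoundError(entry)
        if abi != wanted_abi:
            raise ValueError(f"{entry}: ABI {abi} does not match {wanted_abi}")
        return entry
    # Case 1: a bare file name is searched like an extra DT_NEEDED entry,
    # skipping objects of the wrong ABI and moving on to the next directory.
    for directory in search_dirs:
        candidate = directory.rstrip("/") + "/" + entry
        if files.get(candidate) == wanted_abi:
            return candidate
    raise FileNotFoundError(entry)
```

For example, with `files = {"/usr/lib/foo.so": "purecap", "/usr/lib64/foo.so": "compat64"}`, a bare `foo.so` resolves to a different object for each linker, while `/usr/lib/foo.so` raises an error for the compat64 linker.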

jrtc27 Jan 04 '21 21:01

> > I think LD_PRELOAD in particular is a bit dangerous if all the linkers honor it. If I run `env LD_PRELOAD=foo.so` with some program, where said program executes a purecap program (or vice versa), it's a bit of a mess, hence why upstream is explicit and does not honor "plain" LD_* for compat ABIs. Yes, it means that scripts might have to be patched, and/or the compat libc for native ABIs might need to adopt the Y! approach of rewriting the names of environment variables transparently, but it avoids crossing the streams in that way. LD_LIBRARY_PATH is probably kind of harmless since we would fail the dlopen() and try the next library search path, but LD_PRELOAD is different.
>
> Well there are two cases:
>
>   1. You give a file name, in which case it'll be as if the binary had an extra DT_NEEDED and everything just works.
>   2. You give a path, in which case if the ABIs aren't compatible you'll just get an error.
>
> In both cases you get what you ask for. With glibc LD_PRELOAD is always honoured and it works just fine.

And if you have a mixed environment you can always set the LD_64_PRELOAD/LD_32_PRELOAD/LD_CHERI_PRELOAD so that binaries with a different ABI ignore it. I personally much prefer allowing LD_* for all ABIs over libc getenv/putenv hacks.
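As a rough illustration of that point, the following Python model computes which preload value each linker would see when every linker falls back to plain LD_PRELOAD. The mapping in `ABI_PREFIXES` and the function name are assumptions made for the sketch; only the variable names come from the discussion.

```python
# Hypothetical mapping from linker flavour to its variable prefix.
ABI_PREFIXES = {"purecap": "CHERI_", "compat64": "64_", "compat32": "32_"}


def preload_for_each_abi(env, fallback_to_plain=True):
    """Return the effective preload value seen by each linker flavour."""
    result = {}
    for abi, prefix in ABI_PREFIXES.items():
        value = env.get("LD_" + prefix + "PRELOAD")
        if value is None and fallback_to_plain:
            # Only consult plain LD_PRELOAD when the specific one is absent.
            value = env.get("LD_PRELOAD")
        result[abi] = value
    return result
```

Setting only LD_64_PRELOAD makes the value visible to the compat64 linker and to no other, whereas setting plain LD_PRELOAD reaches every linker that has no ABI-specific override.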

arichardson Jan 05 '21 11:01