cpython icon indicating copy to clipboard operation
cpython copied to clipboard

macOS linker warnings in macOS ventura

Open pablogsal opened this issue 2 years ago • 25 comments

Turns out ld shipped with Xcode 14+ emits a warning when using -undefined dynamic_lookup.`

ld: warning: -undefined dynamic_lookup may not work with chained fixups

Some investigation reveals that in fact -undefined dynamic_lookup doesn't work when:

  • Link a shared library with the option
  • Link it with a program that uses the chained-fixup introduced from macOS 12 and iOS 15

This is because -undefined dynamic_lookup uses lazy-bindings and they won't be bound while dyld fixes-up by traversing chained-fixup info.

However, we build extension modules with as bundles (-bundle) and they are loaded only through dlopen, so it's safe to use -undefined dynamic_lookup in theory. So the warning produced by ld64 is likely to be a false-positive.

ld64 also provides the -bundle_loader <executable> option, which allows resolving symbols defined in the executable symbol table while linking. It behaves almost the same with -undefined dynamic_lookup, but it makes the following changes:

  • Require that unresolved symbols among input objects must be defined in the executable.
  • Lazy symbol binding will lookup only the symbol table of the bundle loader executable. (-undefined dynamic_lookup lookups all symbol tables as flat namespace)

See "New Features" subsection under "Linking" section for chained fixup developer.apple.com/documentation/xcode-release-notes/xcode-13-release-notes for more information.

Also, check the same problem in ruby where most of this information is taken from.

pablogsal avatar Sep 24 '22 13:09 pablogsal

CC @ambv @ned-deily

pablogsal avatar Sep 24 '22 13:09 pablogsal

Unfortunately this warning also propagates to all extension modules built with the given Python.

pablogsal avatar Sep 24 '22 13:09 pablogsal

@ronaldoussoren

ned-deily avatar Sep 24 '22 15:09 ned-deily

-bundle_loader doesn't work with shared library builds (including the framework build), as the option does exactly as described in the documentation: It looks for symbols in the loader executable, not in shared library dependencies of that binary. I found this when looking into -bundle_loader to get link time errors when using non-existing symbols instead of only getting those when using an extension.

Linking with libpython is also problematic because on macOS shared libraries are referenced by absolute path which means linking with libpython ties the extension to a specific interpreter (e.g. makes it impossible to build an extension using the Python.org installer and load it using a homebrew install). For the user this will result in seemingly random crashes (due to libpython being loaded twice while only one is initialised).

I haven't spend time on looking into this particular issue yet, with some luck there's another option beyond disabling chained lookups.

ronaldoussoren avatar Sep 25 '22 08:09 ronaldoussoren

FYI.. https://issues.guix.gnu.org/issue/57849

Looks like using -Wl,-w will help.

CsBigDataHub avatar Oct 05 '22 23:10 CsBigDataHub

FYI.. https://issues.guix.gnu.org/issue/57849

Looks like using -Wl,-w will help.

That doesn't really help, it just suppresses the warning.

ronaldoussoren avatar Oct 06 '22 10:10 ronaldoussoren

You can pass -Wl,-no_fixup_chains to disable them entirely for now, but that flag is undocumented and who knows if it will exist in the future.

keith avatar Oct 06 '22 20:10 keith

You can pass -Wl,-no_fixup_chains to disable them entirely for now, but that flag is undocumented and who knows if it will exist in the future.

It will likely won't because chained fixups are the new linker information for the dynamic symbol tables. So we will be just delaying the inevitable

pablogsal avatar Oct 08 '22 23:10 pablogsal

Another workaround is to target an older deployment target (before 11.0), but that's also not a long term solution.

ronaldoussoren avatar Oct 25 '22 18:10 ronaldoussoren

For static builds using -Wl,-bundle_loader,$(BUILDPYTHON) -Wl,-undefined,error appears to work. I've manually patched the Makefile for some extensions and that does result in a build without warnings and I can load those extensions. What I've not checked yet is if those extensions work with a python installed into a different prefix.

This doesn't work with dynamic builds though, even when using BLDLIBRARY= -L. -Wl,-reexport-lpython$(LDVERSION) when linking $(BUILDPYTHON).

Note that just linking extensions with libpython isn't a realistic option because that results in C extensions that target a specific python prefix and hence would loose interoperability for wheels build with our installer and homebrew.

ronaldoussoren avatar Oct 25 '22 19:10 ronaldoussoren

What does work for a dynamic build (again with a tweaked $(BLDLBIRARY): -Wl,-flat_namespace -Wl,-bundle_loader,./python.exe -Wl,-undefined,error, that is add -Wl,-flat_namespace. That's not ideal though because the default two-level namespaces are more efficient (and faster).

Something similar should also work for framework builds (but using a different reexport flag), but I haven't tested this yet. Mostly because I'm not happy with the link flags I've used for a shared build.

All of this was tested by manual patching of the Makefile, I haven't looked into tweaking the configure script and makefile template yet.

ronaldoussoren avatar Oct 25 '22 19:10 ronaldoussoren

Hi — this issue has also been raised in the pybind11 and CMake projects, including a fix similar to what was suggested above by @ronaldoussoren.

Links:

  • pybind11 issue tracker ticket: https://github.com/pybind/pybind11/pull/4301
  • CMake issue tracker ticket: https://gitlab.kitware.com/cmake/cmake/-/issues/24044

All of this raises a few questions (https://github.com/pybind/pybind11/pull/4301#issuecomment-1298210494) that would be good to figure out together with somebody who is familiar with the intricacies of Darwin's dynamic linker.

wjakob avatar Nov 01 '22 14:11 wjakob

Hi @wjakob,

We are currently waiting for some clarifications from Apple. Will follow up with anything we find.

pablogsal avatar Nov 01 '22 15:11 pablogsal

Anyone with Apple clout could push this PR into their attention sphere: https://github.com/apple-opensource/ld64/pull/1

It basically says "here's a library; make any symbol that is resolved here dynamic_lookup and in the flat namespace". Which is…exactly what is wanted here.

mathstuf avatar Nov 01 '22 15:11 mathstuf

It basically says "here's a library; make any symbol that is resolved here dynamic_lookup and in the flat namespace". Which is…exactly what is wanted here.

Correct me if I'm wrong, but it seems to me like this PR targets a different use case: it enables dynamic_lookup selectively for chosen symbols to avoid the "sledgehammer" solution of changing the defaults for all undefined symbols.

While the issue discussed here is that the very mechanism that drives dynamic_lookup no longer works in binaries using the new "chained fixups" linker feature (which is automatically enabled for newer macOS deployment targets).

wjakob avatar Nov 01 '22 16:11 wjakob

One more data point in case it is useful: explicitly enumerating all relevant Python symbols using the -U command line arguments (https://github.com/wjakob/nanobind/commit/bb211d1889bf18c762510b22df706aecb3d51f8a) also works and the warnings are gone. But it leaves me wondering:

  • Does it actually fix the issue? The manpage says that dynamic_lookup is still used in that case.
  • The linker gets a huge command line argument (3.5KB), which seems quite ugly and would need to be tweaked for every Python version. I could not find a -U_from_file <filename> argument to batch-inform ld about the symbols to accept as being undefined.

wjakob avatar Nov 01 '22 16:11 wjakob

I could not find a -U_from_file <filename> argument to batch-inform ld about the symbols to accept as being undefined.

ld response files might be useful here (though it is documented under Rarely used Options):

     @response_file_path
             Inserts contents of file at response_file_path into arguments. This allows for linker command line args to be store in a
             file.  Note: ld is normally invoked through clang, and clang also interprets @file on the command line.  To have clang
             ignore the @file and pass it through to ld, use -Wl,@file.

carlocab avatar Nov 01 '22 16:11 carlocab

While the issue discussed here is that the very mechanism that drives dynamic_lookup no longer works in binaries using the new "chained fixups" linker feature (which is automatically enabled for newer macOS deployment targets).

Maybe? But if -U works, that's what this should do (IIRC it is, but it has been a long time since I poked it). Of course, the code is also unbuilt because…I have no idea what the instructions are in the first place.

mathstuf avatar Nov 01 '22 18:11 mathstuf

Anyone with Apple clout could push this PR into their attention sphere: apple-opensource/ld64#1

It basically says "here's a library; make any symbol that is resolved here dynamic_lookup and in the flat namespace". Which is…exactly what is wanted here.

I am not sure that anyone at apple monitors that repo. If you check it has no pull requests and nobody ever answered an issue

pablogsal avatar Nov 01 '22 21:11 pablogsal

Yeah, I'm aware. I had contacts at Apple, but they've since moved elsewhere…

mathstuf avatar Nov 02 '22 14:11 mathstuf

I did a bit more investigation with the -U linker argument that enables specifying individual symbols as undefined. The gist is that despite not showing the ominous warning about "chained fixups", it seems to produce shared libraries with the exact same binding table entries as the previous standard solution (-undefined dynamic_lookup).

Links:

  • https://github.com/pybind/pybind11/pull/4301#issuecomment-1300034418
  • https://github.com/pybind/pybind11/pull/4301#issuecomment-1300377998

The third option being discussed (-undefined suppress -flat_namespace) apparently has severe drawbacks that were pointed out by others in the discussion thread.

Somebody from Apple needs to come here and confirm that this warning is a fluke. Otherwise, we have a problem 🙈

wjakob avatar Nov 02 '22 15:11 wjakob

Somebody from Apple needs to come here and confirm that this warning is a fluke. Otherwise, we have a problem 🙈

Yeah, we are unlikely to proceed with anything in CPython unless we have direct authoritative confirmation from Apple that everything is working as we expect and we are not going to have problems down the line.

pablogsal avatar Nov 02 '22 16:11 pablogsal

Hi @wjakob,

We are currently waiting for some clarifications from Apple. Will follow up with anything we find.

My first attempt at contacting Apple failed. Asking developer support resulted in getting forwarded to the forumes to ask about system extensions...

I'll try again after November 14th, before that my agenda is too full to quickly respond.

ronaldoussoren avatar Nov 06 '22 21:11 ronaldoussoren

Somebody from Apple needs to come here and confirm that this warning is a fluke. Otherwise, we have a problem 🙈

My current plan is to try to contact someone from Apple (if he still works there), and if that fails use one of the code-level support tickets included in a paid developer subscription.

ronaldoussoren avatar Nov 06 '22 21:11 ronaldoussoren

Update: I tweeted about this issue to draw some attention to it and was in return asked to file an official bug issue (mirror: https://openradar.appspot.com/radar?id=5536824084660224)

wjakob avatar Nov 08 '22 14:11 wjakob

One thing that can work for shared library builds (including frameworks) is to start linking extensions to libpython, but ensure that @rpath/libpythonX.Y.dylib is used in the link command (while adding an LC_RPATH to the interpreter binary). That won't work for static builds though for obvious reasons.

A bit hand wavy because I haven't actually tested this or thought trough its implications.

ronaldoussoren avatar Nov 12 '22 21:11 ronaldoussoren

That should work with LC_RPATH with an @executable_path/../lib or similar if the python binary is not linked statically.

That won't work for static builds though for obvious reasons.

This is an important use-case though. conda python binaries link statically and would segfault if libpythonX.Y.dylib is loaded.

isuruf avatar Nov 12 '22 21:11 isuruf

Rpath entries are transitive, so you could add an LC_LOAD_DYLIB @rpath/libpythonX.Y.dylib command and just expect the executable to provide it (conda could provide an empty dylib for these purposes) without any LC_RPATH command at all.

Note that I believe that this approach still doesn't fix the "please don't complain about missing symbols" problem that this all started out with.

mathstuf avatar Nov 13 '22 01:11 mathstuf

That should work with LC_RPATH with an @executable_path/../lib or similar if the python binary is not linked statically.

That won't work for static builds though for obvious reasons.

This is an important use-case though. conda python binaries link statically and would segfault if libpythonX.Y.dylib is loaded.

I know. So far I've only been able to find partial fixes that work with some of the use cases but not all:

  • -bundle_loader {interpreter_binary} works for static builds, but not shared library builds
  • -rpath ... works for shared library builds, but not static builds

Ideally we'd end up with a solution that works with -undefined error because that gives a better development experience when working on C extensions.

ronaldoussoren avatar Nov 14 '22 09:11 ronaldoussoren

None of the workarounds mentioned in the last few posts are truly general.

For example, in Blender, everything including Python is part of a single binary Blender.app/Contents/MacOS/Blender. It does not include a separate libpython.dylib.

That's why I think the ball is now in @apple's court. They either need to stabilize the existing mechanism or propose a new one.

wjakob avatar Nov 14 '22 09:11 wjakob