rules_python icon indicating copy to clipboard operation
rules_python copied to clipboard

fix: Add `libs/python3.lib` to libpython target for SABI builds on Windows

Open nicholasjng opened this issue 1 year ago • 27 comments

When targeting the Python Stable ABI on Windows (by setting the Py_LIMITED_API macro to a Python minimum version hex), the unversioned python3.lib needs to be linked instead of the versioned one (e.g. python38.lib for Python 3.8).

Python's own config sets the library to link by default in a header called pyconfig.h (https://github.com/python/cpython/blob/9cc9e277254023c0ca08e1a9e379fd89475ca9c2/PC/pyconfig.h#L270), which prompts the linker to search for python3.lib if a stable ABI extension is built using @rules_python toolchains.

Since this library is not exported on Windows in the python_repository() rule, building Python C++ extensions with rules_python toolchains fails in the linking step, because the library is never copied. Consequently, it is added now to allow Python SABI extensions to be built (and linked) on Windows with @rules_python.

Since Python takes responsibility for linking the correct lib on Windows, and never both at the same time, no other changes are made.

nicholasjng avatar Mar 22 '24 14:03 nicholasjng

I just tried out the method mentioned in your fix. I made a small cc_binary

cc_binary(
    name = "embed",
    srcs = ["embed.cpp"],
    deps = [
        "@rules_python//python/cc:current_py_cc_headers",
        "@rules_python//python/cc:current_py_cc_libs",
    ],
)
//embed.cpp

#include <iostream>

#define Py_LIMITED_ABI 
#include <Python.h>

int main() { 
		std::cout << "Starting main" << "\n";
		std::cout << "Ending main" << "\n";
    return 0;
}

I looked at the resulting linker commands.

/nologo
/OUT:bazel-out/x64_windows-opt/bin/python/experimental/embed.exe
bazel-out/x64_windows-opt/bin/python/experimental/_objs/embed/embed.obj
external/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/libs/python311.lib
external/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/libs/python3.lib
/SUBSYSTEM:CONSOLE
/MACHINE:X64
/DEFAULTLIB:msvcrt.lib
/OPT:ICF
/OPT:REF

The compiler is now trying to link both python311.lib and python3.lib. I'm not sure what the implications are here and I'm surprised there are no duplicate definition errors produced. Do you know if this is a problem?

axbycc-mark avatar Mar 24 '24 17:03 axbycc-mark

Yes, that is potentially a problem - when building against the stable ABI, you do not want to link against the versioned python3XY.lib.

This would probably be fine if the python3.lib appeared first in the command. (Without knowing the intricacies of the MSVC linker, I'm guessing the first resolved symbol is taken greedily.)

But, I'm more curious about the defines here: I'm mentioning Py_LIMITED_API, since that designates python3.lib in that pyconfig.h snippet. You don't seem to be defining that either in the code, or in the cc_binary.

Can you share the linker flags when building with cc_binary(..., defines=["Py_LIMITED_API=0x030A0000"]) or something similar?

nicholasjng avatar Mar 24 '24 17:03 nicholasjng

Sorry, my mistake. I have been misreading "LIMITED_API" as "LIMITED_ABI" the whole time. Let me try once more with the correct spelling.

Update: Same issue with the fixed spelling, now inside the defines=[...] attribute as requested. I verified the object file was compiled with Py_LIMITED_API by looking at the compilation flags which appear below.


/nologo
/DCOMPILER_MSVC
/DNOMINMAX
/D_WIN32_WINNT=0x0601
/D_CRT_SECURE_NO_DEPRECATE
/D_CRT_SECURE_NO_WARNINGS
/bigobj
/Zm500
/EHsc
/wd4351
/wd4291
/wd4250
/wd4996
/I.
/Ibazel-out/x64_windows-opt/bin
/Iexternal/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc
/Ibazel-out/x64_windows-opt/bin/external/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc
/Iexternal/bazel_tools
/Ibazel-out/x64_windows-opt/bin/external/bazel_tools
/Iexternal/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/include
/Ibazel-out/x64_windows-opt/bin/external/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/include
/Iexternal/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/include/python3.11
/Ibazel-out/x64_windows-opt/bin/external/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/include/python3.11
/Iexternal/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/include/python3.11m
/Ibazel-out/x64_windows-opt/bin/external/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/include/python3.11m
/DPy_LIMITED_API=0x030A0000
/showIncludes
/MD
/O2
/Oy-
/DNDEBUG
/wd4117
-D__DATE__=\"redacted\"
-D__TIMESTAMP__=\"redacted\"
-D__TIME__=\"redacted\"
/Gy
/Gw
-DWIN32_LEAN_AND_MEAN
-DNOGDI
/std:c++20
/Fobazel-out/x64_windows-opt/bin/python/experimental/_objs/embed/embed.obj
/c
python/experimental/embed.cpp

I guess whether you define the Py_LIMITED_API or not, the libraries get linked anyway due to Bazel's rules. It makes sense to me that Bazel would instruct the compiler to link all srcs mentioned in cc_library dependency.

axbycc-mark avatar Mar 24 '24 18:03 axbycc-mark

I mean, so much is expected - the headers appearing in your compilation command there are always versioned to the Python distribution they are from.

Also, from the pyconfig.h snippet, the conditional declaration of either python3.lib or python3X.lib means that the @rules_python//python/cc:current_py_cc_libs need not appear as deps at all if you're building a Python extension on Windows, the libs only need to be both present for the hermetic Python.

You can see that this is true e.g. by checking the CI statuses of https://github.com/nicholasjng/nanobind-bazel/pull/16, where I completely removed the current_py_cc_libs from all default depsets of my Python targets in the second commit of the changeset. On Windows, Python 3.8-3.11 (all without the stable ABI) proceed to build just fine.

TL,DR: Explicitly specifying the current libs on Windows is normally not necessary because of the selection mechanism in pyconfig.h, and perhaps counterproductive for the reasons you mentioned (Bazel links all libs appearing as deps in a command).

nicholasjng avatar Mar 24 '24 18:03 nicholasjng

I'm thinking about what you said here.

Also, from the pyconfig.h snippet, the conditional declaration of either python3.lib or python3X.lib means that the @rules_python//python/cc:current_py_cc_libs need not appear as deps at all if you're building a Python extension on Windows, the libs only need to be both present for the hermetic Python.

In my project, I have a python_repository provided by python_rules for Python version 3.11. It has registered this version as the default Python toolchain. Outside of the project folder, my system has no globally installed Python version 3.11. Within my project, I want to embed a Python interpreter / write Python extensions for Python 3.11.

I am hitting the hermetic Python use case that you are mentioning right? If I don't include these dependencies, the #pragma will still try to link python311.lib and I will get a linker error (if I still include the current_py_cc_headers).

On the other hand, if I installed Python 3.10 on my system level (and don't include current_py_cc_headers or current_py_cc_libs), msvc would find the system python310.lib/python310.dll and link my project's binaries against those on account of the #pragma directives. But if I write an extension this way, and use it as a dependency for a py_binary, is it going to cause issues? Both python310.dll and python311.dll will be loaded at runtime, defining many of the same symbols. So if I take that route, I must be sure that the system python matches the hermetic python.

Does that match your understanding?

axbycc-mark avatar Mar 24 '24 19:03 axbycc-mark

I think we mean different things: I am referring to "hermetic Python" as in "a configured rules_python toolchain", not as an embedded interpreter. MSVC invocations will not link against anything on the system other than the configured rules_python toolchain. (To verify, see the 3.12 build failure in the PR that I linked - I install Python 3.12 on the runner early on, but the build still fails due to a linker error, even if the system Python 3.12 has all of the required libs.)

The only way to screw up hermeticity is to link libs from the system Python explicitly in your package setup code, i.e. by supplying --linkopt=/LIBPATH:C:\my\system\python\libs in Python. If you build everything in Bazel anyway, you're good.

EDIT: Ah, now I get it. Yes, the current libs will be taken from the rules_python toolchain as well - if you want to redistribute SABI extensions, you package python3.{lib,dll}, if not, the versioned ones.

nicholasjng avatar Mar 24 '24 19:03 nicholasjng

Okay I still can't get my head around how you can remove the current_py_cc_libs dependency and still have your cc_library compile. On my computer I get the following error, which from my point of view is completely expected. Without the dependency, msvc is never told where the lib file is.

LINK : fatal error LNK1104: cannot open file 'python3.lib'
Target //python/experimental:embed failed to build

But I think I'm a little out of my depth here so I'll need to go off on my own and read more about Windows/Bazel/Python.

Unrelatedly, I did uncover something interesting. Using Process Explorer and launching the Python interpreter, I found it's completely normal for the Python process to load in both python3.dll and pythonxy.dll. image

But this may just be because there are transitive dependencies being imported at startup, where every single pyd only depends on either python3 or pythonxy.

In any case I imagine it is safe to add the python*.dll into the sources in your PR which would fix #1823

Thanks for taking up this issue 👍

axbycc-mark avatar Mar 25 '24 02:03 axbycc-mark

I don't have a Windows machine around so I am happy that you are investigating this and attempting to make building on Windows better.

Let me know when you would like a review from the maintainer point of view. The checklist would be:

  • Include the backstory within the PR in the linked issue (#1823) as the PR description is too long - it is going to be used as a commit message.
  • Add a CHANGELOG.md when it is somewhat ready.
  • Consider adding tests that work on Windows as well.

Let me know if you would like any ideas/suggestions from me or other maintainers.

aignas avatar Mar 26 '24 03:03 aignas

@aignas Yes please, this is ready for review.

I can copy the backstory over to a comment in the thread and paste it here, the issue you mentioned is technically about a slightly different use.

Is the PR title (or a slightly more elaborate description thereof) sufficient as a changelog entry?

PS: If you're fine with it, I would add the other versioned DLL mentioned in #1823 to the list, and then it should actually also fix that issue.

nicholasjng avatar Mar 26 '24 07:03 nicholasjng

The PR description is the commit message once the PR is merged and the changelog.md file still needs to be modified manually. As for the backstory, having it as a comment in this PR may be also sufficient. No need to create an issue just for that.

I am curious if it would be possible to add a test target (maybe under tests/toolchains?) Where we could ensure that the new code works?

aignas avatar Mar 26 '24 11:03 aignas

The following is the backstory of this PR, previously found in the description.


Backstory

I am currently implementing Bazel support for the nanobind project, which makes creation of C++ Python bindings very easy. It is the successor to pybind11 in that regard. Development efforts on this happen in the nanobind-bazel repository.

One particular feature in nanobind is targeting the Python stable ABI, which can be used to shrink the build matrix for Python wheels by promising compatibility across minor versions of Python even with C++ extensions. Due to later additions of nanobind prerequisites to the Python limited API, targeting the stable ABI using nanobind only became possible starting with Python 3.12.

I added a Bazel config to the easiest, "hello world"-ish nanobind example project here, which used to build just fine on all platforms (Win/MacOS/Linux) and versions (Python 3.8-3.12), but only because I was breaking hermeticity in the Windows case, passing the libdir of the system interpreter to the build as a linkopt on Windows in Python (in the setup.py, to be specific).

(For an example of where this is still done, see https://github.com/google/benchmark/blob/d5c55e8c42a8782cb24f6011d0e88449237ab842/setup.py#L71-L74.)

Once I stopped doing that, I started getting build errors for Win+Python 3.12, as for example in this PR: https://github.com/nicholasjng/nanobind-bazel/pull/16. Note that when building for Python 3.12, I target the stable ABI by setting the Py_LIMITED_API macro to 3.12.

As would be expected from pyconfig.h taking measures to target the unversioned library, Python itself mentions that python3.lib needs to be linked in place of python3XY.lib when targeting the stable ABI: https://docs.python.org/3/c-api/stable.html#stable-abi

As a final point, none of this is an issue on Unix platforms, since those get their Python symbols dynamically at runtime without needing any linkage.

Please let me know your thoughts.

nicholasjng avatar Mar 26 '24 16:03 nicholasjng

I added the changelog entry, and split off the backstory into a comment, removing it from the PR description.

I also took the liberty to add the pythonXY.dll DLL file, as prompted by the linked issue.

As for tests - I believe that since MSVC decides the library to link in the pyconfig.h header, we can add a regression test for this issue by compiling a simple cc_binary with the Py_LIMITED_API macro defined. The only dependency needed are the current cc headers, so this should be grouped with the cc headers tests. Since I have no experience with writing tests for Bazel, I would welcome some guidance.

nicholasjng avatar Mar 26 '24 16:03 nicholasjng

I think //tests/cc/current_py_cc_libs:python_libs_linking_test is what you want? I created a simple cc_test to verify some linking behavior, but it didn't pass on windows and I couldn't figure out why. This thread sounds like the reason?

Since I have no experience with writing tests for Bazel, I would welcome some guidance.

You're probably looking at e.g. tests/cc/current_py_cc_headers/current_py_cc_headers_tests.bzl ? Those are called "analysis tests" and are for verifying that the bzl logic in rule code is working. I wouldn't recommend trying to use those for this case -- what you have to do is look into the linker args and try to figure out if it looks right, which is fairly brittle and painful. I would just create cc_test targets that are built/run instead.

rickeylev avatar Mar 26 '24 17:03 rickeylev

I think //tests/cc/current_py_cc_libs:python_libs_linking_test is what you want? I created a simple cc_test to verify some linking behavior, but it didn't pass on windows and I couldn't figure out why. This thread sounds like the reason?

Yes, that should be it. What do I need to expect in _solib for Windows now? Probably python3.lib, right?

https://github.com/bazelbuild/rules_python/blob/c5c03b2477dd1ce0c06c9dc60bf816995f222bcf/tests/cc/current_py_cc_libs/current_py_cc_libs_tests.bzl#L52-L57

nicholasjng avatar Mar 26 '24 19:03 nicholasjng

Yeah. Just use python3. instead of python3.so. The extension can vary, so I don't think there's a point in trying to match it.

rickeylev avatar Mar 26 '24 20:03 rickeylev

Local Mac is still broken with matching.str_matches("*_solib*/*python3."):

FAIL: //tests/cc/current_py_cc_libs:python_libs_linking_test (see /private/var/tmp/_bazel_nicholasjunge/1a11c1f7d001d9440f87afb0306a8075/execroot/_main/bazel-out/darwin_arm64-fastbuild/testlogs/tests/cc/current_py_cc_libs/python_libs_linking_test/test.log)
INFO: From Testing //tests/cc/current_py_cc_libs:python_libs_linking_test:
==================== Test output for //tests/cc/current_py_cc_libs:python_libs_linking_test:
dyld[10462]: Library not loaded: /install/lib/libpython3.11.dylib
  Referenced from: <BED7F7C3-FAFD-377B-9B37-AEA0F9EC10C8> /private/var/tmp/_bazel_nicholasjunge/1a11c1f7d001d9440f87afb0306a8075/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/tests/cc/current_py_cc_libs/python_libs_linking_test
  Reason: tried: '/install/lib/libpython3.11.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/install/lib/libpython3.11.dylib' (no such file), '/install/lib/libpython3.11.dylib' (no such file)

Looks like the library search path is messed up? (FWIW, I do see libpython3.11.dylib under the _solib_darwin_arm64 directory in bazel-bin, although in a randomly generated subfolder.)

nicholasjng avatar Mar 26 '24 20:03 nicholasjng

I think the current libs test is fundamentally broken on MacOS (and potentially Windows too). The linkage of the python_libs_linking_test is off:

➜ otool -L bazel-out/darwin_arm64-fastbuild/bin/tests/cc/current_py_cc_libs/python_libs_linking_test
bazel-out/darwin_arm64-fastbuild/bin/tests/cc/current_py_cc_libs/python_libs_linking_test:
        /install/lib/libpython3.11.dylib (compatibility version 3.11.0, current version 3.11.0)
        /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1700.255.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)
        /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 2420.0.0)
        /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)

The first path is the baked-in dylib path of the hermetic Python's libpythonXY.dylib. The linker options look good for fixing up the path to the actual dylib, but it seems they are not respected:

# python_libs_linking_test-2.params

-o
bazel-out/darwin_arm64-fastbuild/bin/tests/cc/current_py_cc_libs/python_libs_linking_test
-Xlinker
-rpath
-Xlinker
@loader_path/../../../_solib_darwin_arm64/_U@@_Umain~python~python_U3_U11_Uaarch64-apple-darwin_S_S_Clibpython___Ulib
-Xlinker
-rpath
-Xlinker
@loader_path/python_libs_linking_test.runfiles/_main/_solib_darwin_arm64/_U@@_Umain~python~python_U3_U11_Uaarch64-apple-darwin_S_S_Clibpython___Ulib
-Lbazel-out/darwin_arm64-fastbuild/bin/_solib_darwin_arm64/_U@@_Umain~python~python_U3_U11_Uaarch64-apple-darwin_S_S_Clibpython___Ulib
bazel-out/darwin_arm64-fastbuild/bin/tests/cc/current_py_cc_libs/_objs/python_libs_linking_test/python_libs_linking_test.o
-lpython3.11
-Wl,-S
-mmacos-version-min=10.11
-no-canonical-prefixes
-fobjc-link-runtime
-headerpad_max_install_names
-lc++
-lm

Not sure what to do here. I think the best option is to create a new cc_test with Py_LIMITED_API defined, and restrict it to Windows only.

nicholasjng avatar Mar 27 '24 12:03 nicholasjng

This patch does not fix the SABI build failure in my project, see https://github.com/nicholasjng/nanobind-bazel/pull/18 and the CI run therein.

It seems that the current_py_cc_libs target is more broken than initially anticipated, perhaps because no rpath fixup happens for the shared libs in the repository template.

Possibly related thread (for MacOS): https://github.com/pyinstaller/pyinstaller/issues/7582

nicholasjng avatar Mar 27 '24 15:03 nicholasjng

more broken than anticipated

I'm not too surprised by that :). There's lots of platform-specific idiosyncrasies and I'm by no means an linker expert, even less so with Windows and Mac. I don't really have a Mac or Windows machine to experiment with, either. I think the reality is someone with more knowledge/experience/motivation for those platforms will need to step up.

The code powering the current_py_cc_libs stuff is pretty simple. All it's doing is forwarding along the cc_library() defined in the runtime's repo. If someone can craft a cc_library that represents python libs coming from within the build, then I can help translate it into the necessary toolchain code.

create a new cc_test with Py_LIMITED_API defined, and restrict it to Windows only.

Yes, that's fine. Incremental progress is good.

linkage and rpath issues

That /install/lib/libpython3.11.dylib entry does look a bit odd. It shouldn't be finding libraries from outside the runfiles when those libraries are coming from a cc_library in the build. What does the binary's RPATH, DTNEEDED, and runfiles solib directory look like?

From what I understand of Bazel's cc_test[1], the way it should be working is:

  • The cc_test has a DTNEEDED entry added for libpython
  • The runfiles solib directory will, somewhere, contain libpython.so
  • The cc_test has RPATH pointing to the runfiles solib directory

There might be some filename mangling and SONAME trickery, but I can't recall.

  • cc_test might have DTNEEDED=_S_S_blabla_S_libpython3 and a corresponding file with a rewritten SONAME in soblib
  • cc_test might have DTNEEDED=libpython3 and solib will have e.g. runfiles/solib/_S_S_blabla/libpython.dylib (symlink to actual file)

[1] cc_binary might behave differently btw. I recall seeing some code paths in the cc rule implementation that special cases cc_test vs cc_binary, but what activated them was convoluted at times, so ymmv.

rickeylev avatar Mar 28 '24 16:03 rickeylev

Not to worry, I'm here to (hopefully) see this through. I'm also not an expert, and every new linker topic I see prompts an extensive google search, but I'll get there :)

It seems that on macOS, a binary called install_name_tool is responsible for overwriting / setting rpaths and loader paths in dylibs. That could be an option, but it's in all likelihood not portable.

As a note, here is a blog post detailing the practice when bundling dylibs on macOS.

Here's the full otool -l output on the test binary:

Output
➜ otool -l /Users/nicholasjunge/Workspaces/c++/rules_python/bazel-bin/tests/cc/current_py_cc_libs/python_libs_linking_test
/Users/nicholasjunge/Workspaces/c++/rules_python/bazel-bin/tests/cc/current_py_cc_libs/python_libs_linking_test:
Load command 0
      cmd LC_SEGMENT_64
  cmdsize 72
  segname __PAGEZERO
   vmaddr 0x0000000000000000
   vmsize 0x0000000100000000
  fileoff 0
 filesize 0
  maxprot 0x00000000
 initprot 0x00000000
   nsects 0
    flags 0x0
Load command 1
      cmd LC_SEGMENT_64
  cmdsize 392
  segname __TEXT
   vmaddr 0x0000000100000000
   vmsize 0x0000000000004000
  fileoff 0
 filesize 16384
  maxprot 0x00000005
 initprot 0x00000005
   nsects 4
    flags 0x0
Section
  sectname __text
   segname __TEXT
      addr 0x0000000100003f08
      size 0x0000000000000064
    offset 16136
     align 2^2 (4)
    reloff 0
    nreloc 0
     flags 0x80000400
 reserved1 0
 reserved2 0
Section
  sectname __stubs
   segname __TEXT
      addr 0x0000000100003f6c
      size 0x0000000000000024
    offset 16236
     align 2^2 (4)
    reloff 0
    nreloc 0
     flags 0x80000408
 reserved1 0 (index into indirect symbol table)
 reserved2 12 (size of stubs)
Section
  sectname __cstring
   segname __TEXT
      addr 0x0000000100003f90
      size 0x0000000000000017
    offset 16272
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000002
 reserved1 0
 reserved2 0
Section
  sectname __unwind_info
   segname __TEXT
      addr 0x0000000100003fa8
      size 0x0000000000000058
    offset 16296
     align 2^2 (4)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Load command 2
      cmd LC_SEGMENT_64
  cmdsize 152
  segname __DATA_CONST
   vmaddr 0x0000000100004000
   vmsize 0x0000000000004000
  fileoff 16384
 filesize 16384
  maxprot 0x00000003
 initprot 0x00000003
   nsects 1
    flags 0x10
Section
  sectname __got
   segname __DATA_CONST
      addr 0x0000000100004000
      size 0x0000000000000018
    offset 16384
     align 2^3 (8)
    reloff 0
    nreloc 0
     flags 0x00000006
 reserved1 3 (index into indirect symbol table)
 reserved2 0
Load command 3
      cmd LC_SEGMENT_64
  cmdsize 72
  segname __LINKEDIT
   vmaddr 0x0000000100008000
   vmsize 0x0000000000004000
  fileoff 32768
 filesize 824
  maxprot 0x00000001
 initprot 0x00000001
   nsects 0
    flags 0x0
Load command 4
      cmd LC_DYLD_CHAINED_FIXUPS
  cmdsize 16
  dataoff 32768
 datasize 152
Load command 5
      cmd LC_DYLD_EXPORTS_TRIE
  cmdsize 16
  dataoff 32920
 datasize 48
Load command 6
     cmd LC_SYMTAB
 cmdsize 24
  symoff 32976
   nsyms 5
  stroff 33080
 strsize 88
Load command 7
            cmd LC_DYSYMTAB
        cmdsize 80
      ilocalsym 0
      nlocalsym 0
     iextdefsym 0
     nextdefsym 2
      iundefsym 2
      nundefsym 3
         tocoff 0
           ntoc 0
      modtaboff 0
        nmodtab 0
   extrefsymoff 0
    nextrefsyms 0
 indirectsymoff 33056
  nindirectsyms 6
      extreloff 0
        nextrel 0
      locreloff 0
        nlocrel 0
Load command 8
          cmd LC_LOAD_DYLINKER
      cmdsize 32
         name /usr/lib/dyld (offset 12)
Load command 9
     cmd LC_UUID
 cmdsize 24
    uuid BED7F7C3-FAFD-377B-9B37-AEA0F9EC10C8
Load command 10
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform 1
    minos 14.4
      sdk 14.4
   ntools 1
     tool 3
  version 1053.12
Load command 11
      cmd LC_SOURCE_VERSION
  cmdsize 16
  version 0.0
Load command 12
       cmd LC_MAIN
   cmdsize 24
  entryoff 16136
 stacksize 0
Load command 13
          cmd LC_LOAD_DYLIB
      cmdsize 64
         name /install/lib/libpython3.11.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 3.11.0
compatibility version 3.11.0
Load command 14
          cmd LC_LOAD_DYLIB
      cmdsize 48
         name /usr/lib/libc++.1.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 1700.255.0
compatibility version 1.0.0
Load command 15
          cmd LC_LOAD_DYLIB
      cmdsize 56
         name /usr/lib/libSystem.B.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 1345.100.2
compatibility version 1.0.0
Load command 16
          cmd LC_LOAD_DYLIB
      cmdsize 96
         name /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 2420.0.0
compatibility version 300.0.0
Load command 17
          cmd LC_LOAD_DYLIB
      cmdsize 56
         name /usr/lib/libobjc.A.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 228.0.0
compatibility version 1.0.0
Load command 18
          cmd LC_RPATH
      cmdsize 136
         path @loader_path/../../../_solib_darwin_arm64/_U@@_Umain~python~python_U3_U11_Uaarch64-apple-darwin_S_S_Clibpython___Ulib (offset 12)
Load command 19
          cmd LC_RPATH
      cmdsize 168
         path @loader_path/python_libs_linking_test.runfiles/_main/_solib_darwin_arm64/_U@@_Umain~python~python_U3_U11_Uaarch64-apple-darwin_S_S_Clibpython___Ulib (offset 12)
Load command 20
      cmd LC_FUNCTION_STARTS
  cmdsize 16
  dataoff 32968
 datasize 8
Load command 21
      cmd LC_DATA_IN_CODE
  cmdsize 16
  dataoff 32976
 datasize 0
Load command 22
      cmd LC_CODE_SIGNATURE
  cmdsize 16
  dataoff 33168
 datasize 424

It looks like the first LC_LOAD_DYLIB entry (for libpython3.11) is the culprit, but the rpath that Bazel set with the linker is correct.

I'll try to add the test shortly, would appreciate feedback as your time permits.

nicholasjng avatar Mar 28 '24 18:03 nicholasjng

It looks like the first LC_LOAD_DYLIB entry (for libpython3.11) is the culprit, but the rpath that Bazel set with the linker is correct.

Yes, I agree. My guess is something is missing from the linker command to generate an LC_LOAD_DYLIB entry like @rpath/...something..., like mentioned in that web page you linked to. There is also a cc_import() rule; IDK if that will help, but you could try it.

rickeylev avatar Mar 28 '24 18:03 rickeylev

I added a Windows test with Py_LIMITED_API defined to make MSVC look for libs/python3.lib. It basically reproduces the issue I saw in my nanobind repo.

nicholasjng avatar Mar 28 '24 19:03 nicholasjng

@rickeylev I think the new cc_import in https://github.com/bazelbuild/rules_python/pull/1820/commits/c567f70ff2c7d0049c40982a05de06c69db7bdb3 did something, I'm just not sure what the issue is now. The buildkite artifacts don't seem to contain any useful information on the failures.

nicholasjng avatar Apr 08 '24 09:04 nicholasjng

Wow, empty output? That's sort of impressive. I'd have expected like a message about segfaulting or something.

In any case, perhaps line 5 in python_libs_linking_test isn't actually triggering? That C++ code is buildable, but not runnable. That it's getting past the build step means the (build time) linking is happy enough. That it's failing at runtime is expected.

rickeylev avatar Apr 08 '24 16:04 rickeylev

Ah, nice! Do you have a preference on how to proceed? I'm guessing it's not as easy as adding an early return 0, since the compiler might optimize the symbols away.

nicholasjng avatar Apr 08 '24 17:04 nicholasjng

Yeah, that's why the condition it checks is argc -- that's a runtime condition, so it can't optimize them away.

Maybe print out argv or env vars and see if there's something to trigger in there? If not, adding something to args/env of the cc_test target would let us set a signal to check.

rickeylev avatar Apr 08 '24 18:04 rickeylev

Just tried to set an envvar, but that evidently did not work. Am I on the right track?

(I still can't run this test locally because of the macOS dylib failure, btw, so I am a little slow here.)

nicholasjng avatar Apr 08 '24 18:04 nicholasjng