Fixing search path of bootstrapped Python toolchain dylib is not hermetic
Change #2089 breaks multi-platform builds. The contents of `libpython.dylib` differ depending on which host machine runs the build, because the check `if "darwin" in platform and "osx" == repo_utils.get_platforms_os_name(rctx)` inspects the host machine, not the execution platform. If this change has to happen conditionally, it needs to happen in a build target instead.
```
$ diffoscope /tmp/local.libpython3.11.dylib /tmp/remote.libpython3.11.dylib
--- /tmp/local.libpython3.11.dylib
+++ /tmp/remote.libpython3.11.dylib
├── arm64
│ ├── otool -arch arm64 -h {}
│ │ @@ -1,3 +1,3 @@
│ │  Mach header
│ │        magic  cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
│ │ -  0xfeedfacf 16777228          0  0x00           6    32       4544 0x00100085
│ │ +  0xfeedfacf 16777228          0  0x00           6    32       4552 0x00100085
│ ├── otool -arch arm64 -L {}
│ │ @@ -1,8 +1,8 @@
│ │ -  @rpath/libpython3.11.dylib (compatibility version 3.11.0, current version 3.11.0)
│ │ +  /install/lib/libpython3.11.dylib (compatibility version 3.11.0, current version 3.11.0)
│ │    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.61.1)
│ │    /usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
│ │    /usr/lib/libpanel.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
│ │    /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 2202.0.0)
│ │    /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
│ │    /System/Library/Frameworks/SystemConfiguration.framework/Versions/A/SystemConfiguration (compatibility version 1.0.0, current version 1296.60.3)
```
Originally posted by @brentleyjones in https://github.com/bazel-contrib/rules_python/issues/2089#issuecomment-3234021683
I wrote some internal rule for this:
"""Process a shared library, fixing its install name"""
load("@bazel_skylib//lib:paths.bzl", "paths")
load("@bazel_tools//tools/cpp:toolchain_utils.bzl", "find_cpp_toolchain", "use_cpp_toolchain")
load("@rules_cc//cc/common:cc_common.bzl", "cc_common")
load("@rules_cc//cc/common:cc_info.bzl", "CcInfo")
def _impl(ctx):
if len(ctx.attr.deps) != 1:
fail("Only one dep is currently supported")
cc_toolchain = find_cpp_toolchain(ctx)
feature_configuration = cc_common.configure_features(
ctx = ctx,
cc_toolchain = cc_toolchain,
requested_features = ctx.features,
unsupported_features = ctx.disabled_features,
)
target = "macos"
if ctx.target_platform_has_constraint(ctx.attr._linux_constraint[platform_common.ConstraintValueInfo]):
target = "linux"
dep = ctx.attr.deps[0]
compilation_context = dep[CcInfo].compilation_context
linking_context = dep[CcInfo].linking_context
library_names = []
libraries = []
outputs = []
for linker_input in linking_context.linker_inputs.to_list():
for library in linker_input.libraries:
if not library.dynamic_library:
fail("Expected shared library:", library)
library_name = library.dynamic_library.basename
library_names.append(library_name)
if ctx.attr.libraries and library_name not in ctx.attr.libraries:
continue
output = ctx.actions.declare_file(paths.join(ctx.attr.output_prefix, library_name))
outputs.append(output)
libraries.append(
cc_common.create_library_to_link(
actions = ctx.actions,
cc_toolchain = cc_toolchain,
feature_configuration = feature_configuration,
dynamic_library = output,
),
)
ctx.actions.run_shell(
tools = [
ctx.executable._install_name_tool,
ctx.executable._patchelf,
],
inputs = [library.dynamic_library],
outputs = [output],
command = """
set -euo pipefail
cp $1 $2
chmod u+w $2
if [[ "{target}" == "macos" ]]; then
{install_name_tool} -id @rpath/{libname} $2
else
{patchelf} --set-soname {libname} $2
fi
""".format(
install_name_tool = ctx.executable._install_name_tool.path,
libname = library_name,
patchelf = ctx.executable._patchelf.path,
target = target,
),
arguments = [
library.dynamic_library.path,
output.path,
],
)
if not outputs:
fail("Must process at least 1 library. Options: ", library_names)
return [
DefaultInfo(files = depset(outputs)),
CcInfo(
compilation_context = compilation_context,
linking_context = cc_common.create_linking_context(
linker_inputs = depset([
cc_common.create_linker_input(
owner = ctx.label,
libraries = depset(libraries),
),
]),
),
),
]
install_name = rule(
implementation = _impl,
attrs = {
"deps": attr.label_list(providers = [CcInfo]),
"output_prefix": attr.string(default = ""),
"libraries": attr.string_list(default = []),
"_install_name_tool": attr.label(
default = "@llvm-project//llvm:llvm-install-name-tool",
cfg = "exec",
executable = True,
),
"_patchelf": attr.label(
default = "@patchelf//:patchelf",
cfg = "exec",
executable = True,
),
"_linux_constraint": attr.label(default = Label("@platforms//os:linux")),
},
provides = [CcInfo],
toolchains = use_cpp_toolchain(),
fragments = ["cpp"],
)
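For reference, usage of the rule might look like this (the `.bzl` path and target names here are hypothetical):

```starlark
load("//tools:install_name.bzl", "install_name")  # hypothetical location of the rule above

# Produce a copy of libpython3.11.dylib whose install name is rewritten;
# :libpython_import is assumed to be a cc_import/cc_library wrapping the dylib.
install_name(
    name = "libpython_patched",
    deps = [":libpython_import"],
    libraries = ["libpython3.11.dylib"],
    output_prefix = "patched",
)
```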
But the hard part is getting a binary for `llvm-install-name-tool`.
Patching should happen in the build phase.
Yes. The fix here is straightforward:
- Add keith's rule to the hermetic runtime setup macro code
- Instead of putting the source file `lib/python.dylib` into the `cc_library`, put the output of the patching rule. Guard this with a `select()` for mac.
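A rough sketch of that wiring, assuming the patching rule above and illustrative target/file names (not the actual ones in rules_python's generated BUILD file):

```starlark
# Rewrite the install name of the prebuilt dylib (macOS-only concern).
install_name(
    name = "libpython_patched",
    deps = [":libpython_import"],
    libraries = ["libpython3.11.dylib"],
)

cc_library(
    name = "libpython",
    srcs = select({
        # On macOS, link against the copy whose install name is @rpath/...
        "@platforms//os:macos": [":libpython_patched"],
        # Elsewhere, the prebuilt library is already correct as shipped.
        "//conditions:default": ["lib/libpython3.11.dylib"],
    }),
)
```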
> getting `llvm-install-name-tool`
Is it not in CcToolchain somewhere? IIRC, there's an `all_files` attribute of CcToolchain. I've had to dig around in there to get some of the more esoteric tools. Otherwise, maybe someone on a CC or Mac channel on Slack would know.
> Is it not in CcToolchain somewhere? IIRC, there's an "all_file" attribute of CcToolchain. I've had to dig around in there to get some of the more esoteric tools. Otherwise, maybe someone on a CC or Mac channel on slack would know.
On macOS the toolchain exposes the host tools from Xcode, and that has it. It would probably be ideal if it were fetched from some other toolchain type since, as you can see in our case above, we build that tool from source instead and avoid the Xcode dep that way. On Linux it definitely doesn't exist unless the user has a custom toolchain with LLVM tools.
One hitch that might come up: we might need to copy the Python executable. My suspicion is that when Bazel symlinks the Python executable from runfiles back to the repo, Python will use the repo location, not runfiles, as its home. Making the interpreter itself a copy should prevent that.
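If that suspicion holds, the copy could be a plain genrule (illustrative names; the real runtime layout may differ):

```starlark
# Materialize the interpreter as a real file instead of a symlink, so
# Python derives its home from the runfiles copy rather than the repo.
genrule(
    name = "python3_copy",
    srcs = ["bin/python3"],
    outs = ["copied/python3"],
    cmd = "cp $< $@ && chmod +x $@",
    executable = True,
)
```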
FWIW, I also think we should just fix this upstream in the prebuilt version, considering it does this for Linux correctly and there's no value in the current path since it will always be invalid. I imagine it was an oversight.
Yeah, +1 to that
https://github.com/astral-sh/python-build-standalone/pull/914