rules_py

[Bug]: py_image_layer seems to embed the host version of the libraries in cross built container

Open remiphilippe opened this issue 10 months ago • 2 comments

What happened?

When running a container that was built on macOS (Intel), the container fails on a grpc dependency:

root@3e4086586e34:/# /python/binary/binary_bin
Traceback (most recent call last):
  File "/python/binary/binary_bin.runfiles/_main/python/binary/__main__.py", line 7, in <module>
    from python.binary.credentials import load_credentials
  File "/python/binary/binary_bin.runfiles/_main/python/binary/credentials.py", line 3, in <module>
    from google.cloud import secretmanager
  File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/google/cloud/secretmanager/__init__.py", line 21, in <module>
    from google.cloud.secretmanager_v1.services.secret_manager_service.async_client import (
  File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/google/cloud/secretmanager_v1/__init__.py", line 21, in <module>
    from .services.secret_manager_service import (
  File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/google/cloud/secretmanager_v1/services/secret_manager_service/__init__.py", line 16, in <module>
    from .async_client import SecretManagerServiceAsyncClient
  File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/google/cloud/secretmanager_v1/services/secret_manager_service/async_client.py", line 33, in <module>
    from google.api_core import gapic_v1
  File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/google/api_core/gapic_v1/__init__.py", line 16, in <module>
    from google.api_core.gapic_v1 import config
  File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/google/api_core/gapic_v1/config.py", line 23, in <module>
    import grpc
  File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/grpc/__init__.py", line 22, in <module>
    from grpc import _compression
  File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/grpc/_compression.py", line 20, in <module>
    from grpc._cython import cygrpc
ImportError: cannot import name 'cygrpc' from 'grpc._cython' (/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/grpc/_cython/__init__.py)

The problem is that the layer contains the host (darwin) native libraries instead of the linux amd64 ones:

root@3e4086586e34:/# ls /python/binary/binary_bin.runfiles/.crawler_bin.venv/lib/python3.11/site-packages/grpc/_cython/
__init__.py  _credentials  _cygrpc  cygrpc.cpython-311-darwin.so
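
For comparison, the Linux interpreter inside the container only imports extension modules whose filenames end in one of its own suffixes (for CPython 3.11 on linux/amd64 that is .cpython-311-x86_64-linux-gnu.so, .abi3.so or .so), so the darwin build above can never be found. A minimal check you could run inside the container to confirm the mismatch; the site-packages path is taken from the traceback above and may differ for your binary:

# Compare the suffixes the running interpreter will import against the files
# that were actually packaged into the layer.
import glob
import importlib.machinery

# Typically ['.cpython-311-x86_64-linux-gnu.so', '.abi3.so', '.so'] on linux/amd64.
print(importlib.machinery.EXTENSION_SUFFIXES)

# Path taken from the traceback above; adjust the venv name for your binary.
site = "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages"
for path in glob.glob(f"{site}/grpc/_cython/*.so"):
    # A *-darwin.so name here means a macOS wheel ended up in the image.
    print(path)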

Version

Development (host) and target OS/architectures: macOS x64 (host), targeting linux amd64 and linux arm64

Output of bazel --version: bazel 7.4.1-homebrew

Version of the Aspect rules, or other relevant rules from your WORKSPACE or MODULE.bazel file:

bazel_dep(name = "rules_python", version = "1.1.0", dev_dependency = True)
bazel_dep(name = "rules_python_gazelle_plugin", version = "1.1.0", dev_dependency = True)

bazel_dep(name = "aspect_rules_py", version = "1.1.0")

Language(s) and/or frameworks involved: Python, using the Google Cloud APIs

How to reproduce

parent (//python) BUILD.bazel

load("@gazelle//:def.bzl", "gazelle")
load("@pip//:requirements.bzl", "all_whl_requirements")
load("@rules_python//python:pip.bzl", "compile_pip_requirements")
load("@rules_python//python:py_binary.bzl", "py_binary")
load("@rules_python//python:py_library.bzl", "py_library")
load("@rules_python//python:py_test.bzl", "py_test")
load("@rules_python_gazelle_plugin//manifest:defs.bzl", "gazelle_python_manifest")
load("@rules_python_gazelle_plugin//modules_mapping:def.bzl", "modules_mapping")

# gazelle:map_kind py_library py_library @aspect_rules_py//py:defs.bzl
# gazelle:map_kind py_binary py_binary @aspect_rules_py//py:defs.bzl
# gazelle:map_kind py_test py_test @aspect_rules_py//py:defs.bzl

# This stanza calls a rule that generates targets for managing pip dependencies
# with pip-compile.
compile_pip_requirements(
    name = "requirements",
    src = "requirements.in",
    requirements_txt = "requirements_lock.txt",
)

# This repository rule fetches the metadata for python packages we
# depend on. That data is required for the gazelle_python_manifest
# rule to update our manifest file.
modules_mapping(
    name = "modules_map",
    exclude_patterns = [
        "^_|(\\._)+",  # This is the default.
        "(\\.tests)+",  # Add a custom one to get rid of the psutil tests.
        "^colorama",  # Get rid of colorama on Windows.
        "^tzdata",  # Get rid of tzdata on Windows.
        "^lazy_object_proxy\\.cext$",  # Get rid of this on Linux because it isn't included on Windows.
    ],
    wheels = all_whl_requirements,
)

modules_mapping(
    name = "modules_map_with_types",
    exclude_patterns = [
        "^_|(\\._)+",  # This is the default.
        "(\\.tests)+",  # Add a custom one to get rid of the psutil tests.
        "^colorama",  # Get rid of colorama on Windows.
        "^tzdata",  # Get rid of tzdata on Windows.
        "^lazy_object_proxy\\.cext$",  # Get rid of this on Linux because it isn't included on Windows.
    ],
    include_stub_packages = True,
    modules_mapping_name = "modules_mapping_with_types.json",
    wheels = all_whl_requirements,
)

# Gazelle python extension needs a manifest file mapping from
# an import to the installed package that provides it.
# This macro produces two targets:
# - //python:gazelle_python_manifest.update can be used with `bazel run`
#   to recalculate the manifest
# - //python:gazelle_python_manifest.test is a test target ensuring that
#   the manifest doesn't need to be updated
# This target updates a file called gazelle_python.yaml, and
# requires that file exist before the target is run.
# When you are using gazelle you need to run this target first.
gazelle_python_manifest(
    name = "gazelle_python_manifest",
    modules_mapping = ":modules_map",
    pip_repository_name = "pip",
    tags = ["exclusive"],
)

gazelle_python_manifest(
    name = "gazelle_python_manifest_with_types",
    manifest = "gazelle_python_with_types.yaml",
    modules_mapping = ":modules_map_with_types",
    pip_repository_name = "pip",
    tags = ["exclusive"],
)

# Our gazelle target points to the python gazelle binary.
# This is the simple case where we only need one language supported.
# If you also had proto, go, or other gazelle-supported languages,
# you would also need a gazelle_binary rule.
# See https://github.com/bazelbuild/bazel-gazelle/blob/master/extend.rst#example
# This is the primary gazelle target to run, so that you can update BUILD.bazel files.
# You can execute:
# - bazel run //:gazelle update
# - bazel run //:gazelle fix
# See: https://github.com/bazelbuild/bazel-gazelle#fix-and-update
gazelle(
    name = "gazelle",
    gazelle = "@rules_python_gazelle_plugin//python:gazelle_binary",
)


binary (//python/binary) BUILD.bazel

load("@aspect_rules_py//py:defs.bzl", "py_binary", "py_library", "py_venv", "py_image_layer")
load("@rules_oci//oci:defs.bzl", "oci_image", "oci_image_index", "oci_load", "oci_push")
load("@rules_pkg//:pkg.bzl", "pkg_tar")
load("@pip//:requirements.bzl", "requirement")

# gazelle:resolve py binary //python/binary
# gazelle:resolve py binary.binary //python/binary:binary_lib
# gazelle:resolve py binary.credentials //python/binary:binary_lib
# gazelle:resolve py binary.extractors.base //python/binary:binary_lib

py_library(
    name = "binary_lib",
    srcs = [
        "__init__.py",
        "binary.py",
    ],
    visibility = ["//visibility:public"],
    deps = [
        requirement("google_auth"),
        requirement("google_cloud_secret_manager"),
        requirement("google_cloud_tasks"),
        requirement("langchain"),
        requirement("pillow"),
        requirement("pydantic"),
        requirement("structlog"),
    ],
)

py_binary(
    name = "binary_bin",
    srcs = ["__main__.py"],
    main = "__main__.py",
    visibility = ["//visibility:public"],
    deps = [
        ":binary_lib",
        requirement("python_dotenv"),
    ],
)

py_venv(
    name = "venv",
    deps = [
        ":binary_bin",
        ":binary_lib",
    ],
)

py_image_layer(
    name = "layer_linux_x86_64",
    binary = ":binary_bin",
    platform = "//build/platforms:linux-x86_64",
)

py_image_layer(
    name = "layer_linux_arm64",
    binary = ":binary_bin",
    platform = "//build/platforms:linux-aarch64",
)

oci_image(
    name = "image_linux_amd64",
    base = "@ubuntu_linux_amd64",
    entrypoint = ["/python/binary/binary_bin"],
    tars = [":layer_linux_x86_64"],
)

oci_image(
    name = "image_linux_arm64",
    base = "@ubuntu_linux_arm64_v8",
    entrypoint = ["/python/binary/binary_bin"],
    tars = [":layer_linux_arm64"],
)

oci_image_index(
    name = "image",
    images = [
        ":image_linux_arm64",
        ":image_linux_amd64",
    ],
)

oci_load(
    name = "tarball",
    format = "oci",
    image = ":image",
    repo_tags = ["binary:latest"],
)

filegroup(
    name = "image.tar",
    srcs = [":tarball"],
    output_group = "tarball",
)

oci_push(
    name = "push",
    image = ":image",
    remote_tags = ["latest"],
    repository = "gcr.io/myproject/binary",
)


and MODULE.bazel (relevant section only)

bazel_dep(name = "gazelle", version = "0.41.0")
#
bazel_dep(name = "rules_oci", version = "2.2.0")

oci.pull(
    name = "ubuntu",
    digest = "sha256:80dd3c3b9c6cecb9f1667e9290b3bc61b78c2678c02cbdae5f0fea92cc6734ab",
    image = "ubuntu",
    platforms = [
        "linux/arm64/v8",
        "linux/amd64",
    ],
    tag = "latest",
)
use_repo(oci, "ubuntu", "ubuntu_linux_amd64", "ubuntu_linux_arm64_v8")

# Python toolchain configuration
python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(
    configure_coverage_tool = True,
    is_default = True,
    python_version = "3.11",
)

# pip dependencies management
pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip")
pip.parse(
    hub_name = "pip",
    python_version = "3.11",
    requirements_lock = "//python:requirements_lock.txt",
    experimental_target_platforms = [
        "linux_x86_64",
        "linux_aarch64",
    ],
)
use_repo(pip, "pip")

Any other information?

I tried with and without experimental_target_platforms; it made no difference.

remiphilippe avatar Jan 29 '25 22:01 remiphilippe

If it's an RTFM issue I'm happy to look into it, but so far I can't get it to work.

remiphilippe avatar Feb 04 '25 17:02 remiphilippe

@remiphilippe oci_image and oci_load are actually producing a tarball on your local machine and then loading it into the docker daemon.

So any C extensions like grpc._cython will have a platform-specific suffix and will not be found on the import path when running on Linux, on top of any issues with coupling to the macOS ABI or producing dylibs.

You need to run the build on the same kind of machine you're planning to deploy to, or figure out how to work with something like pycross (maybe?). The former is simplest.

Your best bet IMO is using QEMU or similar and a small pet VM on your local machine that you use when you want to operate in a Linux environment. Or maybe some remote build execution setup that interops with a local VM? I personally haven't gone that deep. I just resolved that if you have Python C extensions in any of your deps or transitives, you build on the right machine.
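
A rough way to see whether your dependency closure contains compiled extensions at all is to scan site-packages for shared objects; any hits mean the wheels have to match the platform the image will run on. A minimal sketch, assuming you run it with (or point it at) the environment you want to inspect:

# Rough scan for native extension modules; any hits mean the wheels must be
# resolved for the target platform, not the host.
import glob
import sysconfig

# platlib is the platform-specific site-packages of the interpreter running this
# script; substitute the venv path from the runfiles tree to inspect that instead.
site = sysconfig.get_paths()["platlib"]
hits = sorted(glob.glob(f"{site}/**/*.so", recursive=True))  # .pyd on Windows
print(f"{len(hits)} compiled extension file(s) under {site}")
for path in hits[:20]:
    print(path)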

z3z1ma avatar Feb 18 '25 20:02 z3z1ma

You actually need a transition to tell Bazel about the platform that you are targeting. I see you did that with py_image_layer#platform, but that still does not solve the issue for pip packages. You need rules_pycross to get your pip dependencies to work.

thesayyn avatar Apr 25 '25 18:04 thesayyn