[Bug]: py_image_layer seems to embed the host version of the libraries in a cross-built container
What happened?
When trying to run a container that was built on macOS (Intel), the container fails on a grpc dependency:
root@3e4086586e34:/# /python/binary/binary_bin
Traceback (most recent call last):
File "/python/binary/binary_bin.runfiles/_main/python/binary/__main__.py", line 7, in <module>
from python.binary.credentials import load_credentials
File "/python/binary/binary_bin.runfiles/_main/python/binary/credentials.py", line 3, in <module>
from google.cloud import secretmanager
File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/google/cloud/secretmanager/__init__.py", line 21, in <module>
from google.cloud.secretmanager_v1.services.secret_manager_service.async_client import (
File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/google/cloud/secretmanager_v1/__init__.py", line 21, in <module>
from .services.secret_manager_service import (
File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/google/cloud/secretmanager_v1/services/secret_manager_service/__init__.py", line 16, in <module>
from .async_client import SecretManagerServiceAsyncClient
File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/google/cloud/secretmanager_v1/services/secret_manager_service/async_client.py", line 33, in <module>
from google.api_core import gapic_v1
File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/google/api_core/gapic_v1/__init__.py", line 16, in <module>
from google.api_core.gapic_v1 import config
File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/google/api_core/gapic_v1/config.py", line 23, in <module>
import grpc
File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/grpc/__init__.py", line 22, in <module>
from grpc import _compression
File "/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/grpc/_compression.py", line 20, in <module>
from grpc._cython import cygrpc
ImportError: cannot import name 'cygrpc' from 'grpc._cython' (/python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/grpc/_cython/__init__.py)
The problem is that the dependency is packaged with the host (macOS) libraries instead of the linux/amd64 ones:
root@3e4086586e34:/# ls /python/binary/binary_bin.runfiles/.binary_bin.venv/lib/python3.11/site-packages/grpc/_cython/
__init__.py _credentials _cygrpc cygrpc.cpython-311-darwin.so
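For comparison, a linux/amd64 wheel of grpcio ships the Linux variant of the extension module (the name follows CPython's SOABI tag), e.g. cygrpc.cpython-311-x86_64-linux-gnu.so.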
Version
Development (host) and target OS/architectures: macOS x64 (Intel), targeting linux/amd64 and linux/arm64
Output of bazel --version:
bazel 7.4.1-homebrew
Version of the Aspect rules, or other relevant rules from your WORKSPACE or MODULE.bazel file:
bazel_dep(name = "rules_python", version = "1.1.0", dev_dependency = True)
bazel_dep(name = "rules_python_gazelle_plugin", version = "1.1.0", dev_dependency = True)
bazel_dep(name = "aspect_rules_py", version = "1.1.0")
Language(s) and/or frameworks involved: Python, using the Google Cloud APIs
How to reproduce
parent (//python) BUILD.bazel
load("@gazelle//:def.bzl", "gazelle")
load("@pip//:requirements.bzl", "all_whl_requirements")
load("@rules_python//python:pip.bzl", "compile_pip_requirements")
load("@rules_python//python:py_binary.bzl", "py_binary")
load("@rules_python//python:py_library.bzl", "py_library")
load("@rules_python//python:py_test.bzl", "py_test")
load("@rules_python_gazelle_plugin//manifest:defs.bzl", "gazelle_python_manifest")
load("@rules_python_gazelle_plugin//modules_mapping:def.bzl", "modules_mapping")
# gazelle:map_kind py_library py_library @aspect_rules_py//py:defs.bzl
# gazelle:map_kind py_binary py_binary @aspect_rules_py//py:defs.bzl
# gazelle:map_kind py_test py_test @aspect_rules_py//py:defs.bzl
# This stanza calls a rule that generates targets for managing pip dependencies
# with pip-compile.
compile_pip_requirements(
name = "requirements",
src = "requirements.in",
requirements_txt = "requirements_lock.txt",
)
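# (Not part of the original file:) compile_pip_requirements also generates a
# `//python:requirements.update` target; running it with `bazel run` after
# editing requirements.in regenerates requirements_lock.txt.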
# This repository rule fetches the metadata for python packages we
# depend on. That data is required for the gazelle_python_manifest
# rule to update our manifest file.
modules_mapping(
name = "modules_map",
exclude_patterns = [
"^_|(\\._)+", # This is the default.
"(\\.tests)+", # Add a custom one to get rid of the psutil tests.
"^colorama", # Get rid of colorama on Windows.
"^tzdata", # Get rid of tzdata on Windows.
"^lazy_object_proxy\\.cext$", # Get rid of this on Linux because it isn't included on Windows.
],
wheels = all_whl_requirements,
)
modules_mapping(
name = "modules_map_with_types",
exclude_patterns = [
"^_|(\\._)+", # This is the default.
"(\\.tests)+", # Add a custom one to get rid of the psutil tests.
"^colorama", # Get rid of colorama on Windows.
"^tzdata", # Get rid of tzdata on Windows.
"^lazy_object_proxy\\.cext$", # Get rid of this on Linux because it isn't included on Windows.
],
include_stub_packages = True,
modules_mapping_name = "modules_mapping_with_types.json",
wheels = all_whl_requirements,
)
# Gazelle python extension needs a manifest file mapping from
# an import to the installed package that provides it.
# This macro produces two targets:
# - //python:gazelle_python_manifest.update can be used with `bazel run`
# to recalculate the manifest
# - //python:gazelle_python_manifest.test is a test target ensuring that
# the manifest doesn't need to be updated
# This target updates a file called gazelle_python.yaml, and
# requires that file exist before the target is run.
# When you are using gazelle you need to run this target first.
gazelle_python_manifest(
name = "gazelle_python_manifest",
modules_mapping = ":modules_map",
pip_repository_name = "pip",
tags = ["exclusive"],
)
gazelle_python_manifest(
name = "gazelle_python_manifest_with_types",
manifest = "gazelle_python_with_types.yaml",
modules_mapping = ":modules_map_with_types",
pip_repository_name = "pip",
tags = ["exclusive"],
)
# Our gazelle target points to the python gazelle binary.
# This is the simple case where we only need one language supported.
# If you also had proto, go, or other gazelle-supported languages,
# you would also need a gazelle_binary rule.
# See https://github.com/bazelbuild/bazel-gazelle/blob/master/extend.rst#example
# This is the primary gazelle target to run, so that you can update BUILD.bazel files.
# You can execute:
# - bazel run //:gazelle update
# - bazel run //:gazelle fix
# See: https://github.com/bazelbuild/bazel-gazelle#fix-and-update
gazelle(
name = "gazelle",
gazelle = "@rules_python_gazelle_plugin//python:gazelle_binary",
)
binary (//python/binary) BUILD.bazel
load("@aspect_rules_py//py:defs.bzl", "py_binary", "py_library", "py_venv", "py_image_layer")
load("@rules_oci//oci:defs.bzl", "oci_image", "oci_image_index", "oci_load", "oci_push")
load("@rules_pkg//:pkg.bzl", "pkg_tar")
load("@pip//:requirements.bzl", "requirement")
# gazelle:resolve py binary //python/binary
# gazelle:resolve py binary.binary //python/binary:binary_lib
# gazelle:resolve py binary.credentials //python/binary:binary_lib
# gazelle:resolve py binary.extractors.base //python/binary:binary_lib
py_library(
name = "binary_lib",
srcs = [
"__init__.py",
"binary.py",
],
visibility = ["//visibility:public"],
deps = [
requirement("google_auth"),
requirement("google_cloud_secret_manager"),
requirement("google_cloud_tasks"),
requirement("langchain"),
requirement("pillow"),
requirement("pydantic"),
requirement("structlog"),
],
)
py_binary(
name = "binary_bin",
srcs = ["__main__.py"],
main = "__main__.py",
visibility = ["//visibility:public"],
deps = [
":binary_lib",
requirement("python_dotenv"),
],
)
py_venv(
name = "venv",
deps = [
":binary_bin",
":binary_lib",
],
)
py_image_layer(
name = "layer_linux_x86_64",
binary = ":binary_bin",
platform = "//build/platforms:linux-x86_64",
)
py_image_layer(
name = "layer_linux_arm64",
binary = ":binary_bin",
platform = "//build/platforms:linux-aarch64",
)
oci_image(
name = "image_linux_amd64",
base = "@ubuntu_linux_amd64",
entrypoint = ["/python/binary/binary_bin"],
tars = [":layer_linux_x86_64"],
)
oci_image(
name = "image_linux_arm64",
base = "@ubuntu_linux_arm64_v8",
entrypoint = ["/python/binary/binary_bin"],
tars = [":layer_linux_arm64"],
)
oci_image_index(
name = "image",
images = [
":image_linux_arm64",
":image_linux_amd64",
],
)
oci_load(
name = "tarball",
format = "oci",
image = ":image",
repo_tags = ["binary:latest"],
)
filegroup(
name = "image.tar",
srcs = [":tarball"],
output_group = "tarball",
)
oci_push(
name = "push",
image = ":image",
remote_tags = ["latest"],
repository = "gcr.io/myproject/binary",
)
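The //build/platforms package referenced by the py_image_layer targets isn't shown in the report; a minimal sketch of what it presumably contains, using the standard constraint values (names assumed from the labels above):
platform(
    name = "linux-x86_64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
    ],
)

platform(
    name = "linux-aarch64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:aarch64",
    ],
)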
and MODULE.bazel (relevant section only)
bazel_dep(name = "gazelle", version = "0.41.0")
bazel_dep(name = "rules_oci", version = "2.2.0")
# extension that provides oci.pull
oci = use_extension("@rules_oci//oci:extensions.bzl", "oci")
oci.pull(
name = "ubuntu",
digest = "sha256:80dd3c3b9c6cecb9f1667e9290b3bc61b78c2678c02cbdae5f0fea92cc6734ab",
image = "ubuntu",
platforms = [
"linux/arm64/v8",
"linux/amd64",
],
tag = "latest",
)
use_repo(oci, "ubuntu", "ubuntu_linux_amd64", "ubuntu_linux_arm64_v8")
# Python toolchain configuration
python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(
configure_coverage_tool = True,
is_default = True,
python_version = "3.11",
)
# pip dependencies management
pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip")
pip.parse(
hub_name = "pip",
python_version = "3.11",
requirements_lock = "//python:requirements_lock.txt",
experimental_target_platforms = [
"linux_x86_64",
"linux_aarch64",
],
)
use_repo(pip, "pip")
Any other information?
Tried with and without experimental_target_platforms; it made no difference.
If it's an RTFM issue I'm happy to look into it, but so far I can't get it to work.
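One way to confirm what actually got packed into the layer (the exact output path is a guess; use whatever bazel build reports):
bazel build //python/binary:layer_linux_x86_64
# look for the gRPC C extension: a correct cross-build would contain the
# x86_64-linux-gnu variant instead of the darwin one
tar -tf bazel-bin/python/binary/layer_linux_x86_64.tar | grep cygrpc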
@remiphilippe oci_image and oci_load actually produce a tarball on your local machine and then load it into the Docker daemon.
So any C extensions like grpc._cython will have a platform-specific suffix and will not be found on the import path when running on Linux, on top of any issues with coupling to the macOS ABI or producing dylibs.
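You can see which suffix a given interpreter expects with:
python3 -c "import sysconfig; print(sysconfig.get_config_var('EXT_SUFFIX'))"
which prints .cpython-311-darwin.so on macOS and .cpython-311-x86_64-linux-gnu.so on Linux x86_64.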
You need to run the build on the same kind of machine as the one you're planning to deploy to, or figure out how to work with something like pycross (maybe?). The former is simplest.
Your best bet IMO is using QEMU or similar and a small pet VM on your local machine that you use when you want to operate in a Linux environment. Or maybe some remote build runner setup that interops with a local VM? But I personally haven't gone that deep. I've just resolved that if you have Python C extensions in any of your deps or transitive deps, you build on the right machine.
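A minimal sketch of that workaround using the Linux VM that Docker Desktop already runs (the image name and invocation are illustrative, not from this thread):
docker run --rm \
  -v "$PWD":/workspace -w /workspace \
  gcr.io/bazel-public/bazel:latest \
  build //python/binary:image_linux_amd64
Inside the container the host platform is linux/amd64, so pip resolves Linux wheels and the C extensions match the deployment target.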
You actually need a transition to tell Bazel about the platform you are targeting. I see you did that with py_image_layer#platform, but that still does not solve the issue for pip packages. You need something like rules_pycross to get your pip dependencies to work.