Lots of repositories cause "Argument list too long" for py_binary

Open pauldraper opened this issue 3 years ago • 2 comments

Problem

The Python wrapper is finding every repository in the runfiles tree, creating a huge PYTHONPATH variable, and then crashing when it exceeds the system's env var limits.

I encountered this on Ubuntu 20.04, Python 3.8.

OSError: [Errno 7] Argument list too long: '/root/.cache/bazel/_bazel_root/70833c7af31404a35dcc581d457bd519/execroot/rivethealth_rivet/bazel-out/k8-fastbuild/bin/api/test_it/bin.runfiles/bazel_tools/tools/python/py3wrapper.sh'

A large number (hundreds) of external repositories is quite possible with Maven and npm projects. In my case, I have utility tools written in Python used in the runtime tree of a Node.js binary (that pulls in npm repositories).
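To get a feel for how quickly this adds up, here is a rough sketch that estimates the size of the resulting PYTHONPATH string. It assumes Linux with 4 KiB pages, where (as far as I know) a single argument or environment string is capped at 32 pages, and it assumes the wrapper adds one `<runfiles root>/<repository name>` entry per repository. The function names are made up for illustration.

```python
import os

# Assumption: Linux caps one argv/env string at 32 pages (MAX_ARG_STRLEN),
# which is 131072 bytes with 4 KiB pages. Other platforms differ.
MAX_ARG_STRLEN = 32 * 4096


def pythonpath_bytes(runfiles_root, repo_names):
    """Size of the 'PYTHONPATH=...' environment string, including the trailing NUL."""
    entries = [os.path.join(runfiles_root, name) for name in repo_names]
    env_string = "PYTHONPATH=" + os.pathsep.join(entries)
    return len(env_string.encode()) + 1


def max_repos(runfiles_root, name_length):
    """How many repositories with names of name_length fit under the cap."""
    # Each entry costs: root + "/" + name + one separator (or the final NUL).
    per_entry = len(runfiles_root) + 1 + name_length + 1
    return (MAX_ARG_STRLEN - len("PYTHONPATH=")) // per_entry


if __name__ == "__main__":
    # Hypothetical runfiles root and repository names, for illustration only.
    root = "/root/.cache/bazel/_bazel_root/" + "x" * 32 + "/execroot/ws/bazel-out/k8-fastbuild/bin/app.runfiles"
    names = ["npm_dep_%04d" % i for i in range(1000)]
    size = pythonpath_bytes(root, names)
    print(size, "bytes; over the limit:", size > MAX_ARG_STRLEN)
    print("repositories that fit:", max_repos(root, len(names[0])))
```

The longer the runfiles root, the fewer repositories fit, since every entry repeats it.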

Solution

Ideally, py_binary would only add Python dependencies to the PYTHONPATH. Besides preventing a crash, that would also make module loading more efficient.

Workarounds

import_all

An undocumented option --experimental_python_import_all_repositories=false prevents automatically adding these repositories.

However, it requires specifying the imports attribute, at least in some execution contexts, and of course the general Bazel ecosystem (e.g. rules_docker) doesn't do that.

usercustomize

Use usercustomize to change PYTHONPATH when execv is run.

~/.local/lib/python3.8/site-packages/usercustomize.py

import os

execv = os.execv

def _execv(*args, **kwargs):
    # Only rewrite PYTHONPATH if it is set; otherwise leave the environment alone.
    pythonpath = os.environ.get("PYTHONPATH")
    if pythonpath is not None:
        # remove unnecessary entries from pythonpath
        # ...
        os.environ["PYTHONPATH"] = pythonpath
    return execv(*args, **kwargs)

os.execv = _execv
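The filtering step is left open above. One possible way to do it, purely as an illustration, is to keep only the entries that contain Python sources. The heuristic and the names `has_python_sources` and `prune_pythonpath` are assumptions, not part of Bazel or of the workaround as posted.

```python
import os


def has_python_sources(directory):
    """Hypothetical heuristic: True if any .py or .so file exists under directory."""
    # os.walk yields nothing for a missing directory, so those entries are dropped.
    for _root, _dirs, files in os.walk(directory):
        if any(name.endswith((".py", ".so")) for name in files):
            return True
    return False


def prune_pythonpath(pythonpath, keep=has_python_sources):
    """Drop empty, duplicate, and non-Python entries, preserving order."""
    kept = []
    seen = set()
    for entry in pythonpath.split(os.pathsep):
        if not entry or entry in seen:
            continue
        seen.add(entry)
        if keep(entry):
            kept.append(entry)
    return os.pathsep.join(kept)
```

Inside `_execv`, this would be called as `pythonpath = prune_pythonpath(pythonpath)`. Note that walking every entry costs startup time, and some npm packages ship stray .py files, so a stricter `keep` predicate (for example an allowlist of repository names) may work better in practice.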

pauldraper avatar Jan 25 '22 17:01 pauldraper

This can also happen when the $TMPDIR location is long so that the absolute paths to the import locations are long. Even with only python repos on the PYTHONPATH you can exceed this limit.

hrfuller avatar Feb 07 '22 19:02 hrfuller

This can also happen with the py_test rule when using hermetic Python, for the same reason as in the previous post.

jbedorf avatar Dec 20 '22 13:12 jbedorf

So what would be the solution for this?

arunkant avatar Mar 11 '23 08:03 arunkant

The long term solution is to use the site packages import mechanism for external repos instead of PYTHONPATH.
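As a rough illustration of the idea (not how Bazel does or will do it), the standard `site` module can add import directories at startup from a `.pth` file, so the list lives in a file rather than in an environment variable. The helper name `write_pth`, the file name, and the directory layout below are made up for the sketch.

```python
import os
import site
import sys
import tempfile


def write_pth(site_dir, import_dirs, name="bazel_repos.pth"):
    """Write one import directory per line into a .pth file inside site_dir."""
    pth_path = os.path.join(site_dir, name)
    with open(pth_path, "w") as f:
        for directory in import_dirs:
            f.write(directory + "\n")
    return pth_path


# Demo with throwaway directories standing in for a runfiles tree.
root = os.path.realpath(tempfile.mkdtemp())
site_dir = os.path.join(root, "site")
repo_dir = os.path.join(root, "external_repo")
os.makedirs(site_dir)
os.makedirs(repo_dir)

write_pth(site_dir, [repo_dir])

# addsitedir appends site_dir to sys.path and processes its .pth files;
# each listed directory that exists is appended to sys.path as well.
site.addsitedir(site_dir)
print(repo_dir in sys.path)
```

Because the paths are read from a file, their combined length is not subject to the environment size limit.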

hrfuller avatar Mar 13 '23 16:03 hrfuller

So what would be the solution for this?

@arunkant I included two workarounds in my bug report: https://github.com/bazelbuild/bazel/issues/14640#issue-1114181391

The first is better than the second, unless you are using third-party Python targets, e.g. in rules_docker.

pauldraper avatar Mar 13 '23 19:03 pauldraper

Any workaround for rules_docker?

arunkant avatar Mar 13 '23 19:03 arunkant

The long term solution is to use the site packages import mechanism for external repos instead of PYTHONPATH.

How do I do that? @hrfuller, are there any docs or config I can use? We are migrating our codebase to a monorepo and consolidating all Python deps in one location, and I don't want all that effort to be blocked by this issue.

arunkant avatar Mar 13 '23 19:03 arunkant

@arunkant in that case, see my second workaround: use usercustomize.py. Monkeypatch os.execv to remove unwanted entries from environ["PYTHONPATH"].

pauldraper avatar Mar 13 '23 19:03 pauldraper

Thanks @pauldraper. That looks like a reasonable workaround. I'll try it.

arunkant avatar Mar 13 '23 20:03 arunkant

I hit this issue recently, and it took a while to debug. Is there any plan to fix it?

In fact, if the py_binary target's path relative to the repo root is long, it's easier to hit this limit, because each PYTHONPATH entry becomes longer.

amir-f avatar Sep 13 '23 07:09 amir-f

The way to fix it is to have everyone make their rules compatible with --noexperimental_python_import_all_repositories.

pauldraper avatar Sep 13 '23 19:09 pauldraper

Any workaround for rules_docker?

@arunkant rules_docker merged this which fixes --noexperimental_python_import_all_repositories https://github.com/bazelbuild/rules_docker/pull/2171

AugustKarlstedt avatar Sep 16 '23 01:09 AugustKarlstedt

@arunkant rules_docker merged this which fixes --noexperimental_python_import_all_repositories bazelbuild/rules_docker#2171

If you're running the latest rules_docker release (v0.25.0 from June 2022), you'll have to patch it into your WORKSPACE:

http_archive(
    name = "io_bazel_rules_docker",
    patch_args = ["-p1"],
    patches = [
        "//build/patch:github.com_bazelbuild_rules_docker_pull_2171.patch",
    ],
    sha256 = "b1e80761a8a8243d03ebca8845e9cc1ba6c82ce7c5179ce2b295cd36f7e394bf",
    urls = ["https://github.com/bazelbuild/rules_docker/releases/download/v0.25.0/rules_docker-v0.25.0.tar.gz"],
)

You can grab the patch file with:

% gh pr diff https://github.com/bazelbuild/rules_docker/pull/2171 > github.com_bazelbuild_rules_docker_pull_2171.patch

lazcamus avatar Dec 15 '23 09:12 lazcamus