
pip_install_dependencies doesn't work with private package index repositories => Error downloading from files.pythonhosted.org

Open georgevreilly-stripe opened this issue 1 year ago • 3 comments

🐞 bug report

Affected Rule

The issue is caused by the macro: pip_install_dependencies

Is this a regression?

Presumably not.

Description

On a clean system that does not have egress to the Internet, I get Error downloading [https://files.pythonhosted.org/packages/58/91/17b00d5fac63d3dca605f1b8269ba3c65e98059e1fd99d00283e42a454f0/build-0.10.0-py3-none-any.whl]

I have been able to work around this by patching rules_python in my WORKSPACE.bazel:

http_archive(
    name = "rules_python",
    patch_args = ["-p1"],
    patches = ["//third_party:pythonhosted.patch"],
    sha256 = "e85ae30de33625a63eca7fc40a94fea845e641888e52f32b6beea91e8b1b2793",
    strip_prefix = "rules_python-0.27.1",
    url = "https://github.com/bazelbuild/rules_python/releases/download/0.27.1/rules_python-0.27.1.tar.gz",
)

pythonhosted.patch adjusts the URLs in _RULE_DEPS to use our private Python package index hosted on an internal instance of Artifactory.

This is related to https://github.com/bazelbuild/rules_python/issues/798

🔬 Minimal Reproduction

Without my patch, I see:


georgevreilly@internal:/pay/src/torch21$ ~/bin/bazel clean --expunge
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.

georgevreilly@internal:/pay/src/torch21$ ~/bin/bazel test --repository_cache='' //...
Starting local Bazel server and connecting to it...
INFO: Repository pypi__build instantiated at:
  /pay/src/torch21/WORKSPACE.bazel:12:16: in 
  /pay/home/georgevreilly/.cache/bazel/_bazel_georgevreilly/b060158845e808ff2a9c2fcf0dcfee37/external/rules_python/python/repositories.bzl:63:29: in py_repositories
  /pay/home/georgevreilly/.cache/bazel/_bazel_georgevreilly/b060158845e808ff2a9c2fcf0dcfee37/external/rules_python/python/pip_install/repositories.bzl:139:14: in pip_install_dependencies
  /pay/home/georgevreilly/.cache/bazel/_bazel_georgevreilly/b060158845e808ff2a9c2fcf0dcfee37/external/bazel_tools/tools/build_defs/repo/utils.bzl:233:18: in maybe
Repository rule http_archive defined at:
  /pay/home/georgevreilly/.cache/bazel/_bazel_georgevreilly/b060158845e808ff2a9c2fcf0dcfee37/external/bazel_tools/tools/build_defs/repo/http.bzl:372:31: in 
WARNING: Download from https://files.pythonhosted.org/packages/58/91/17b00d5fac63d3dca605f1b8269ba3c65e98059e1fd99d00283e42a454f0/build-0.10.0-py3-none-any.whl failed: class java.io.IOException Unable to tunnel through proxy. Proxy returns "HTTP/1.1 407 Request rejected by proxy"
ERROR: An error occurred during the fetch of repository 'pypi__build':
   Traceback (most recent call last):
        File "/pay/home/georgevreilly/.cache/bazel/_bazel_georgevreilly/b060158845e808ff2a9c2fcf0dcfee37/external/bazel_tools/tools/build_defs/repo/http.bzl", line 132, column 45, in _http_archive_impl
                download_info = ctx.download_and_extract(
Error in download_and_extract: java.io.IOException: Error downloading [https://files.pythonhosted.org/packages/58/91/17b00d5fac63d3dca605f1b8269ba3c65e98059e1fd99d00283e42a454f0/build-0.10.0-py3-none-any.whl] to /pay/home/georgevreilly/.cache/bazel/_bazel_georgevreilly/b060158845e808ff2a9c2fcf0dcfee37/external/pypi__build/temp14118944566413978481/build-0.10.0-py3-none-any.whl.zip: Unable to tunnel through proxy. Proxy returns "HTTP/1.1 407 Request rejected by proxy"
ERROR: /pay/src/torch21/WORKSPACE.bazel:12:16: fetching http_archive rule //external:pypi__build: Traceback (most recent call last):
        File "/pay/home/georgevreilly/.cache/bazel/_bazel_georgevreilly/b060158845e808ff2a9c2fcf0dcfee37/external/bazel_tools/tools/build_defs/repo/http.bzl", line 132, column 45, in _http_archive_impl
                download_info = ctx.download_and_extract(
Error in download_and_extract: java.io.IOException: Error downloading [https://files.pythonhosted.org/packages/58/91/17b00d5fac63d3dca605f1b8269ba3c65e98059e1fd99d00283e42a454f0/build-0.10.0-py3-none-any.whl] to /pay/home/georgevreilly/.cache/bazel/_bazel_georgevreilly/b060158845e808ff2a9c2fcf0dcfee37/external/pypi__build/temp14118944566413978481/build-0.10.0-py3-none-any.whl.zip: Unable to tunnel through proxy. Proxy returns "HTTP/1.1 407 Request rejected by proxy"
ERROR: /pay/home/georgevreilly/.cache/bazel/_bazel_georgevreilly/b060158845e808ff2a9c2fcf0dcfee37/external/python_deps/torch/BUILD.bazel:8:6: no such package '@python_deps_torch//': no such package '@pypi__build//': java.io.IOException: Error downloading [https://files.pythonhosted.org/packages/58/91/17b00d5fac63d3dca605f1b8269ba3c65e98059e1fd99d00283e42a454f0/build-0.10.0-py3-none-any.whl] to /pay/home/georgevreilly/.cache/bazel/_bazel_georgevreilly/b060158845e808ff2a9c2fcf0dcfee37/external/pypi__build/temp14118944566413978481/build-0.10.0-py3-none-any.whl.zip: Unable to tunnel through proxy. Proxy returns "HTTP/1.1 407 Request rejected by proxy" and referenced by '@python_deps//torch:pkg'
ERROR: Analysis of target '//calculator:calc_test' failed; build aborted:
INFO: Elapsed time: 21.944s
INFO: 0 processes.
ERROR: Couldn't start the build. Unable to run tests
FAILED: Build did NOT complete successfully (44 packages loaded, 2599 targets configured)
    Fetching repository @python_deps_torch; Restarting. 12s

🌍 Your Environment

Operating System:

Ubuntu 20.04

Output of bazel version:

Build label: 6.3.2
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Tue Aug 8 15:48:33 2023 (1691509713)
Build timestamp: 1691509713
Build timestamp as int: 1691509713

Rules_python version:

0.27.1

Anything else relevant?

georgevreilly-stripe avatar Jan 08 '24 18:01 georgevreilly-stripe

Another way to override the URLs of the internal rules_python dependencies is to use the Bazel downloader configuration; see https://blog.aspect.dev/configuring-bazels-downloader

This approach is much more robust and sustainable, and it does not require rule authors to add extra arguments.

Let me know if that works for you.

aignas avatar Jan 08 '24 21:01 aignas
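
For readers following along, here is a minimal sketch of what the downloader configuration suggested above might look like for this case. The Artifactory host and repository path are hypothetical placeholders, and the exact directive syntax should be checked against the documentation for the Bazel version in use (6.3.2 in this report).

```
# bazel_downloader.cfg (file name is an assumption; any path works)
# Redirect PyPI wheel downloads made through repository_ctx.download[_and_extract]
# to an internal mirror. The host and path below are hypothetical.
rewrite files.pythonhosted.org/(.*) artifactory.example.com/artifactory/api/pypi/pypi-remote/$1
```

The file would then be enabled from `.bazelrc`, for example with `common --experimental_downloader_config=bazel_downloader.cfg`. With a rule like this in place, the `pypi__build` fetch shown in the reproduction would be attempted against the mirror instead of files.pythonhosted.org, without patching rules_python.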

@aignas The bazel downloader configuration looks very good, but I assume it does not work for Python requirements, since they are downloaded through pip and not through the repository_ctx.download[_and_extract] function. Am I mistaken?

So what would be the most efficient way to solve this issue for Python requirements? I am investigating injecting a locally created cache that could be configured through a -f option added as extra_pip_args to the install_deps function. Is there a smarter way?

kilian-funk avatar Feb 16 '24 00:02 kilian-funk
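
As an illustration of the extra_pip_args approach raised above, the following is a rough sketch rather than a confirmed fix. It assumes the requirements are set up with `pip_parse` under the repository name `python_deps` (the name that appears in the log output); the lock file label and the index URL are hypothetical.

```
load("@rules_python//python:pip.bzl", "pip_parse")

pip_parse(
    name = "python_deps",
    # Hypothetical label for the pinned requirements file.
    requirements_lock = "//:requirements_lock.txt",
    # Flags passed through to pip. The index URL is a hypothetical
    # internal mirror, not a value taken from this issue.
    extra_pip_args = [
        "--index-url",
        "https://artifactory.example.com/artifactory/api/pypi/pypi-remote/simple",
    ],
)

load("@python_deps//:requirements.bzl", "install_deps")

install_deps()
```

For a locally created wheel cache, pip's `--find-links` (`-f`) option, optionally combined with `--no-index`, could be passed in the same list. Note that this only affects the downloads pip performs; the `pypi__build` wheel in the original report is fetched by `http_archive`, which these flags do not reach.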

#1827 got merged and I think it should work now. Please test and let me know what breaks; this will help enable the feature by default.

aignas avatar Apr 17 '24 08:04 aignas

The downloader setup is present, so everything should be working. Please let me know if that is not the case.

aignas avatar Jun 01 '24 05:06 aignas