rules_python
[Proposal] Supporting custom index urls
🚀 feature request
Relevant Rules
Existing rules:
- pip_install
- pip_parse
Description
Related to https://github.com/bazelbuild/rules_python/issues/74, but this proposal only covers customizing the index urls.
We'd like the rules to support customizing, through rule attributes, the index urls used to locate packages. We're willing to contribute the PR for this, but first we'd like to see community agreement on the interface and confirmation that such a change would be accepted.
The problem is that pip will only search the default index for packages unless one of the workarounds below is employed. Many enterprises depend on a private index to distribute internal libraries, so they are left with one of those workarounds, and from a dev-ops perspective it is difficult to instruct every project to apply them. A documented, prescribed way to configure the index urls makes the setting discoverable and makes it easier to write macros that preset it.
This does not cover authentication because the bazel rules I have seen so far have centered their authentication around the .netrc file, which I think is a good idea. pip already supports authenticating from a .netrc file, and since rules_python delegates package discovery/installation to pip, this works out great.
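To make the .netrc approach concrete, here is a minimal sketch of what an entry looks like and how credentials are matched by host, using Python's standard netrc module. The host, login, and password are made-up placeholders, and lookup_credentials is a hypothetical helper; this only illustrates the file format and host-based lookup, not how pip performs the lookup internally.

```python
import netrc
import os
import tempfile

# Hypothetical .netrc entry; host and credentials are placeholders.
NETRC_CONTENT = "machine private.domain login build-bot password s3cret\n"


def lookup_credentials(netrc_text, host):
    """Return (login, password) for host, or None if no entry matches."""
    # Write the text to a temporary file so the real ~/.netrc is never touched.
    with tempfile.NamedTemporaryFile("w", suffix=".netrc", delete=False) as handle:
        handle.write(netrc_text)
        path = handle.name
    try:
        entry = netrc.netrc(path).authenticators(host)
    finally:
        os.remove(path)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password


print(lookup_credentials(NETRC_CONTENT, "private.domain"))
```

Because the credentials are keyed by host, the index url in the rule attributes can stay free of secrets.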
Describe the solution you'd like
Rules that interact with an index should accept the urls to use. If not specified, the default behavior will mimic pip and use https://pypi.org/simple. If specified, the list is absolute: it replaces the default rather than extending it, so users will need to include the default index if they want to add their own alongside it. This also permits not using the public index at all, which is sometimes desirable if you want dependency requests to go through a virtual index.
Proposed interface to set the index urls:
    pip_install(
        ...
        index_urls = [
            "https://private.domain/artifactory/my-repository",
            # This could be exported as a constant
            "https://pypi.org/simple",
        ],
    )

    pip_parse(
        index_urls = [
            "https://private.domain/artifactory/my-repository",
            "https://pypi.org/simple",
        ],
    )
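One way the proposed index_urls attribute could be translated into pip's existing --index-url and --extra-index-url options is sketched below. The function name and the "first entry becomes --index-url" convention are assumptions for illustration only; nothing here is part of rules_python.

```python
def index_urls_to_pip_args(index_urls):
    """Translate a list of index urls into pip command-line arguments.

    Assumed convention: the first url is passed as --index-url and every
    remaining url as --extra-index-url. An empty list adds no arguments,
    which leaves pip on its default index.
    """
    if not index_urls:
        return []
    args = ["--index-url", index_urls[0]]
    for url in index_urls[1:]:
        args += ["--extra-index-url", url]
    return args


print(index_urls_to_pip_args([
    "https://private.domain/artifactory/my-repository",
    "https://pypi.org/simple",
]))
```

With this mapping, leaving https://pypi.org/simple out of the list means pip is never pointed at the public index, which matches the "absolute" behavior described above.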
Describe alternatives you've considered
Alternative 1
Consider the following:
    index_urls = [],
    extra_index_urls = [
        "https://private.domain/artifactory/my-repository",
    ]
This maps more closely to the pip options. However, the difference between the two attributes is less clear when reading the rule, especially to someone who doesn't already use pip.
Additionally, running queries becomes more complex because you'd now have to query two attributes to find all possible dependency sources, which is a legitimate use case for our security team.
Workaround 1
One way of installing dependencies from a private index is by re-using an existing configuration in your ~/.pip/pip.conf file. However, this requires every bazel command to be invoked with --action_env=PIP_CONFIG_FILE=$HOME/.pip/pip.conf. This is not only burdensome to remember, it is also incompatible with a remote build cache: the environment variable's value is unique to each user, which makes sharing a build cache impossible.
This workaround also keeps the index urls out of the build configuration, so the build is not hermetic.
Workaround 2
Another way to install dependencies from a private index today is to include --extra-index-url https://my.domain/pypi-local/simple in the requirements.in (or .txt). This has better ergonomics than workaround 1, but the index urls still cannot be queried the way they can in rules_jvm_external. They also can't be supplied automatically via a macro if a company wanted to wrap the rules_python rules to inject company-wide defaults.
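With an index_urls attribute as proposed above, a wrapper could preset the company index while still letting individual projects override it. The following is a minimal sketch of that merging logic; COMPANY_INDEX_URLS and with_company_defaults are hypothetical names, and index_urls is the proposed attribute, not one that exists today.

```python
# Hypothetical company-wide default; the url is the placeholder from the proposal.
COMPANY_INDEX_URLS = [
    "https://private.domain/artifactory/my-repository",
]


def with_company_defaults(**kwargs):
    """Return rule keyword arguments with index_urls preset unless the caller set it."""
    merged = dict(kwargs)
    # Copy the default list so callers cannot mutate the shared constant.
    merged.setdefault("index_urls", list(COMPANY_INDEX_URLS))
    return merged


print(with_company_defaults(name="pip_deps"))
```

A wrapping macro would then forward the merged arguments to the underlying rule, so the index urls stay visible as a rule attribute rather than being hidden in a requirements file.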
I love this proposal. In my environment all artifacts must be scanned and vetted before deployment to production. Being able to specify where our Python projects pull their external dependencies from gives me and my security team the ability to leverage the tools & automation from our repository applications in a more fluid way.
I personally like the initial solution example. If the index_urls attr is left blank, have it default to "pypi.org". Otherwise, only query the sources specified within index_urls:
    pip_install(
        ...
        index_urls = [
            "https://private.domain/artifactory/my-repository",
            # This could be exported as a constant
            "https://pypi.org/simple",
        ],
    )
I really need this.
Did you try using extra_pip_args to provide the custom parameters for pip? It seems to be working well:
https://github.com/bazelbuild/rules_python/blob/main/examples/pip_install/WORKSPACE#L16
    pip_install(
        ...
        extra_pip_args = [
            "--index-url", "<your-custom-urls>",
            "--extra-index-url", "<your-extra-index-urls>",
        ],
    )
Also, could you please explain how Workaround 1 could work? In the latest version pip runs with --isolated, which would ignore those environment variables.
This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!
This issue is nearly all I need. What I can't figure out is how to securely pass user authentication into the index url?
For example:
    pip_install(
        ...
        extra_pip_args = [
            "--index-url", "https://{}:{}@my.pypi.repo/simple".format(username, password),
        ],
    )
Usually I'd like to do this with an environment variable. I know I can write a .bzl file repository rule to export the value... but I can't figure out how to access the string in the WORKSPACE file?
Any ideas? Help very much appreciated.
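Independent of how the values reach the WORKSPACE file, one detail worth noting is that credentials embedded in a url need percent-encoding, or characters such as "@" and "/" will break the url. The sketch below assumes the credentials live in two environment variables, PYPI_USERNAME and PYPI_PASSWORD; those names and the helper are hypothetical. The .netrc route mentioned in the proposal avoids putting credentials in the url at all.

```python
import os
from urllib.parse import quote


def authenticated_index_url(host_and_path, env=os.environ):
    """Build an https index url with percent-encoded credentials.

    PYPI_USERNAME and PYPI_PASSWORD are assumed variable names, chosen
    only for this illustration.
    """
    username = quote(env["PYPI_USERNAME"], safe="")
    password = quote(env["PYPI_PASSWORD"], safe="")
    return "https://{}:{}@{}".format(username, password, host_and_path)


example_env = {"PYPI_USERNAME": "build-bot", "PYPI_PASSWORD": "p@ss/word"}
print(authenticated_index_url("my.pypi.repo/simple", example_env))
```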
This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"