uv fails to use extra index url
I'm passing an extra index URL, but uv seems to only find package versions from PyPI.org.
Here is a reproduction:
# Create an experimentation directory.
mkdir repro-uv-extra-urls
cd repro-uv-extra-urls
# Run a local PyPI-like server.
mkdir dists
pipx run pypiserver run dists --disable-fallback -p8000 -a. -P. &>/dev/null &
# Create a pyproject.toml for a project called "ruff", version 1000.
cat <<EOF >pyproject.toml
[project]
name = "ruff"
version = "1000"
description = "Ruff from the future."
authors = [{name = "Charlie Marsh", email = "[email protected]"}]
readme = "README.md"
requires-python = ">=3.8"
classifiers = ["Development Status :: 1 - Planning"]
EOF
# Create a README.md file.
cat <<EOF >README.md
# Ruff
Hello.
EOF
# Build Python distributions for this package.
pipx run --spec build pyproject-build
# Upload both wheel and sdist to our local PyPI-like index.
pipx run twine upload -u "" -p "" --repository-url http://localhost:8000 dist/*
# Assert dists were uploaded.
[ ! -f dists/ruff-1000-py3-none-any.whl ] && echo "Wheel not uploaded"
[ ! -f dists/ruff-1000.tar.gz ] && echo "Source distribution not uploaded"
# Create a venv.
uv venv --seed
# Assert uv fails to install ruff==1000
uv pip install --extra-index-url http://localhost:8000/simple ruff==1000 && echo "Working, not expected" || echo "Failing, as expected"
# Assert pip manages to install ruff==1000
.venv/bin/pip install --extra-index-url http://localhost:8000/simple ruff==1000 && echo "Working, as expected" || echo "Failing, not expected"
My guess is that we're finding ruff on PyPI, so we then know about the Ruff versions from PyPI. But when we fail to find ruff==1000, we don't go back and look for Ruff in any extra indexes.
I actually don't know what the "right" behavior is here. My guess is that we're "supposed" to look at both indexes (though the order in which indexes are searched is not guaranteed in pip).
(Would this explanation match what you saw in practice?)
Yes! Currently, I expect pip to look into every specified index to satisfy the dependency specification (without order/precedence).
Yeah we definitely don't do that right now -- we take the "first match" -- which seems like it might be incorrect.
Thanks for the clear write-up!
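To make the difference between "first match" and "look at both indexes" concrete, here is a toy sketch. It is not uv's or pip's actual code: the function names are invented, and the PyPI versions listed for ruff are only illustrative.

```python
# Toy model, not uv's or pip's actual code. Index contents are illustrative.
INDEXES = {
    "https://pypi.org/simple": {"ruff": ["0.2.1", "0.2.2"]},
    "http://localhost:8000/simple": {"ruff": ["1000"]},
}


def first_match(package, wanted, indexes):
    """Use only the first index that lists the package at all."""
    for url, packages in indexes.items():
        if package in packages:
            # The search stops here, even if the wanted version is missing.
            return url if wanted in packages[package] else None
    return None


def search_all(package, wanted, indexes):
    """Check every index for the wanted version before giving up."""
    for url, packages in indexes.items():
        if wanted in packages.get(package, []):
            return url
    return None


print(first_match("ruff", "1000", INDEXES))  # None
print(search_all("ruff", "1000", INDEXES))   # http://localhost:8000/simple
```

With the first strategy, ruff==1000 is never found because PyPI already "answered" for ruff, which matches the failure in the reproduction.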
> I actually don't know what the "right" behavior is here. My guess is that we're "supposed" to look at both indexes (though the order in which indexes are searched is not guaranteed in pip).
The way we use the extra index URL, it only contains the internal packages that are not available from PyPI. If a package is on both indices, we obviously would like to select the one in the extra index regardless of version, as otherwise this would break our code, or worse, someone could inject a dependency.
pip does not offer a secure way to give priority to one of the URLs; indeed, the order is not guaranteed. It will pick the highest version in the case of a collision. As a workaround, one can start prefixing the version, e.g. with the year (2024.0.1), or rename the package altogether (cumbersome, and no guarantee that the name won't be claimed later).
Poetry solves this by being able to set a priority per source:
[[tool.poetry.source]]
name = "pypi"
priority = "primary"
and even a source per package:
httpx = { version = "^0.22", source = "internal-pypi" }
Since security is at stake here, I would hope we can consider offering this guarantee in some way, or deviating from pip's default behaviour.
Edit: +1 for #171
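To illustrate the "highest version wins" collision and the year-prefix workaround mentioned above, here is a rough sketch. It assumes purely numeric versions; the package name acme-utils and the internal URL are made up, and the selection function is a simplification rather than pip's real resolver.

```python
def parse(version):
    """Turn a purely numeric version such as '2024.0.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def highest_across(package, indexes):
    """Return (version, index_url) for the highest version found on any index."""
    candidates = [
        (version, url)
        for url, packages in indexes.items()
        for version in packages.get(package, [])
    ]
    return max(candidates, key=lambda item: parse(item[0]), default=None)


PUBLIC = "https://pypi.org/simple"
INTERNAL = "https://internal.example/simple"  # hypothetical URL

# Someone registers the internal name publicly with a huge version number.
collision = {INTERNAL: {"acme-utils": ["1.4.0"]}, PUBLIC: {"acme-utils": ["99.0.0"]}}
print(highest_across("acme-utils", collision))

# Year-prefixed internal versions win, but only until a higher public one appears.
prefixed = {INTERNAL: {"acme-utils": ["2024.0.1"]}, PUBLIC: {"acme-utils": ["99.0.0"]}}
print(highest_across("acme-utils", prefixed))
```

The second case only holds as long as nobody publishes, say, 9999.0.0 publicly, which is why the prefix is a workaround and not a guarantee.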
I'll explain my use-case since uv might want to deviate from what pip does (for good reasons) :slightly_smiling_face:
My projects follow a sponsorware strategy, where there's a public version, and a private version with more features. Sponsorships above a certain amount per month grant access to these private repositories on GitHub. To simplify local development, as well as allowing contributors without access to these private versions, I use a local index to store built distributions of these private projects. Details here: https://pawamoy.github.io/pypi-insiders/. It means I can specify some-project in my dependencies, instead of hardcoding git+https://[email protected]/org/private-repo. pip is then able to fetch packages from both PyPI.org and my local index, if configured as such (pip's config file, env var, cli flag, etc.).
Current situation with pip:
- pip (or other package managers) will search for the highest compatible version of the dependency (some-project above) in both indices (PyPI.org and http://localhost:XXXX).
- Since the private projects I'm developing (or using: I'm not the only one following this sponsorware strategy) use a versioning scheme like major.minor.patch for public versions, and major.minor.patch.pmajor.pminor.ppatch where pmajor.pminor.ppatch is the private version (based on the public one, as a fork), private versions are always "higher" in terms of versioning than the public ones. So pip fetches private versions from the local index, unless there's an even higher public version on PyPI.org without its private equivalent (for example 2.0.0 on PyPI.org, but only 1.2.3.1.0.0 and no 2.0.0.1.0.0 in the local index).
- This is documented as a limitation: you cannot enforce usage of a private version if there's a higher public version.
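The ordering described in the list above can be checked with a small sketch. It only approximates PEP 440 ordering for purely numeric release segments (trailing zeros are ignored, so 1.2.3 and 1.2.3.0 compare equal); release_key is a helper invented for this example.

```python
def release_key(version):
    """Comparable key for a purely numeric version, ignoring trailing zeros."""
    parts = [int(part) for part in version.split(".")]
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


public = "1.2.3"
private = "1.2.3.1.0.0"   # private fork 1.0.0 based on public 1.2.3
newer_public = "2.0.0"    # has no private equivalent yet

# The private version sorts above the public version it is based on...
print(release_key(private) > release_key(public))        # True
# ...but below a newer public release, which is the documented limitation.
print(release_key(newer_public) > release_key(private))  # True
print(max([public, private, newer_public], key=release_key))  # 2.0.0
```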
This leads me to this desired situation with uv:
- If uv is capable of enforcing the fetching of distributions from a specific index (when they exist in this index), configurable per user, this will lift the limitation above: I will be able to enforce the use of private versions even if there are higher public versions :+1: Emphasis on per user configuration: I do not want to specify relevant packages, and I do not want to add any configuration in pyproject.toml or other per-project tracked file. A simple prefer-index = "http://localhost:8000/simple/" in ~/.config/uv/conf.toml (totally invented) would be perfect. This prefer-index setting would be no-op if it is not also passed as an extra-index-url. This way it lets me choose whether to fetch packages from my preferred index, or not when I need to test public versions:
  - no configured extra index: install from main index
  - configured extra index, no configured preferred index: install most compatible version (depending on solving strategy), so install from one or the other indices (look into both)
  - configured extra index, configured preferred index: if package exists on preferred index, install most compatible from this index only, otherwise install from main index
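The three cases above could be sketched roughly as follows. This is only an illustration of the proposal: prefer_extra stands in for the invented prefer-index setting, "most compatible" is reduced to "highest numeric version", and none of the names reflect real uv options.

```python
def release_key(version):
    """Comparable key for a purely numeric version, ignoring trailing zeros."""
    parts = [int(part) for part in version.split(".")]
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def pick(package, main, extra=None, prefer_extra=False):
    """Return (version, source) following the three cases of the proposal."""
    main_pool = [(v, "main") for v in main.get(package, [])]
    if extra is None:
        pool = main_pool                  # case 1: no extra index configured
    else:
        extra_pool = [(v, "extra") for v in extra.get(package, [])]
        if prefer_extra and extra_pool:
            pool = extra_pool             # case 3: preferred index has the package
        else:
            pool = main_pool + extra_pool  # case 2, or case 3 without a match
    return max(pool, key=lambda item: release_key(item[0]), default=None)


MAIN = {"some-project": ["1.2.3", "2.0.0"], "other": ["0.1.0"]}
EXTRA = {"some-project": ["1.2.3.1.0.0"]}

print(pick("some-project", MAIN))                            # main index only
print(pick("some-project", MAIN, EXTRA))                     # both indexes
print(pick("some-project", MAIN, EXTRA, prefer_extra=True))  # preferred index only
```

Passing prefer_extra=True without an extra index behaves like case 1, which mirrors the "no-op unless also passed as an extra-index-url" condition.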
Note that in my use-case, there are no security concerns, as there is no concept of internal versus public projects, with the latter being able to shadow the former. Projects are the same in both indices, just with different versions. For use-cases actually involving internal packages, I do understand the need to enforce fetching specific packages from a specific index, and I believe the mentioned prefer-index setting above would solve that too?
Maybe case 2 above shouldn't be supported at all, and then we would just need to reverse the semantics of extra-index-url, without needing any new config option: give extra-index-url precedence over the main index. If you can keep the order of multiple extra-index-urls, then give them precedence according to this order. But if an extra index is not reachable, or does not contain a package, do not fail and fall back to the next one (maybe not secure enough though, as you never want to fall back for internal packages).
Let me know if anything was unclear!
In short:
- I want to specify multiple indices
- I want uv to use them in that order
- If a package isn't found in the first index, I want uv to fall back onto the next index
- Other users might want to prevent falling back to the next index for specific packages
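A minimal sketch of that summary, with an opt-out from fallback for selected packages. The no_fallback parameter, the index names and the package internal-tool are all hypothetical, and "pinned to the first index" is just one possible reading of "prevent falling back".

```python
def resolve_ordered(package, ordered_indexes, no_fallback=frozenset()):
    """Return the name of the first index (in order) that has the package.

    Packages listed in `no_fallback` may only come from the first index.
    """
    for name, packages in ordered_indexes:
        if package in packages:
            return name
        if package in no_fallback:
            # Refuse to look further: falling back could pull in a squatted name.
            raise LookupError(f"{package!r} not found on {name!r} and fallback is disabled")
    return None


ORDERED = [
    ("local", {"some-project"}),
    ("pypi", {"some-project", "requests", "internal-tool"}),
]

print(resolve_ordered("some-project", ORDERED))  # local
print(resolve_ordered("requests", ORDERED))      # pypi
```

Calling resolve_ordered("internal-tool", ORDERED, no_fallback={"internal-tool"}) raises LookupError instead of silently installing the public package of the same name.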
Please consider dependency confusion attacks: https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610
Using --extra-index-url the way it presently works is a security vulnerability.
PEP 708 is a yet-to-be-implemented approach to improving the security posture.
Of course, failing to install X with no way to resolve the error is pretty awful. So PEP 708 instructs clients to provide a means of users to explicitly configure X to come from a given set of repositories, but does not specify that means because how configuration is handled is a client level decision.
So pip and uv pip should fail when there is ambiguity. Is it already clear how pip will implement repository selection? It seems not: https://github.com/pypa/pip/issues/8606#issuecomment-1776000697
From PEP 708 discussion
I am intentionally mentioning this comment because these 2 issues are related and can lead to significant security problems.
https://github.com/astral-sh/uv/issues/171#issuecomment-1951663263
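As a loose illustration of "fail when there is ambiguity", the sketch below refuses to resolve a project that appears on several indexes unless those indexes are explicitly linked. The tracks field here is a simplified stand-in inspired by the discussion, not PEP 708's actual metadata format, and every name in the code is an assumption.

```python
class AmbiguousProject(Exception):
    """Raised when unrelated indexes both claim the same project name."""


def sources_for(package, indexes):
    """Return the index URLs allowed to serve `package`, or raise on ambiguity."""
    found = {
        url: projects[package]
        for url, projects in indexes.items()
        if package in projects
    }
    # An index is "linked" if it declares that it tracks another index in `found`.
    unlinked = [
        url
        for url, meta in found.items()
        if not set(meta.get("tracks", ())) & (found.keys() - {url})
    ]
    if len(found) > 1 and len(unlinked) > 1:
        raise AmbiguousProject(f"{package!r} found on unrelated indexes: {sorted(unlinked)}")
    return sorted(found)


PYPI = "https://pypi.org/simple"
LOCAL = "http://localhost:8000/simple"

# The local project declares that it tracks the PyPI one: both are acceptable.
linked = {PYPI: {"some-project": {}}, LOCAL: {"some-project": {"tracks": [PYPI]}}}
print(sources_for("some-project", linked))

# No link between the two: a client following this idea would stop with an error.
unrelated = {PYPI: {"some-project": {}}, LOCAL: {"some-project": {}}}
try:
    sources_for("some-project", unrelated)
except AmbiguousProject as error:
    print(error)
```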
I don't want to claim this as a general alternative to "--extra-index-url", but it does often work for the common scenario of a single package on a different index.
One can use direct URL references like so:
python -m pip install 'SomeProject@https://my.package.repo/SomeProject-1.2.3-py33-none-any.whl'
Another option I thought about, to avoid repeating pip's insecure mistake, might be to rename the flag to --insecure-extra-index-url, so that at the very least users are warned that they may be vulnerable to dependency confusion attacks and should carefully consider the implications of what they are doing.
I've quickly read PEP 708, and I definitely support it instead of ordered indexes. So you can discard my previous comments stating "what I want". With PEP 708 I'll be able to set private projects to "track" the same ones on PyPI.org. This should give me what I want. pypiserver actually has a "fallback" feature which is possibly already solving my use-case needs (though depending on its implementation could cancel the perfs offered by uv, I'll see and report back).
pypiserver by default falls back to PyPI.org when it can't find the specified project within its own distributions. If it finds the package within its own dists, it does not look into PyPI.org. So if I allow it to fall back, and I point uv at it, I get what I wanted: my private, local packages take precedence over packages on PyPI.org, even if more recent versions are on PyPI.org :tada: Perfs are good :slightly_smiling_face:
Tagging latest maintainer @dee-me-tree-or-love in case that's of interest to them :smile:
This feature is important for Cloudsmith private package repositories. In the example below we have a release repository for use in production and a development repository for dev environments.
Thanks for working on this!
pip install \
--extra-index-url=https://dl.cloudsmith.io/${CLOUDSMITH_RELEASE_SECRET}/acme/release/python/index/ \
--extra-index-url=https://dl.cloudsmith.io/${CLOUDSMITH_DEVELOPMENT_SECRET}/acme/development/python/index/ \
-r requirements.txt
I've got a similar situation here as well. torch==2.2.0+cpu is only available from a special index provided by PyTorch, and uv fails to resolve the package there too.
➜ src git:(main) ✗ UV_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cpu uv -v pip compile --no-build -n requirements.in
uv::requirements::from_source source=requirements.in
uv_interpreter::python_query::find_python selector=Default
0.003066s 0ms DEBUG uv_interpreter::interpreter Detecting markers for: /usr/bin/python3
0.027836s DEBUG uv::commands::pip_compile Using Python 3.10.12 interpreter at /usr/bin/python3 for builds
0.028172s DEBUG uv_client::registry_client Using registry request timeout of 300s
uv_client::flat_index::from_entries
uv_resolver::resolver::solve
0.029227s 0ms DEBUG uv_resolver::resolver Solving with target Python version 3.10.12
uv_resolver::resolver::choose_version package=root
uv_resolver::resolver::get_dependencies package=root, version=0a0.dev0
0.029321s 0ms DEBUG uv_resolver::resolver Adding direct dependency: torch==2.2.0+cpu
0.029335s 0ms DEBUG uv_resolver::resolver Adding direct dependency: requests*
0.029340s 0ms DEBUG uv_resolver::resolver Adding direct dependency: numpy*
0.029344s 0ms DEBUG uv_resolver::resolver Adding direct dependency: pandas*
uv_resolver::resolver::choose_version package=torch
uv_resolver::resolver::package_wait package_name=torch
uv_resolver::resolver::process_request request=Versions torch
uv_client::registry_client::simple_api package=torch
uv_client::cached_client::get_cacheable
uv_client::cached_client::read_and_parse_cache file=/tmp/.tmpGTB7d6/simple-v2/pypi/torch.rkyv
uv_resolver::resolver::process_request request=Versions requests
uv_client::registry_client::simple_api package=requests
uv_client::cached_client::get_cacheable
uv_client::cached_client::read_and_parse_cache file=/tmp/.tmpGTB7d6/simple-v2/pypi/requests.rkyv
uv_resolver::resolver::process_request request=Versions numpy
uv_client::registry_client::simple_api package=numpy
uv_client::cached_client::get_cacheable
uv_client::cached_client::read_and_parse_cache file=/tmp/.tmpGTB7d6/simple-v2/pypi/numpy.rkyv
uv_resolver::resolver::process_request request=Versions pandas
uv_client::registry_client::simple_api package=pandas
uv_client::cached_client::get_cacheable
uv_client::cached_client::read_and_parse_cache file=/tmp/.tmpGTB7d6/simple-v2/pypi/pandas.rkyv
uv_resolver::resolver::process_request request=Prefetch pandas *
uv_resolver::resolver::process_request request=Prefetch numpy *
uv_resolver::resolver::process_request request=Prefetch requests *
uv_resolver::resolver::process_request request=Prefetch torch ==2.2.0+cpu
0.029852s 0ms DEBUG uv_client::cached_client No cache entry for: https://pypi.org/simple/torch/
uv_client::cached_client::fresh_request url="https://pypi.org/simple/torch/"
0.029967s 0ms DEBUG uv_client::cached_client No cache entry for: https://pypi.org/simple/pandas/
uv_client::cached_client::fresh_request url="https://pypi.org/simple/pandas/"
0.030016s 0ms DEBUG uv_client::cached_client No cache entry for: https://pypi.org/simple/requests/
uv_client::cached_client::fresh_request url="https://pypi.org/simple/requests/"
0.030088s 0ms DEBUG uv_client::cached_client No cache entry for: https://pypi.org/simple/numpy/
uv_client::cached_client::fresh_request url="https://pypi.org/simple/numpy/"
uv_client::cached_client::new_cache file=/tmp/.tmpGTB7d6/simple-v2/pypi/torch.rkyv
uv_client::registry_client::parse_simple_api package=torch
uv_client::cached_client::new_cache file=/tmp/.tmpGTB7d6/simple-v2/pypi/pandas.rkyv
uv_client::registry_client::parse_simple_api package=pandas
uv_client::cached_client::new_cache file=/tmp/.tmpGTB7d6/simple-v2/pypi/requests.rkyv
uv_client::registry_client::parse_simple_api package=requests
uv_client::cached_client::new_cache file=/tmp/.tmpGTB7d6/simple-v2/pypi/numpy.rkyv
uv_client::registry_client::parse_simple_api package=numpy
uv_resolver::version_map::from_metadata
uv_resolver::version_map::from_metadata
uv_distribution::distribution_database::get_or_build_wheel_metadata dist=requests==2.31.0
uv_client::registry_client::wheel_metadata built_dist=requests==2.31.0
uv_resolver::version_map::from_metadata
uv_client::cached_client::get_serde
uv_client::cached_client::get_cacheable
uv_client::cached_client::read_and_parse_cache file=/tmp/.tmpGTB7d6/wheels-v0/pypi/requests/requests-2.31.0-py3-none-any.msgpack
0.065767s 36ms DEBUG uv_resolver::resolver Searching for a compatible version of torch (==2.2.0+cpu)
0.065781s 36ms DEBUG uv_resolver::resolver No compatible version found for: torch
× No solution found when resolving dependencies:
╰─▶ Because there is no version of torch==2.2.0+cpu and you require torch==2.2.0+cpu, we can conclude that the requirements are unsatisfiable.