Add `--only-deps` (and `--only-build-deps`) option(s)
https://github.com/pypa/pip/issues/11440#issuecomment-1445119899 is the currently agreed-upon user-facing design for this feature.
What's the problem this feature will solve?
In #8049, we identified a use case for installing just the dependencies from pyproject.toml.
As described in the solution section below, `--only-deps=<spec>` would determine all dependencies of `<spec>`, excluding that package itself, and install those without installing the package. It could be used to
- allow specifying environment variables that are active only while building a package of interest (without them being active while potentially building its dependencies).
- separate dependency installation from building and installing a package, allowing the package to be rebuilt in a Docker build while the dependency installation step is loaded from cache.
This example shows both use cases:
```dockerfile
# copy project metadata and install (only) dependencies
COPY pyproject.toml /myproj/
WORKDIR /myproj/
RUN pip install --extra-index-url="$PIP_INDEX" --only-deps=.[floob]

# copy project source files, build in a controlled environment and install our package
COPY src/mypkg/ /myproj/src/mypkg/
RUN env SETUPTOOLS_SCM_PRETEND_VERSION=2.0.2 python3 -m build --no-isolation --wheel
RUN pip install --no-cache-dir --no-dependencies dist/*.whl
```
Instead of the solution from #8049, @pradyunsg prefers a solution similar to the one below: https://github.com/pypa/pip/issues/8049#issuecomment-1079882786
Describe the solution you'd like
One of those two, or similar:

- (used in the example above) `--only-deps` would work like `-r`, in that it's not a flag globally modifying pip's behavior but a CLI option with one argument that can be specified multiple times. Unlike `-r`, it accepts a dependency spec and not a path to a file containing dependency specs. Where `pip install <spec>` first installs all dependencies and then (builds and) installs the package referred to by the spec itself, `pip install --only-deps=<spec>` would only install the dependencies.
- `--only-deps` would work like `--[no|only]-binary`, in that it requires an argument specifying which package not to install. A placeholder like `:requested:` could be used, e.g. `pip install --only-deps=:requested: .[floob]`.
Alternative Solutions
- Re-using `-r` instead of adding `--only-deps`. I don't think this is a good idea, since people would be tempted to do `-r pyproject.toml`, which would be wrong (dependency specs including file paths look like `./path/to/pkg[extra1,extra2]`).
- Making `--only-deps` a global switch modifying pip's behavior, like e.g. `--pre`. I have found that global switches like that are dangerous and not very intuitive. To install a dev version of your package, doing `pip install --pre mypkg` seems innocuous but will actually install dev versions of `mypkg` and all of its dependencies that have any dev versions. It's safer to do something like `pip install 'mypkg>=0.1.post0.dev0'` to limit dev version installations to one package. Similarly, it's unclear what a `--only-deps` switch would apply to. Would `pip install -r reqs.txt --only-deps` install the dependencies of every package specified in the file but none of those packages?
- Using e.g. beni to convert PEP 621 dependencies to a requirements.txt. This works even today, but it feels like it shouldn't be necessary, as it involves quite a few steps, including writing a file to disk (a rough sketch of those steps follows below).
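For illustration, here's roughly what that conversion detour involves (a hand-rolled sketch, not beni's actual interface; it assumes static PEP 621 metadata and Python 3.11+ for tomllib):

```python
# Convert static PEP 621 dependencies into a requirements.txt: the
# round-trip through disk that --only-deps would make unnecessary.
import tomllib

with open("pyproject.toml", "rb") as f:
    project = tomllib.load(f)["project"]

if "dependencies" in project.get("dynamic", []):
    raise SystemExit("dynamic dependencies would need the build backend")

with open("requirements.txt", "w") as f:
    f.write("\n".join(project.get("dependencies", [])) + "\n")

# ...followed by: pip install -r requirements.txt
```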
Thanks for filing this @flying-sheep!
I wonder if it would be better for `--only-deps` to mirror how `--no-deps` behaves.
If I interpret the spartan docs for `--no-deps` correctly, it doesn't take an argument. I added motivation to "Alternative Solutions" on why I think `--only-deps` should accept an argument. Do you disagree? If yes: why? If no (and you didn't change your mind), it seems like I didn't understand what you mean: in what way should `--only-deps` work like `--no-deps`?
There’s actually yet another possibility: make `--only-deps` a global switch that takes one single value, like `--no-binary` and `--only-binary`.

I think I personally like `--only-deps=<names>` best, followed by `--only-deps=<spec>` (proposed in this PR), and a value-less `--only-deps` last (and we really should find a way to make `--no-deps` work the same as `--no-binary`).
I see, those options are modifiers that pick out individual packages from the flattened dependency list and modify pip’s behavior towards those packages. So one would do:
```shell
cd mypkg  # project dir
pip install --only-deps=mypkg .[floob]
```
I think it makes sense regarding consistency with --[no|only]-binary, but isn’t 100% practical, as the only use case that came up so far is the one above, so users will always have to specify both dist name and relative path to the project.
It may make sense to create wildcard names, e.g. `:requested:`, to simplify the UX somewhat. `--no-binary` etc. have `:all:`, which of course does not make sense for `--only-deps`, but can be a good inspiration.
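To make the intended semantics concrete, here's a toy sketch (purely illustrative, not pip internals; the package names are made up):

```python
# Under --only-deps=:requested:, pip would resolve requirements as usual
# and then skip installing exactly the packages named on the command line.
def to_install(resolved: set[str], requested: set[str]) -> set[str]:
    return resolved - requested

# Resolving "mypkg[floob]" might yield mypkg plus its (extra) dependencies;
# the names below are hypothetical.
print(sorted(to_install({"mypkg", "numpy", "floobber"}, {"mypkg"})))
# -> ['floobber', 'numpy']
```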
This would be extremely useful to prepare lambda layers, or any kind of "pre-provided" environment, whilst keeping the exact requirements (including locks) properly versioned in a git repository. The target environment could then be replicated with ease, e.g. when developing locally, or when testing.
An example pyproject.toml follows:
```toml
[build-system]
requires = [
    "setuptools >= 45",
    "wheel",
]
build-backend = "setuptools.build_meta"

[project]
name = "my-lambda"
requires-python = ">= 3.7"
version = "0.1.0"

# Again, this is just an example!
[project.optional-dependencies]
provided = [
    "typing-extensions >= 4",
    "requests ~= 2.23.0",
    "requests_aws4auth ~= 0.9",
    "boto3 ~= 1.13.14",
    "certifi >= 2020.4.5.1",
    "elasticsearch ~= 7.7.0",
    "elasticsearch_dsl ~= 7.2.0",
    "aws_requests_auth ~= 0.4.2",
]
pre-commit = [
    'nox >= 2022.1',
    'pytest >= 7.1.2',
    'black[d] >= 22',
    'mypy >= 0.950',
    'pre-commit >= 2.17.0',
    'flake8 >= 4; python_version >= "3.8"',
    'flake8 < 4; python_version < "3.8"',
    'pydocstyle[toml] >= 6.1.1',
    'isort >= 5.10.1',
]
```
Then, when creating a new "provided" environment (e.g. a lambda layer):
```shell
# Must be run in a similar environment as the target one.
# The advantage of this over `pip download` is the ability
# to mix source and binary distributions, whenever
# necessary (e.g. downloading both numpy and pyspark).
# Could also add locks/pinning, via `--constraint`.
mkdir -p dist/python
pip3 install \
    .[provided] \
    --target dist/python \
    --only-deps=:requested:
( cd dist && zip -r ../dist.provided.zip ./python )
```
And in a development or CI-like environment:
```shell
# May be cached.
python3 -m venv venv
source ./venv/bin/activate

# Gets all development tools, and anything used to run
# automated tasks.
# Could also add locks/pinning, via `--constraint`.
./venv/bin/pip3 install -e .[provided,pre-commit]
```
@uranusjr Sure, I’m not married to the semantics I suggested. I’m fine with your design. Now we just need someone to implement it lol.
Would it be reasonable to have --only-deps work only when exactly one top level requirement is provided, and fail otherwise?
> Would it be reasonable to have `--only-deps` work only when exactly one top level requirement is provided, and fail otherwise?

Yes.
I'm not sure I understand `:requested:` and trying to pick out individual packages. My 2c here is that there are multiple keys in pyproject.toml, and those keys are the only things to be selecting on. The keys are: `dependencies`, `requires` (under `[build-system]`), and then `optional-dependencies`, which may have multiple entries. It'd be great to spell out how all those get installed. Something like:
```shell
pip install . --only-deps dependencies
pip install . --only-deps requires
pip install . --only-deps doc  # for a "doc" key under optional-dependencies
```
And the two comments above say that if you need two of those, you need multiple invocations of pip rather than providing two key names at once?
> `pip install . --only-deps doc`
IMO, `pip install --only-deps .[doc]` has clearer syntax, especially when not using a PEP 621-style pyproject.toml-based dependency declaration.

Nonetheless, I agree that we don't need the additional complexity here.
Okay, so:
```shell
pip install --only-deps .       # dependencies
pip install --only-deps .[doc]  # optional-dependencies `doc`
```
And just to confirm, `pip install --only-deps .[requires]` for the build requirements (meaning it's the optional-dependencies syntax, but `requires` is reserved and people shouldn't use a `requires` key under optional-dependencies)?
I'd suggest a small variation, to avoid overloading the [extras] syntax.
```shell
pip install --only-deps .        # project.dependencies
pip install --only-deps .[doc]   # project.dependencies and project.optional-dependencies.doc
pip install --only-build-deps .  # build-system.requires
```
Maybe there are situations where you only want to install the optional dependencies without the project dependencies. So I would extend @sbidoul's suggestion with:
```shell
pip install --only-optional-deps .[doc]  # only project.optional-dependencies.doc
```
> Maybe there are situations where you only want to install the optional dependencies without the project dependencies.

I'm not comfortable with that. Indeed, extras are additive to the base dependencies by definition, so such a mechanism sounds a bit awkward to me.
The desire for this feature just came up in a discussion at work around whether dependencies should be recorded in requirements.txt (since that's where pip has historically put them) or in pyproject.toml (as that's backed by a spec). And the problem with the latter is that it requires you to set up a build system so you can do `pip install -e .`, even if you didn't need one and just want to install stuff that you wrote down somewhere.

The motivating scenario is beginners who have started coding, have some script or package going (i.e., not doing a src/ layout), and now they want to install something (assume we will create a virtual environment for them). We would like to teach people to record what they install and follow standards where we can, but to use pyproject.toml we need to also get a build system chosen and set up, which is another level of complicated. Otherwise we would need to forgo standards, go with requirements.txt, and decide what the appropriate workflow is in that scenario (e.g., do you record in requirements.txt, requirements-dev.txt, dev-requirements.txt, requirements/dev.txt, etc.?).
Without a build system there's no way for pip to determine what the dependencies are; pip doesn't (AFAIK, and it shouldn't) read dependencies from pyproject.toml (other than build dependencies). It just looks for a build system and then asks the build system what the dependencies are. It is the responsibility of the build system to read pyproject.toml and determine what (if any) dependencies there are.
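For what it's worth, that "ask the build system" step is exposed as a utility by the build project; here's a rough sketch (assuming the third-party `build` package, 0.7 or newer, is installed):

```python
# Invoke the project's PEP 517 build backend to produce metadata, then
# read the dependency list from it. This is the general-case path: it
# also works for dynamic dependencies, at the cost of running the backend.
from build.util import project_wheel_metadata

meta = project_wheel_metadata(".")  # calls the backend's metadata hook
for requirement in meta.get_all("Requires-Dist") or []:
    print(requirement)
```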
@brettcannon In the use case, what is the motivation behind not including `.` in the installation?
> Without a build system there's no way for pip to determine what the dependencies are

If `dependencies` was listed in `dynamic` I would agree, but if it's not, then how is it different from a requirements file?
> It is the responsibility of the build system to read `pyproject.toml` and determine what (if any) dependencies there are.

Are you saying we need to come up with a separate standard to list arbitrary dependencies like requirements files allow for (this is not the same as a lock file to me; think of it as the input file of your top-level dependencies for a lock file generation tool)?
> what is the motivation behind not including `.` in the installation?
Beginners don't typically need it (i.e., they aren't doing a src/ layout), and I personally don't want to try and educate a beginner trying to install something on why they suddenly need to select a pure Python build backend to record what they installed and how to make an appropriate decision.
> If `dependencies` was listed in `dynamic` I would agree, but if it's not, then how is it different from a requirements file?
Yes, it's technically true that if the pyproject.toml states that the dependencies are non-dynamic, then reading that file for the data is valid. But during the discussions on PEP 621, the idea of relying on metadata from pyproject.toml like this was pretty controversial. I don't have a specific reference, but I do recall being told that the intended use was very much for backends to read the data, not for it to be the canonical source (I proposed PEP 643 mainly because reading pyproject.toml from the sdist was "not the intended use").
I understand the use case, and in a practical sense, getting the data from pyproject.toml would work (until someone wants it to work with dynamic dependencies, and we have a debate over why that is out of scope). But I don't think it's the right way for pip to go.
Is there a reason this couldn't be an external tool?
```python
# Warning: code has only been lightly tested!
import subprocess
import sys
import tomllib  # Python 3.11+

FILE = "pyproject.toml"

with open(FILE, "rb") as f:
    data = tomllib.load(f)
if "project" not in data:
    raise ValueError("No PEP 621 metadata in pyproject.toml")
if "dependencies" in data["project"].get("dynamic", []):
    raise ValueError("Dependencies cannot be dynamic")
deps = data["project"].get("dependencies")
if deps:
    cmd = [sys.executable, "-m", "pip", "install", *deps]
    subprocess.run(cmd, check=True)
```
> Is there a reason this couldn't be an external tool?
Everything can be an external tool 😉, but at least for VS Code we have a general policy of trying not to introduce our own custom tooling so people can operate outside of VS Code without issue. So if we created an install-from-pyproject and started to record what (primarily) beginners wanted to be installed in pyproject.toml, then that workflow suddenly becomes rather specific to us as we are now driving the workflow instead of the user.
If the answer from pip is, "use requirements files," then I totally understand and accept that as that's pip's mechanism for this sort of thing. But it also means that I will probably have to develop some new standard for feeding dependencies into a lock file tool since right now it seems like all that tool could take is what's provided on the command-line (although ironically pip-tools now works with pyproject.toml, so using the file for this sort of thing might be decided for us 😅).
My impression from reading the above is that the real hurdle is actually making the code a proper Python package (with the build system etc. defined), not installing only the dependencies of a Python package. The latter is in itself an entirely separate, valid use case, but can be more properly covered by external tooling. So maybe what we really need is an accompanying file format that mirrors PEP 621, but does not express a Python package, by forbidding dynamic (everything must be static) and making everything else advisory (no required fields; even the file name can be different). That way we could have, say, `pip install --dependencies /path/to/pyproject.toml` to read dependencies and optional-dependencies.
I was thinking about this more, and it occurred to me that it isn't actually possible to have a pyproject.toml without a build system.
PEP 518 says that it is expected, if the `build-system.requires` key is missing, that tools will treat that as if `["setuptools", "wheel"]` were defined.

PEP 517 says that if `build-system.build-backend` isn't defined, then tools will treat that as if the project is using the legacy setup.py path, either by directly invoking setup.py or using `setuptools.build_meta:__legacy__`.
Thus it is my assertion that any directory that has a pyproject.toml implicitly has a build backend of setuptools, and this matches what is implemented today in pip.
Likewise, since setuptools implements PEP 621, the following pyproject.toml is a valid pyproject:
```toml
[project]
name = "test"
version = "1.0"
dependencies = ["requests"]
```
So I guess, in a way, what @brettcannon wants exists already (other than the --only-deps flag), and it's implemented with the abstraction layers still being clean. I'm not sure if "implicitly use setuptools" counts as having to teach beginners about build backends or not?
Also note that it's not a valid PEP 621 file if it doesn't have a name and version specified (either dynamically or statically for version, and only statically for name). This means that it's not possible to create a valid pyproject.toml that uses project.dependencies without making it into a minimal valid package.
> pip-tools now works with pyproject.toml, so using the file for this sort of thing might be decided for us

pip-tools isn't reading the pyproject.toml; it's calling the build backend (by default setuptools), asking it to produce a list of dependencies, and then generating a lockfile from that.
don't want to try and educate a beginner trying to install something on why they suddenly need to select a pure Python build backend to record what they installed and how to make an appropriate decision.
You don't necessarily need to tell beginners about build backends since there is a default one.
A pyproject.toml without a build system and with only name, version and dependencies is valid and easy to teach.
pip install --only-deps needs to use the build backend to obtain dependencies in the general case. It could also read static dependencies from pyproject.toml, but that is an optimization / implementation detail.
I think the only drawback is these pesky .egg-info directories that show up, but I understand setuptools has long-term plans to get rid of them in some cases? https://github.com/pypa/setuptools/issues/3573#issuecomment-1539728150

So personally I think pyproject.toml is definitely the way to go to declare top-level dependencies. BTW, in my practice, requirements*.txt are the lock files, not the top-level dependencies.
Ow, looks like I wrote this at the same time as @dstufft :)
> Also note that it's not a valid PEP 621 file if it doesn't have a name and version specified (either dynamically or statically for version, and only statically for name). This means that it's not possible to create a valid pyproject.toml that uses project.dependencies without making it into a minimal valid package.
Correct, but setting version = "0" and name to the directory in this new file called pyproject.toml that suddenly appears probably isn't too hard of a stretch to understand on its own.
> I'm not sure if "implicitly use setuptools" counts as having to teach beginners about build backends or not?
Somewhat. I can already see the bug report, "Why is VS Code installing my own project when I didn't ask it to?!?" because the debug output has "Successfully installed spam-0" from the setuptools output (and that's skipping over the whole "Building wheel" bits).
I guess my question comes down to what do you expect apps to use these days to record their dependencies (and thus to install them)?
To be clear, I don't really have any problem with a --only-deps (for whatever value my opinion has on the matter). I didn't think we should try and make pyproject.toml into something that wasn't describing a package. I honestly think it's fine if we just tell people that everything is a package, Rust does that just fine with cargo.
Absent that though, requirements.txt is probably still it, and if we want something standardized, it has yet to be created.
> Absent that though, requirements.txt is probably still it, and if we want something standardized, it has yet to be created.
That's what I thought this little digression was going to end up concluding with.
> So personally I think pyproject.toml is definitely the way to go to declare top-level dependencies.

+1 for this, this is the one standardized place and it seems perfectly adequate. And I believe the conversation in this thread had already settled on both that and on the need for --only-deps (discussion was mostly around the optimal syntax for it), before the little detour in the last 1-2 days.
> `pip install --only-deps` needs to use the build backend to obtain dependencies in the general case. It could also read static `dependencies` from `pyproject.toml`, but that is an optimization / implementation detail.
I would say it's a little more than an implementation detail. There is no interface to ask the build backend for this information, so reading static dependencies and optional-dependencies directly clearly seems like the way to go (why make it a complex/new interoperability interface when the data is already right there in a static file?). Dynamic dependencies can just raise an error; they're not supportable by either pyproject.toml or requirements.txt.
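To make that concrete, here's a rough sketch of the static-read path for something like `--only-deps .[doc]` (illustrative only; pip has no such code today, and the helper name is made up):

```python
# Read project.dependencies plus any requested extras straight from
# pyproject.toml, refusing to proceed when the relevant fields are dynamic.
import tomllib  # Python 3.11+

def static_requirements(path: str, extras: tuple[str, ...] = ()) -> list[str]:
    with open(path, "rb") as f:
        project = tomllib.load(f)["project"]
    dynamic = project.get("dynamic", [])
    if "dependencies" in dynamic or "optional-dependencies" in dynamic:
        raise ValueError("dynamic dependencies are not supported")
    requirements = list(project.get("dependencies", []))
    for extra in extras:
        requirements += project.get("optional-dependencies", {}).get(extra, [])
    return requirements

print(static_requirements("pyproject.toml", extras=("doc",)))
```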