
Always remove extras in compiled files

Open FlorentJeannot opened this issue 2 years ago • 45 comments

Issue

I am opening this issue to discuss if the extras should be kept or removed in the compiled files.

@atugushev made a good summary of the situation:

Currently, we have direct references without extras and pinned packages with extras in requirements.txt, which looks wrong and should be synced in some single way.

We would like to get feedback from the community, do you think we should keep them or remove them and why?

My opinion is that we should remove them, since pip-compile is already listing all packages needed for a project, it seems redundant to me to specify it twice (one time in the extra and one time as a top-level dependency). Also, it's in theory possible to install more packages than those specified in the requirements.txt via the extras. In my opinion the generated .txt file of pip-compile should act as a lock file. The only advantage I can see is that we can easily inspect which dependencies are using extras.

@AndydeCleyre said that the order of installation could matter in some cases such as GDAL which requires numpy to be installed first. I checked if having the extra (gdal[numpy]) in the .txt file was making a difference, and I found that it was not working. You can read this gist if you want to have a look at the tests I've done (there's a conclusion at the end if you don't want to read it all).

Links

Some links about the discussion around this:

Samples from dependency management tools

The goal is to show you the output of different management tools when the project specifies extras. This may help you make a decision on the issue.

For each tool, I installed gdal[numpy]==3.2.2.

pip

Command: pip freeze > requirements.txt

The file contains:

GDAL==3.2.2
numpy==1.22.3

pip-tools

Command: pip-compile

The file contains:

gdal[numpy]==3.2.2
    # via -r requirements.in
numpy==1.22.3
    # via gdal

Pipenv

Command: pipenv lock -r > requirements.txt

The file contains:

-i https://pypi.org/simple
gdal[numpy]==3.2.2
numpy==1.22.3

Poetry

Command: poetry export -o requirements.txt

The file contains:

gdal==3.2.2
numpy==1.22.3; python_version >= "3.8"

Pros and cons

I'll try to collect all your feedback to update these lists.

Reasons to keep the extras:

  • We can clearly see which dependencies are using extras (FlorentJeannot)
  • https://github.com/jazzband/pip-tools/issues/1613#issuecomment-1171580279

Reasons to remove the extras:

  • In theory, it's possible that pip-sync or pip could install more packages than what is listed in a .txt file because of extras. I think the output of pip-compile should act as a lock file, so it should only install what's specified in the .txt file. (FlorentJeannot)
  • It's redundant. Packages specified in the extras are also in the top-level dependencies. (FlorentJeannot)
  • https://github.com/jazzband/pip-tools/issues/1613#issuecomment-1171511992

FlorentJeannot avatar Apr 11 '22 22:04 FlorentJeannot

@FlorentJeannot You can remove extras from the output file with the --strip-extras option. This was added in version 6.2.0 to allow creating constraint-compatible requirements files. For example, I have my requirements-dev.in headed with the line -c requirements.txt, so that development requirements don't try to have dependencies that are incompatible with the compiled dependencies for production.

https://github.com/jazzband/pip-tools/releases/tag/6.2.0

This is an important use-case for me, so I wouldn't personally mind having this become the default, but the output you're looking for is already possible to obtain.
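
To make the effect on the output concrete, here is a minimal sketch of what "stripping extras" means for a pinned line. It is not pip-tools' implementation; `strip_extras` and the simplified regex are assumptions for illustration only, and they ignore URLs, editable installs, and option lines.

```python
import re

# Matches a leading project name followed by an extras group, e.g. "gdal[numpy]".
# Deliberately simplified: URLs, editable installs and option lines are not handled.
_EXTRAS = re.compile(r"^(\s*[A-Za-z0-9][A-Za-z0-9._-]*)\s*\[[^\]]*\]")


def strip_extras(line: str) -> str:
    """Return the requirement line with a leading extras group removed, if any."""
    return _EXTRAS.sub(r"\1", line, count=1)


# The pip-compile sample from the issue description.
compiled = [
    "gdal[numpy]==3.2.2",
    "    # via -r requirements.in",
    "numpy==1.22.3",
    "    # via gdal",
]

# Only the first line changes; comments and plain pins pass through untouched.
stripped = [strip_extras(line) for line in compiled]
print("\n".join(stripped))
```

The stripped result contains only name and version pins, which is the shape a file referenced with -c needs.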

ryanhiebert avatar Apr 21 '22 17:04 ryanhiebert

  • I think it is possible for a package to change its installation behavior based on whether an optional dependency, implied via an extra-group, is already installed. e.g. coolproj[alternate-file-layout]
  • @LouisAumaitre might want to comment here; the current inclusion of extras makes it invalid as a constraints file, when using the backtracking resolver

AndydeCleyre avatar Apr 28 '22 21:04 AndydeCleyre

I'm going to copy my comment from #1539 here, which sums up my current thoughts on this:


As long as we are offering --strip-extras and not offering its negative, I'd guess that the default output line format would include extras (where this PR currently strips them).

Now that the constraints syntax is stricter (with the backtracking resolver), I expect it will be much more common for folks to need files without the extras. So I would support a separate PR to do that by default, while offering a new option to include them, e.g. --no-strip-extras/--include-extras.

AndydeCleyre avatar May 19 '22 15:05 AndydeCleyre

I think it is possible for a package to change its installation behavior based on whether an optional dependency, implied via an extra-group, is already installed. e.g. coolproj[alternate-file-layout]

Oooh, this one is rough. If the order of installation of packages matters, that is a real challenge. One that I'd rather nobody ever have to think about. I don't think it'd ever be possible to make that behavior intuitive.

ryanhiebert avatar May 19 '22 16:05 ryanhiebert

@ryanhiebert I agree.

I tried to install GDAL here, which depends on numpy. It was really painful to get it working with pip, and I don't think there is an easy way to reproduce that with pip-tools.

FlorentJeannot avatar May 19 '22 16:05 FlorentJeannot

@FlorentJeannot from this link from your GDAL gist discussion, I'd say it might be best for us to ignore the ordering thing. It's not intended behavior of setuptools, so we wouldn't want to encourage that type of bad behavior from packages.

What is your motivation behind wanting to create a flag that preserves the current behavior? I'd be fine just changing the behavior entirely, but my perspective may not be seeing an important constraint.

ryanhiebert avatar May 19 '22 16:05 ryanhiebert

@ryanhiebert I just wanted to emphasize your last message with this example (that it's a real challenge).

I first suggested removing the extras in the "compiled" files because I didn't see the point of having them, and since pip freeze doesn't include them either, I figured they just aren't needed.

Then @AndydeCleyre told me that there is an order of installation when we declare an extra to a package. GDAL was mentioned in another thread about this installation order, so I wanted to try it out by myself to see what happens when we try to install GDAL with different package management tools.

Now that I've tried it, my opinion is that packages which depend on an installation order are tricky to handle (and they also seem to be rare). The way @AndydeCleyre made it work is not trivial, and it's not working for me with pip>=22 because the installation order with extras has changed in this version.

So I still think we should not have extras by default in the "compiled" files. We could have an --include-extras, but why would we need that? The extras in the .in files seem enough in my opinion.

FlorentJeannot avatar May 19 '22 18:05 FlorentJeannot

gdal turned out to be a false example here, because they are trying to control build time behavior based on the installed package set, whereas the extras only guarantee installation order, not whether extra-specified deps are installed at build time.

AndydeCleyre avatar May 19 '22 18:05 AndydeCleyre

Agreed with both of you, @FlorentJeannot and @AndydeCleyre. So far as I can see, it would be fine for removing extras to be the only behavior, and to deprecate the --strip-extras flag entirely.

@AndydeCleyre , my question about motivation was intended for you (though I wasn't keeping good track of who I was responding to). Is there some important constraint, other than install order (which we've shown is about a concern that setuptools says should not be considered), that suggests that we should keep the ability to include extras somehow that I'm not seeing?

ryanhiebert avatar May 19 '22 18:05 ryanhiebert

And you answered that question on the PR linked earlier. I'm also fine with keeping some flag and just changing the default behavior.

ryanhiebert avatar May 19 '22 18:05 ryanhiebert

@FlorentJeannot thanks a lot for this awesome analysis and detailed summary!

I'm in favor of stripping extras in requirements.txt:

  • once requirements.in is compiled there is (should be) no difference in installation result whether the requirements.txt was with or without extras
  • requirements.txt without extras can be used as a constraint file in the layered workflow. Currently, users have to run pip-compile --strip-extras
  • fewer bytes and less distracting info in requirements.txt
  • requirements.txt should look more like pip freeze, where there are no extras
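
As a rough illustration of the first and last points, the sketch below compares the pins in the pip freeze and pip-compile samples from the issue description. The helpers `normalize` and `pins` are hypothetical names, the parsing is a simplification that only understands name==version lines, and the name normalization follows the PEP 503 rule.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name using the PEP 503 rule."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pins(text: str) -> dict:
    """Map normalized project names to pinned versions, ignoring extras,
    comments, environment markers and option lines such as "-i ..."."""
    result = {}
    for raw in text.splitlines():
        # Drop trailing comments and environment markers.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if not line or line.startswith("-") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        name = name.split("[", 1)[0]  # discard an extras group, if present
        result[normalize(name.strip())] = version.strip()
    return result


pip_freeze = "GDAL==3.2.2\nnumpy==1.22.3\n"
pip_compile = (
    "gdal[numpy]==3.2.2\n"
    "    # via -r requirements.in\n"
    "numpy==1.22.3\n"
    "    # via gdal\n"
)

# Both files pin the same set of packages once extras are ignored.
print(pins(pip_freeze) == pins(pip_compile))
```

This only shows that the pinned sets match; it says nothing about install-time behavior, which is the question raised elsewhere in this thread.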

atugushev avatar Jun 30 '22 17:06 atugushev

@atugushev I agree we should start stripping extras by default, but

... once requirements.in is compiled there is (should be) no difference in installation result whether the requirements.txt was with or without extras

What about my comment here?

AndydeCleyre avatar Jun 30 '22 18:06 AndydeCleyre

What about my comment here?

@AndydeCleyre the link does not show the comment. Could you quote here?

atugushev avatar Jun 30 '22 18:06 atugushev

@AndydeCleyre the link does not show the comment. Could you quote here?


I'm not saying it's good or common practice, but my understanding is that extras can be used to enforce installation order, and the set of packages already installed can be used by setup.py's install to follow different code paths accordingly.

A hypothetical example:

  • we have a package, mypackage
  • it defines an extra, interactive
  • the same package author also provides a kind of dummy package, mypackage-interactive-installation, required by mypackage's interactive extra
  • during mypackage's install, if and only if mypackage-interactive-installation is already installed, the user is prompted to make some choices which will affect the installation
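
A minimal sketch of how such a check could look, purely to illustrate the hypothetical above. The module name and both helper functions are made up, and nothing here reflects a real package; it only shows that install-time code can branch on what is already importable.

```python
import importlib.util

# Hypothetical importable name of the dummy package from the example above.
MARKER_MODULE = "mypackage_interactive_installation"


def marker_installed(module_name: str = MARKER_MODULE) -> bool:
    """Return True if the given top-level module is importable right now."""
    return importlib.util.find_spec(module_name) is not None


def choose_install_mode() -> str:
    """Pick an install path based on what is already in the environment."""
    return "interactive" if marker_installed() else "default"


print(choose_install_mode())
```

The outcome depends on whether the marker package was installed before this code runs, which is exactly why installation order would matter in this scenario.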

AndydeCleyre avatar Jun 30 '22 19:06 AndydeCleyre

@AndydeCleyre thanks! That looks like a shot in the foot 😄 I understand there are (historically) setup.py hackers in the wild, but we could never satisfy their needs due to the dynamic nature of setup.py builds. As far as I can see, the Python world is slowly moving towards static metadata (hello setup.cfg/pyproject.toml), and I don't see any reason why we shouldn't encourage that.

atugushev avatar Jun 30 '22 20:06 atugushev

I agree with @atugushev

FlorentJeannot avatar Jul 01 '22 07:07 FlorentJeannot

It sounds to me like we have a rough consensus that changing the default to strip extras is likely appropriate. I think what it needs now is someone to take a stab at making it happen with a pull request. Whether the existing behavior remains is up to the implementer and those who review the pull request. It is possible that having the backward compatibility option (which I'd prefer calling --include-extras if present) could make it easier to agree to change the default when the pull request is created, but it's also true that I can't think of a use-case where I'd ever want to use it rather than try to fix what I'd likely consider a broken library.

ryanhiebert avatar Jul 01 '22 13:07 ryanhiebert

@ryanhiebert PR already exists (https://github.com/jazzband/pip-tools/pull/1608) but it does not include the --include-extras that you suggest, for now.

FlorentJeannot avatar Jul 03 '22 13:07 FlorentJeannot

Oh nice, thank you for letting me see that. How do we draw out some consensus of action at this point?

ryanhiebert avatar Jul 03 '22 14:07 ryanhiebert

Proposal 👀:

  • Release 1:
    • Strip extras by default
    • Deprecate --strip-extras
    • Add flag for including extras
  • Release 2:
    • Remove --strip-extras

Proposal 🚀:

  • Release 1:
    • Strip extras always
    • Deprecate --strip-extras
  • Release 2:
    • Remove --strip-extras

AndydeCleyre avatar Jul 03 '22 22:07 AndydeCleyre

Also keep in mind that whether we want them to or not, there are definitely teams out there using their own parsers on the output of pip-compile, for their particular build processes.

AndydeCleyre avatar Jul 04 '22 18:07 AndydeCleyre

I'll wait for a comment from @atugushev before making any change in my PR.

FlorentJeannot avatar Jul 04 '22 22:07 FlorentJeannot

I'm still in favor of "always stripping extras" and vote for the 2nd proposal. My motivation:

  • pip-tools resolves all dependencies, so having extras changes nothing (except in rare cases)
  • consistency with pip freeze
  • compatibility with constraints files pip install -c constraints.txt
  • less code

atugushev avatar Oct 06 '22 16:10 atugushev

The only difference between options 1 and 2 suggested by @AndydeCleyre is whether to add --include-extras. It seems to me that there are use cases where it would be useful to include the extras, and in those cases completely removing the ability to keep extras would force using alternative solutions.

Unless the maintenance burden is unusually high, I don't see why you'd want to remove this existing feature.

taleinat avatar Oct 07 '22 10:10 taleinat

FWIW, if pip-compile did always remove extras from the compiled requirements, then I wouldn't have ended up filing pip/#11599.

But I think preserving the extras in the compiled requirements is useful for understanding how the dependency tree was calculated, as I noted over in the pip issue.

ubernostrum avatar Nov 16 '22 00:11 ubernostrum

Why pip-tools should strip extras from pip devs - https://github.com/pypa/pip/issues/11599#issuecomment-1316116546.

atugushev avatar Nov 16 '22 01:11 atugushev

Since pip-compile --resolver backtracking always strips extras and will be the default resolver in 7.0.0, we can make the legacy resolver consistent with backtracking in this version and deprecate --strip-extras (2nd proposal). Is that alright?

Or we could leave the legacy resolver as is and just deprecate the option.

atugushev avatar Nov 16 '22 01:11 atugushev

Using --resolver=backtracking does not currently strip extras. Again: I filed that bug against pip after switching, per the recommendation in the newly-added warning in pip-compile, to --resolver=backtracking.

ubernostrum avatar Nov 16 '22 02:11 ubernostrum

Why pip-tools should strip extras from pip devs - pypa/pip#11599 (comment).

AFAIU, that comment is specifically giving the rationale for excluding extras from valid constraints files, which is not the only type of file generated by pip-tools.

AndydeCleyre avatar Nov 16 '22 16:11 AndydeCleyre

@AndydeCleyre

AFAIU, that comment is specifically giving the rationale for excluding extras from valid constraints files

That's correct. I'd like to mention that under the hood pip-tools passes requirements.txt (if it exists) as -c requirements.txt to pip's resolver: https://github.com/jazzband/pip-tools/blob/6870602b29c56916d4b858d7d09bba7cb9f95f8f/piptools/resolver.py#L525-L526

and strips extras: https://github.com/jazzband/pip-tools/blob/6870602b29c56916d4b858d7d09bba7cb9f95f8f/piptools/resolver.py#L538


which is not the only type of file generated by pip-tools.

So essentially requirements.txt is a constraints file, and the pip-tools resolver already follows the comment https://github.com/pypa/pip/issues/11599#issuecomment-1316116546, except that it then inconsistently injects extras back into requirements.txt.

atugushev avatar Nov 18 '22 18:11 atugushev