
Handling of PyTorch CPU versions

Open PeterJCLaw opened this issue 2 years ago • 4 comments

PyTorch has versions like 2.0.0 but also 2.0.0+cpu. I'm not sure what the term for the +cpu part is, though such versions don't seem to be affected by --forbid-post, so I'm guessing they're not treated as post releases.
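A quick check with the packaging library (which pip vendors for version handling) seems to confirm that guess; it treats +cpu as a separate "local" part of the version rather than a post release:

from packaging.version import Version

v = Version("2.0.0+cpu")
print(v.is_postrelease)  # False, so --forbid-post leaves it alone
print(v.local)           # 'cpu'
print(v.public)          # '2.0.0'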

PyTorch uses the +cpu tag on Linux to provide CPU-only packages which are considerably smaller than the equivalent GPU package. These are available with --extra-index-url https://download.pytorch.org/whl/cpu.

Unfortunately those packages don't exist for macOS, so including torch==2.0.0+cpu in a requirements file breaks things for developers on Macs.

However, you can include just --extra-index-url https://download.pytorch.org/whl/cpu and torch == 2.0.0 in requirements files and they'll work fine for both Linux and Mac users. Notably, in this case Linux users will get the CPU-optimised package from the custom index, which will install as 2.0.0+cpu.
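Presumably this works because a pin without a local label matches the +cpu build, but not vice versa; at least, that's how the packaging library evaluates the specifiers:

from packaging.specifiers import SpecifierSet

# A plain pin matches the +cpu build from the custom index...
print("2.0.0+cpu" in SpecifierSet("==2.0.0"))  # True
# ...but a +cpu pin doesn't match the plain build, hence the macOS breakage.
print("2.0.0" in SpecifierSet("==2.0.0+cpu"))  # False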

Clearly the way that this works is a bit funky already. I'm not completely sure I like it; however, I suspect PyTorch is big enough that getting them to change it is unlikely.

pip-compile seems quite happy to leave torch==2.0.0 in requirements files where that is pinned in the input; however, pip-compile-multi does not do so if there is a sibling package which also pulls in torch.

Thus, given this base.in:

--extra-index-url https://download.pytorch.org/whl/cpu

transformers[torch]
torch == 2.0.0
With pip-compile
#
# This file is autogenerated by pip-compile with Python 3.9
# by the following command:
#
#    pip-compile base.in
#
--extra-index-url https://download.pytorch.org/whl/cpu

accelerate==0.19.0
    # via transformers
certifi==2023.5.7
    # via requests
charset-normalizer==3.1.0
    # via requests
filelock==3.12.0
    # via
    #   huggingface-hub
    #   torch
    #   transformers
fsspec==2023.5.0
    # via huggingface-hub
huggingface-hub==0.14.1
    # via transformers
idna==3.4
    # via requests
jinja2==3.1.2
    # via torch
markupsafe==2.1.2
    # via jinja2
mpmath==1.3.0
    # via sympy
networkx==3.1
    # via torch
numpy==1.24.3
    # via
    #   accelerate
    #   transformers
packaging==23.1
    # via
    #   accelerate
    #   huggingface-hub
    #   transformers
psutil==5.9.5
    # via accelerate
pyyaml==6.0
    # via
    #   accelerate
    #   huggingface-hub
    #   transformers
regex==2023.5.5
    # via transformers
requests==2.31.0
    # via
    #   huggingface-hub
    #   transformers
sympy==1.12
    # via torch
tokenizers==0.13.3
    # via transformers
torch==2.0.0
    # via
    #   -r base.in
    #   accelerate
    #   transformers
tqdm==4.65.0
    # via
    #   huggingface-hub
    #   transformers
transformers[torch]==4.29.2
    # via -r base.in
typing-extensions==4.6.2
    # via
    #   huggingface-hub
    #   torch
urllib3==2.0.2
    # via requests
With pip-compile-multi
# SHA1:f77d2efafbc3749515b31f61241a7689159cf347
#
# This file is autogenerated by pip-compile-multi
# To update, run:
#
#    pip-compile-multi
#
accelerate==0.19.0
    # via transformers
certifi==2023.5.7
    # via requests
charset-normalizer==3.1.0
    # via requests
filelock==3.12.0
    # via
    #   huggingface-hub
    #   torch
    #   transformers
fsspec==2023.5.0
    # via huggingface-hub
huggingface-hub==0.14.1
    # via transformers
idna==3.4
    # via requests
jinja2==3.1.2
    # via torch
markupsafe==2.1.2
    # via jinja2
mpmath==1.3.0
    # via sympy
networkx==3.1
    # via torch
numpy==1.24.3
    # via
    #   accelerate
    #   transformers
packaging==23.1
    # via
    #   accelerate
    #   huggingface-hub
    #   transformers
psutil==5.9.5
    # via accelerate
pyyaml==6.0
    # via
    #   accelerate
    #   huggingface-hub
    #   transformers
regex==2023.5.5
    # via transformers
requests==2.31.0
    # via
    #   huggingface-hub
    #   transformers
sympy==1.12
    # via torch
tokenizers==0.13.3
    # via transformers
torch==2.0.0+cpu
    # via
    #   -r base.in
    #   accelerate
    #   transformers
tqdm==4.65.0
    # via
    #   huggingface-hub
    #   transformers
transformers[torch]==4.29.2
    # via -r base.in
typing-extensions==4.6.2
    # via
    #   huggingface-hub
    #   torch
urllib3==2.0.2
    # via requests

I would ideally like to be able to specify that I always want the non-+cpu variant of the package to be listed in the requirements file.
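Failing an option for that, I imagine a post-processing step could rewrite the compiled file. A minimal sketch using the packaging library (strip_local_labels and the pin regex are hypothetical, not part of either tool):

import re
from packaging.version import Version

# Matches pinned lines like "torch==2.0.0+cpu" or "transformers[torch]==4.29.2".
PIN = re.compile(r"^([A-Za-z0-9._-]+(?:\[[^\]]+\])?)==(\S+)$", re.MULTILINE)

def strip_local_labels(compiled: str) -> str:
    def repl(m: re.Match) -> str:
        name, version = m.group(1), m.group(2)
        v = Version(version)
        # Version.public drops any "+local" segment, e.g. 2.0.0+cpu -> 2.0.0
        return f"{name}=={v.public}" if v.local else m.group(0)
    return PIN.sub(repl, compiled)

print(strip_local_labels("torch==2.0.0+cpu"))  # torch==2.0.0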

PeterJCLaw avatar May 31 '23 16:05 PeterJCLaw

Changing the torch line to torch == 2.0.0, != 2.0.0+cpu does seem to help here; however, this forces me to manually pin the version in the input. It also seems to make compilation of my actual project take much longer (though not noticeably so for the toy example here).
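For what it's worth, that combination behaves as expected under the packaging library's PEP 440 rules:

from packaging.specifiers import SpecifierSet

spec = SpecifierSet("==2.0.0,!=2.0.0+cpu")
print("2.0.0" in spec)      # True
print("2.0.0+cpu" in spec)  # False: the exclusion rules out the local variant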

PeterJCLaw avatar May 31 '23 16:05 PeterJCLaw

Thanks for the thorough explanation. This issue seems pretty complex. And I'm not sure I completely understand it. Do you have a good understanding of how the fix would work?

peterdemin avatar Jun 09 '23 15:06 peterdemin

Not completely, mostly because I don't understand what kind of thing the +cpu adjustment to the version number is.

I'm guessing that pip-compile doesn't have this issue since it uses the requested names for the package versions, rather than the resolved name of what gets installed? If that is indeed what's happening, then following pip-compile's lead and using its naming of the packages would probably work... but without really knowing why that works I don't feel confident it's a great fix. Additionally, I can definitely see the argument that using the name of the actually installed package is more correct!

The fuller solution here is probably to support different versions on different platforms (modulo some checking that they're just different flavours of the same version), though that feels like a much bigger change.
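If environment markers turn out to be the mechanism for that, the packaging library can already evaluate them per platform; a hypothetical check (the marker string is just illustrative):

from packaging.markers import Marker

# A per-platform pin might gate the +cpu variant on Linux only.
linux_only = Marker('sys_platform == "linux"')
print(linux_only.evaluate())  # True on Linux, False on macOS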

PeterJCLaw avatar Jun 09 '23 20:06 PeterJCLaw