
Pipenv PEP 503 Improvement: Pipenv downloads PyTorch for all versions of Python, grabbing 16GB of data instead of just 1.7GB.

Open Arcitec opened this issue 3 years ago • 14 comments

I recently posted the correct way to install PyTorch as a PEP 503 repository in Pipenv:

https://github.com/pypa/pipenv/issues/4961#issuecomment-1045679643

There's just one annoying issue in Pipenv: It downloads PyTorch for every version of CPython.

So let's say my project is based on pipenv install --python=3.9. And I then run the command to install PyTorch (see guide above for details): pipenv install --extra-index-url https://download.pytorch.org/whl/cu113/ "torch==1.10.1+cu113".

Well, Pipenv then downloads all versions of PyTorch into ~/.cache/pipenv: cp36, cp37, cp38, cp39 and probably a few more. And then it finally installs the intended architecture (torch-1.10.1+cu113-cp39).

This means that the download took 16 GB and 30 minutes instead of 1.7 GB and 4 minutes, wasting a ton of disk space and time on extra copies of the library for old Python versions that I'll never use.

I confirmed that the extra downloaded data is versions for old Python releases, because I went into the Pipenv cache and looked inside the hashed archives to check their WHEEL metadata. It was stuff like the "Python 3.6" torch version etc.
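For anyone who wants to repeat that check: a wheel is a zip archive, and its *.dist-info/WHEEL member lists the compatibility tags it was built for. The following is a minimal sketch of reading those tags with the standard library only. The names wheel_tags and make_demo_wheel are made up for this example, and the demo archive is a tiny stand-in built in memory, not a real cached torch wheel.

```python
import io
import zipfile


def wheel_tags(wheel_file):
    """Return the Tag values from the *.dist-info/WHEEL member of a wheel."""
    with zipfile.ZipFile(wheel_file) as archive:
        # Locate the "<name>-<version>.dist-info/WHEEL" metadata member.
        wheel_member = next(
            name for name in archive.namelist()
            if name.endswith(".dist-info/WHEEL")
        )
        text = archive.read(wheel_member).decode("utf-8")
    # Each "Tag:" line has the form "Tag: <python>-<abi>-<platform>".
    return [
        line.split(":", 1)[1].strip()
        for line in text.splitlines()
        if line.startswith("Tag:")
    ]


def make_demo_wheel():
    """Build a tiny in-memory archive that stands in for a cached wheel."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "demo-1.0.dist-info/WHEEL",
            "Wheel-Version: 1.0\n"
            "Root-Is-Purelib: false\n"
            "Tag: cp36-cp36m-linux_x86_64\n",
        )
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(wheel_tags(make_demo_wheel()))
```

Passing the path of a file from the Pipenv cache instead of the demo buffer should work the same way, assuming the cached file is an intact wheel archive.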

I'm using pipenv 2022.1.8.

My guess is that Pipenv's current algorithm just searches PEP 503 repos for packages whose names start with torch-*, downloads them ALL, and then looks at the embedded "wheel metadata" in every downloaded archive to figure out which one matches the installed Python version.

Can Pipenv be improved to detect the "cp39" filename hints in PEP 503 repos and only download the version that matches the installed Python version?

Arcitec avatar Feb 19 '22 03:02 Arcitec

@Bananaman I believe the issue here is that the private server https://download.pytorch.org/whl/cu113/ isn't returning the package hashes directly, so pipenv is downloading everything to generate it. If the packages were in pypi, my understanding is the API would return the metadata and nothing would be downloaded.

matteius avatar Feb 19 '22 04:02 matteius

@Bananaman Can you not point at the pypi server? https://pypi.org/project/torch/1.10.1/#files

matteius avatar Feb 19 '22 04:02 matteius

I just tried it with this Pipfile and it doesn't download anything large to ~/.cache/pipenv using the pypi server and finishes locking relatively quickly.

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
torch = "==1.10.1"

[dev-packages]

[requires]
python_version = "3.9"

matteius avatar Feb 19 '22 04:02 matteius

I think I see now that the reason you are using the other package server is you are looking for a cuda specific version of torch that is not in pypi?

matteius avatar Feb 19 '22 04:02 matteius

@matteius

I think I see now that the reason you are using the other package server is you are looking for a cuda specific version of torch that is not in pypi?

Yeah, my card requires PyTorch built for CUDA Toolkit 11.x, which can only be found at the PyTorch repository.

I believe the issue here is that the private server https://download.pytorch.org/whl/cu113/ isn't returning the package hashes directly, so pipenv is downloading everything to generate it. If the packages were in pypi, my understanding is the API would return the metadata and nothing would be downloaded.

Well, there are two issues here:

  1. It downloads every package (cp36, cp37, cp38, cp39 and seemingly a few others since the total ended up at 16 GB). It only needs to download cp39 (1.7 GB) no matter what command I give it, since my Python interpreter in that Pipenv folder is Python 3.9. The other packages that pipenv downloaded aren't even compatible with my Python version. So an optimization would be to filter out the other "cp##" versions and not even download/consider them at all. That's the main issue here.
  2. The second issue is the one you mention, which is that Pipenv doesn't know what hashes the files on the server have, so any future re-installation may need the packages to be downloaded and hashed again to check for changes. That would cause the huge wait times from the first issue again, since everything would re-download (ouch). Oh, and every time the project's CUDA / PyTorch version is updated, it'd cause another huge download of 16 GB of data for all architectures of the next Torch version... ouch. The pipenv cache would grow extremely large after just a few versions of Torch.

The best fix would be to do "if running under CPython, look for matching identifier in package filenames such as 'cp39' and only download that/those if such an identifier is found".

As far as I have heard, the -cp39- stuff is standardized or at least "the way everyone does it". The pattern is packagename-packageversion-cp##-morestuffandCPUarch. So if filenames follow the packagename-packageversion-cp##- pattern, we can strongly assume that it's an indicator "this is the CPython 3.9 version" and thereby instantly know which packages we can skip from PEP 503 repos.
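For reference, the pattern is in fact standardized: PEP 427 defines wheel filenames as name-version[-build]-pythontag-abitag-platformtag.whl. Below is a minimal sketch of splitting such a name using only the standard library. WheelName and split_wheel_filename are illustrative names written for this example, not Pipenv or pip APIs, and the sketch assumes the filename has already been URL-decoded (index links use %2B for the + sign).

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def split_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into its dash-separated PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        # No optional build tag present.
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return WheelName(name, version, build, python_tag, abi_tag, platform_tag)


if __name__ == "__main__":
    info = split_wheel_filename("torch-1.10.1+cu113-cp39-cp39-linux_x86_64.whl")
    print(info.python_tag, info.platform_tag)
```

The third-party packaging library ships a more thorough parse_wheel_filename helper in packaging.utils, which is probably the better choice in real tooling.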

There's lots of room for improvement in Pipenv's PEP 503 support. Phase 1 could be "skip every -cp##- version that doesn't match ours". Phase 2 would be to skip every packagename-version that wasn't requested (no need to download 1.10.2 if 1.10.2+cu113 was requested). Phase 3 would be to skip every -architecture (e.g. Linux, Mac, etc.) that your system doesn't have.
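The three phases can be sketched as a plain filter over the filenames listed by the index. This is only an illustration of the idea, not how Pipenv or pip actually select candidates: the filenames in CANDIDATES are made up for the example, tags_of and select_candidates are hypothetical helpers, and exact string comparison is a simplification (real installers compare against a whole set of supported tags, such as py3, abi3 or manylinux variants).

```python
def tags_of(filename):
    """Return (version, python_tag, platform_tag) from a wheel filename."""
    parts = filename[: -len(".whl")].split("-")
    # Layout: name-version[-build]-python-abi-platform
    return parts[1], parts[-3], parts[-1]


def select_candidates(filenames, wanted_version, python_tag, platform_tag):
    """Keep only wheels matching the interpreter, version and platform."""
    kept = []
    for filename in filenames:
        version, py, plat = tags_of(filename)
        if py != python_tag:           # phase 1: wrong CPython version
            continue
        if version != wanted_version:  # phase 2: version was not requested
            continue
        if plat != platform_tag:       # phase 3: wrong OS / architecture
            continue
        kept.append(filename)
    return kept


# Illustrative listing only; a real index page may contain other files.
CANDIDATES = [
    "torch-1.10.1+cu113-cp36-cp36m-linux_x86_64.whl",
    "torch-1.10.1+cu113-cp37-cp37m-linux_x86_64.whl",
    "torch-1.10.1+cu113-cp38-cp38-linux_x86_64.whl",
    "torch-1.10.1+cu113-cp39-cp39-linux_x86_64.whl",
    "torch-1.10.1+cu113-cp39-cp39-win_amd64.whl",
    "torch-1.10.2+cu113-cp39-cp39-linux_x86_64.whl",
]

if __name__ == "__main__":
    print(select_candidates(CANDIDATES, "1.10.1+cu113", "cp39", "linux_x86_64"))
```

With filtering like this, only one of the six listed files would need to be downloaded for a CPython 3.9 project on 64-bit Linux.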

The most important thing would be to skip the other -cp##- versions because that's a huuuuge amount of data to download.

How feasible is it that Pipenv can be extended to filter out useless downloads? Hopefully the internal code isn't too rigid.

Arcitec avatar Feb 19 '22 04:02 Arcitec

@Bananaman Thanks for your feedback. I am still pretty new to this code base, but from what I gather about the dependency resolution, this may require an upstream change somewhere. I think this is a good discussion and it could lead to some improvements.

matteius avatar Feb 19 '22 05:02 matteius

Ahh, I see. If someone knows what dependency resolver pipenv uses, we'll know where to file the issue then. :)

Arcitec avatar Feb 19 '22 05:02 Arcitec

@Bananaman I've learned a lot recently -- it uses pip's dependency resolver. I've done work to get pipenv's vendored pip updated to 22.0.4 (it's currently on 21.x) here: https://github.com/pypa/pipenv/pull/4969

However I just tried your example on this branch and I think now you have new issues with the install instructions on the newer pip resolver:

matteius@matteius-VirtualBox:~/shared-projects/pipenv-triage/pipenv-4963$ pipenv install --extra-index-url https://download.pytorch.org/whl/cu113/ "torch==1.10.1+cu113"
Installing torch==1.10.1+cu113...
Error:  An error occurred while installing torch==1.10.1+cu113!
Error text: Looking in indexes: https://download.pytorch.org/whl/cu113/, https://pypi.org/simple

ERROR: Could not find a version that satisfies the requirement torch==1.10.1+cu113 (from versions: 1.11.0, 1.11.0+cu113)
ERROR: No matching distribution found for torch==1.10.1+cu113

✘ Installation Failed 

EDIT: Actually is it possible that a new version has replaced the older one on that URL? Because it seems to have this: pipenv install --extra-index-url https://download.pytorch.org/whl/cu113/ "torch==1.11.0+cu113"

Though that led to it failing to lock:

matteius@matteius-VirtualBox:~/shared-projects/pipenv-triage/pipenv-4963$ pipenv install --extra-index-url https://download.pytorch.org/whl/cu113/ "torch==1.11.0+cu113" -v --pre
Installing torch==1.11.0+cu113...
Installing package: torch==1.11.0+cu113
Writing supplied requirement line to temporary file: 'torch==1.11.0+cu113'
Installing 'torch'
⠇ Installing torch...$ /home/matteius/.virtualenvs/pipenv-4963-jfl4-XCi/bin/python -m pip install --pre --verbose --upgrade --exists-action=i -r /tmp/pipenv-40s_n39y-requirements/pipenv-bxlerdtu-requirement.txt -i https://download.pytorch.org/whl/cu113/ --extra-index-url https://pypi.org/simple
Using source directory: '/home/matteius/.virtualenvs/pipenv-4963-jfl4-XCi/src'
Error:  An error occurred while installing torch==1.11.0+cu113!
Error text: Using pip 22.0.3 from /home/matteius/.virtualenvs/pipenv-4963-jfl4-XCi/lib/python3.10/site-packages/pip (python 3.10)
Looking in indexes: https://download.pytorch.org/whl/cu113/, https://pypi.org/simple
Collecting torch==1.11.0+cu113


Using pip 22.0.3 from /home/matteius/.virtualenvs/pipenv-4963-jfl4-XCi/lib/python3.10/site-packages/pip (python 3.10)
Looking in indexes: https://download.pytorch.org/whl/cu113/, https://pypi.org/simple
Collecting torch==1.11.0+cu113

✘ Installation Failed 

DOUBLE EDIT: Oh man, my last Installation Failed was a result of it using my system Python 3.10, and I thought the pre-built wheels there only go up to Python 3.9. Trying again now with pipenv install --python=python3.9

TRIPLE EDIT: Actually there are 3.10 wheels there, and the 3.9 install also fails quickly. I don't know yet, but you stumped me for tonight.

matteius avatar Mar 12 '22 09:03 matteius

@Bananaman Here is what I am guessing: my laptop doesn't have the right dependencies to install the +cu113 version. However, please try the pip 22.0.4 branch I linked in my prior comment and report back what your experience is. Pro tip for installing from the branch: python setup.py develop --user

I also tried the +cpu version, but no distribution was found. Now trying just a straight pipenv install --extra-index-url https://download.pytorch.org/whl/ "torch==1.11.0", which is spending more time on locking -- will edit this comment when it's done.

matteius avatar Mar 12 '22 09:03 matteius

@Bananaman Ok so +cpu works to install for my VM, and it was fairly quick.

pipenv install --python=python3.9
matteius@matteius-VirtualBox:~/shared-projects/pipenv-triage/pipenv-4963$ pipenv install --extra-index-url https://download.pytorch.org/whl/ "torch==1.11.0+cpu"
Installing torch==1.11.0+cpu...
Adding torch to Pipfile's [packages]...
✔ Installation Succeeded 
Pipfile.lock (16c839) out of date, updating to (8a8c19)...
Locking [dev-packages] dependencies...
Locking [packages] dependencies...
Building requirements...
Resolving dependencies...
✔ Success! 
Updated Pipfile.lock (8a8c19)!
Installing dependencies from Pipfile.lock (8a8c19)...
  🐍   ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 0/0 — 00:00:00
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[[source]]
url = "https://download.pytorch.org/whl/"
verify_ssl = true
name = "downloadpytorch"

[packages]
torch = "==1.11.0+cpu"

[dev-packages]

[requires]
python_version = "3.9"

{
    "_meta": {
        "hash": {
            "sha256": "4942cb72a13b441c50651b5bbbe8e3ec6642cae96ec0c014a4a156457e8a8c19"
        },
        "pipfile-spec": 6,
        "requires": {
            "python_version": "3.9"
        },
        "sources": [
            {
                "name": "pypi",
                "url": "https://pypi.org/simple",
                "verify_ssl": true
            },
            {
                "name": "downloadpytorch",
                "url": "https://download.pytorch.org/whl/",
                "verify_ssl": true
            }
        ]
    },
    "default": {
        "torch": {
            "hashes": [
                "sha256:0dbdddc7452a2c42250df369e4968b62589ab0ac1b9d14e27701eb4fc3839ad1",
                "sha256:22997df8f3a3f9faed40ef9e7964d1869cafa0317cc4a5b115bfdf69323e8884",
                "sha256:32fa00d974707c0183bc4dd0c1d69e853d0f15cc60f157b71ac5718847808943",
                "sha256:50008b82004b9d91e036cc199a57f863b6f8978b8a222176f9a4435fce181dd8",
                "sha256:544c13ef120531ec2f28a3c858c06e600d514a6dfe09b4dd6fd0262088dd2fa3",
                "sha256:7198bf5c69464459bd79526c6a4eaad2806db886443ee2f4e8e7a492bccf03ef",
                "sha256:7bbd8b77a59e628a7cb84289a3a26adc7e28dd7213c7f666537f26e714fb1721",
                "sha256:bd984fa8676b2f7c9611b40af3a7c168fb90be3e29028219f822696bb357f472"
            ],
            "index": "pypi",
            "version": "==1.11.0+cpu"
        },
        "typing-extensions": {
            "hashes": [
                "sha256:1a9462dcc3347a79b1f1c0271fbe79e844580bb598bafa1ed208b94da3cdcd42",
                "sha256:21c85e0fe4b9a155d0799430b0ad741cdce7e359660ccbd8b530613e8df88ce2"
            ],
            "markers": "python_version >= '3.6'",
            "version": "==4.1.1"
        }
    },
    "develop": {}
}

matteius avatar Mar 12 '22 09:03 matteius

@Bananaman There was this ticket I worked on yesterday/today that shed some light on indexes, plus I realized I have nvidia on my laptop, just not on the VM, so I am doing some experiments within Windows now. I had a test run that was very quick and generated this file, but there was no hash for the torch package:

{
    "_meta": {
        "hash": {
            "sha256": "e14ddc38e9eb9a8643bfc234e743e5c8fecc48f05a5bfd15199893df96ab4c14"
        },
        "pipfile-spec": 6,
        "requires": {
            "python_version": "3.10"
        },
        "sources": [
            {
                "name": "pypi",
                "url": "https://pypi.org/simple",
                "verify_ssl": true
            },
            {
                "name": "downloadpytorch",
                "url": "https://download.pytorch.org/whl/cu113/",
                "verify_ssl": true
            }
        ]
    },
    "default": {
        "torch": {
            "index": "downloadpytorch",
            "version": "==1.11.0+cu113"
        },
        "typing-extensions": {
            "hashes": [
                "sha256:1a9462dcc3347a79b1f1c0271fbe79e844580bb598bafa1ed208b94da3cdcd42",
                "sha256:21c85e0fe4b9a155d0799430b0ad741cdce7e359660ccbd8b530613e8df88ce2"
            ],
            "markers": "python_version >= '3.6'",
            "version": "==4.1.1"
        }
    },
    "develop": {}
}

So then I tried an experimental local branch that combines the pip 22.0.4 resolver updates with my other PR for index resolving fixes. This time I watched my network router and saw it download the 16+ GB over several minutes of waiting, which definitely took 8x longer than whatever generated the above Pipfile.lock. However, one difference is that the new Pipfile.lock does have all of the hashes, and there are 8 of them, which explains why it took so much longer.

$ cat Pipfile.lock
{
    "_meta": {
        "hash": {
            "sha256": "e14ddc38e9eb9a8643bfc234e743e5c8fecc48f05a5bfd15199893df96ab4c14"
        },
        "pipfile-spec": 6,
        "requires": {
            "python_version": "3.10"
        },
        "sources": [
            {
                "name": "pypi",
                "url": "https://pypi.org/simple",
                "verify_ssl": true
            },
            {
                "name": "downloadpytorch",
                "url": "https://download.pytorch.org/whl/cu113/",
                "verify_ssl": true
            }
        ]
    },
    "default": {
        "torch": {
            "hashes": [
                "sha256:7fd4751bbf39bbb04ec6116c7243ce6528aded4197afcf380537340e1eebd19a",
                "sha256:a68c33657a546131eb9bc44e2a98d2fa704aafae861460b051b82813852ccb44",
                "sha256:b6a799bdb6ee3d914e5e62bddb4276d4a10248c1af4f2d217738e5f9ee27485b",
                "sha256:ddc57495195aa2456e78bfc7d8d3f45dabbb8b7b268b3b5dbed4f0e4db492f33",
                "sha256:e4bb14d953db9aad5bdb945a328410638721d77e3e622d0a8d77063c01daf40b",
                "sha256:e9126b0a5d95704bee40a9d0ef1cbd82d8dc7863e4638a376bef702dfd659370",
                "sha256:e9df65c1fb2d80283b276114878fd38f411b70880e0b406c451d000e6159f451",
                "sha256:f56333470daea3c97078b37607e0035cccf72fc5c36fd84546e1a4b8d9944f2b"
            ],
            "index": "downloadpytorch",
            "version": "==1.11.0+cu113"
        },
        "typing-extensions": {
            "hashes": [
                "sha256:1a9462dcc3347a79b1f1c0271fbe79e844580bb598bafa1ed208b94da3cdcd42",
                "sha256:21c85e0fe4b9a155d0799430b0ad741cdce7e359660ccbd8b530613e8df88ce2"
            ],
            "markers": "python_version >= '3.6'",
            "version": "==4.1.1"
        }
    },
    "develop": {}
}
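Each entry in a hashes list like the one above is the SHA-256 digest of one wheel file. When an index does not advertise digests, the only way to obtain one is to read every byte of the file, which would explain a 16+ GB transfer for eight large wheels. The following is a minimal sketch of that hashing step, assuming the file is available as a binary stream. It is not pipenv's actual implementation, and sha256_of_stream is a name made up for this example.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a binary stream in chunks and return a 'sha256:<hex>' string."""
    digest = hashlib.sha256()
    # Read fixed-size chunks so multi-gigabyte wheels never sit fully in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return "sha256:" + digest.hexdigest()


if __name__ == "__main__":
    # Stand-in for an opened wheel file, e.g. open(path, "rb").
    fake_wheel = io.BytesIO(b"abc")
    print(sha256_of_stream(fake_wheel))
```

The cost scales with file size, so hashing eight wheels of roughly 2 GB each means transferring and reading all of them, even though only one is ever installed.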

But then I re-ran it the way I had it before, specifying --index (based on what I learned and the fixes in my newer branches), and this was fast. It didn't actually download anything (maybe the wheels are already cached?).

$ pipenv install --index https://download.pytorch.org/whl/cu113/ "torch==1.11.0+cu113" -v
Installing torch==1.11.0+cu113...
Installing package: torch==1.11.0+cu113
Writing supplied requirement line to temporary file: 'torch==1.11.0+cu113'
Installing 'torch'
[    ] Installing torch...$ 'C:\Users\matte\.virtualenvs\pipenv-4963-oaRoupN9\Scripts\python.exe' -m pip install --verbose --upgra
de --exists-action=i -r 'c:\users\matte\appdata\local\temp\pipenv-_d5v9vob-requirements\pipenv-bmq9v45c-requirement.txt' -i https:
//download.pytorch.org/whl/cu113/ --extra-index-url https://pypi.org/simple --extra-index-url https://pypi.org/simple
Using source directory: 'C:\\Users\\matte\\.virtualenvs\\pipenv-4963-oaRoupN9\\src'
Adding torch to Pipfile's [packages]...
Installation Succeeded
Pipfile.lock not found, creating...
Locking [dev-packages] dependencies...
Locking [packages] dependencies...
           Building requirements...
Resolving dependencies...
Reporter.starting()
INFO:pipenv.patched.notpip._internal.resolution.resolvelib.reporter:Reporter.starting()
Reporter.adding_requirement(SpecifierRequirement('torch==1.11.0+cu113'), None)
INFO:pipenv.patched.notpip._internal.resolution.resolvelib.reporter:Reporter.adding_requirement(SpecifierRequirement('torch==1.11.
0+cu113'), None)
Reporter.starting_round(0)
INFO:pipenv.patched.notpip._internal.resolution.resolvelib.reporter:Reporter.starting_round(0)
Reporter.adding_requirement(SpecifierRequirement('typing-extensions'), LinkCandidate('https://download.pytorch.org/whl/cu113/torch
-1.11.0%2Bcu113-cp310-cp310-win_amd64.whl (from https://download.pytorch.org/whl/cu113/torch/)'))
INFO:pipenv.patched.notpip._internal.resolution.resolvelib.reporter:Reporter.adding_requirement(SpecifierRequirement('typing-exten
sions'), LinkCandidate('https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp310-cp310-win_amd64.whl (from https://downlo
ad.pytorch.org/whl/cu113/torch/)'))
Reporter.pinning(LinkCandidate('https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp310-cp310-win_amd64.whl (from https:
//download.pytorch.org/whl/cu113/torch/)'))
INFO:pipenv.patched.notpip._internal.resolution.resolvelib.reporter:Reporter.pinning(LinkCandidate('https://download.pytorch.org/w
hl/cu113/torch-1.11.0%2Bcu113-cp310-cp310-win_amd64.whl (from https://download.pytorch.org/whl/cu113/torch/)'))
Reporter.ending_round(0, state)
INFO:pipenv.patched.notpip._internal.resolution.resolvelib.reporter:Reporter.ending_round(0, state)
Reporter.starting_round(1)
INFO:pipenv.patched.notpip._internal.resolution.resolvelib.reporter:Reporter.starting_round(1)
Reporter.pinning(LinkCandidate('https://files.pythonhosted.org/packages/45/6b/44f7f8f1e110027cf88956b59f2fad776cca7e1704396d043f89
effd3a0e/typing_extensions-4.1.1-py3-none-any.whl#sha256=21c85e0fe4b9a155d0799430b0ad741cdce7e359660ccbd8b530613e8df88ce2 (from ht
tps://pypi.org/simple/typing-extensions/) (requires-python:>=3.6)'))
INFO:pipenv.patched.notpip._internal.resolution.resolvelib.reporter:Reporter.pinning(LinkCandidate('https://files.pythonhosted.org
/packages/45/6b/44f7f8f1e110027cf88956b59f2fad776cca7e1704396d043f89effd3a0e/typing_extensions-4.1.1-py3-none-any.whl#sha256=21c85
e0fe4b9a155d0799430b0ad741cdce7e359660ccbd8b530613e8df88ce2 (from https://pypi.org/simple/typing-extensions/) (requires-python:>=3
.6)'))
Reporter.ending_round(1, state)
INFO:pipenv.patched.notpip._internal.resolution.resolvelib.reporter:Reporter.ending_round(1, state)
Reporter.starting_round(2)
INFO:pipenv.patched.notpip._internal.resolution.resolvelib.reporter:Reporter.starting_round(2)
Reporter.ending(State(mapping=OrderedDict([('torch', LinkCandidate('https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp
310-cp310-win_amd64.whl (from https://download.pytorch.org/whl/cu113/torch/)')), ('typing-extensions', LinkCandidate('https://file
s.pythonhosted.org/packages/45/6b/44f7f8f1e110027cf88956b59f2fad776cca7e1704396d043f89effd3a0e/typing_extensions-4.1.1-py3-none-an
y.whl#sha256=21c85e0fe4b9a155d0799430b0ad741cdce7e359660ccbd8b530613e8df88ce2 (from https://pypi.org/simple/typing-extensions/) (r
equires-python:>=3.6)'))]), criteria={'torch': Criterion((SpecifierRequirement('torch==1.11.0+cu113'), via=None)), 'typing-extensi
ons': Criterion((SpecifierRequirement('typing-extensions'), via=LinkCandidate('https://download.pytorch.org/whl/cu113/torch-1.11.0
%2Bcu113-cp310-cp310-win_amd64.whl (from https://download.pytorch.org/whl/cu113/torch/)')))}))
INFO:pipenv.patched.notpip._internal.resolution.resolvelib.reporter:Reporter.ending(State(mapping=OrderedDict([('torch', LinkCandi
date('https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp310-cp310-win_amd64.whl (from https://download.pytorch.org/whl
/cu113/torch/)')), ('typing-extensions', LinkCandidate('https://files.pythonhosted.org/packages/45/6b/44f7f8f1e110027cf88956b59f2f
ad776cca7e1704396d043f89effd3a0e/typing_extensions-4.1.1-py3-none-any.whl#sha256=21c85e0fe4b9a155d0799430b0ad741cdce7e359660ccbd8b
530613e8df88ce2 (from https://pypi.org/simple/typing-extensions/) (requires-python:>=3.6)'))]), criteria={'torch': Criterion((Spec
ifierRequirement('torch==1.11.0+cu113'), via=None)), 'typing-extensions': Criterion((SpecifierRequirement('typing-extensions'), vi
a=LinkCandidate('https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp310-cp310-win_amd64.whl (from https://download.pyto
rch.org/whl/cu113/torch/)')))}))
Warning: Error generating hash for torch # This warning specifically comes from a method called `_get_hashes_from_pypi` and my branch restricts the index when specified, so you can probably ignore this part
[  ==] Locking...
Success!
Updated Pipfile.lock (ab4c14)!
Installing dependencies from Pipfile.lock (ab4c14)...
  ================================ 0/0 - 00:00:00
To activate this project's virtualenv, run pipenv shell.
Alternatively, run a command inside the virtualenv with pipenv run.

matte@LAPTOP-N5VSGIBD MINGW64 ~/Projects/pipenv-triage/pipenv-4963
$ cat Pipfile.lock
{
    "_meta": {
        "hash": {
            "sha256": "e14ddc38e9eb9a8643bfc234e743e5c8fecc48f05a5bfd15199893df96ab4c14"
        },
        "pipfile-spec": 6,
        "requires": {
            "python_version": "3.10"
        },
        "sources": [
            {
                "name": "pypi",
                "url": "https://pypi.org/simple",
                "verify_ssl": true
            },
            {
                "name": "downloadpytorch",
                "url": "https://download.pytorch.org/whl/cu113/",
                "verify_ssl": true
            }
        ]
    },
    "default": {
        "torch": {
            "hashes": [
                "sha256:7fd4751bbf39bbb04ec6116c7243ce6528aded4197afcf380537340e1eebd19a",
                "sha256:a68c33657a546131eb9bc44e2a98d2fa704aafae861460b051b82813852ccb44",
                "sha256:b6a799bdb6ee3d914e5e62bddb4276d4a10248c1af4f2d217738e5f9ee27485b",
                "sha256:ddc57495195aa2456e78bfc7d8d3f45dabbb8b7b268b3b5dbed4f0e4db492f33",
                "sha256:e4bb14d953db9aad5bdb945a328410638721d77e3e622d0a8d77063c01daf40b",
                "sha256:e9126b0a5d95704bee40a9d0ef1cbd82d8dc7863e4638a376bef702dfd659370",
                "sha256:e9df65c1fb2d80283b276114878fd38f411b70880e0b406c451d000e6159f451",
                "sha256:f56333470daea3c97078b37607e0035cccf72fc5c36fd84546e1a4b8d9944f2b"
            ],
            "index": "downloadpytorch",
            "version": "==1.11.0+cu113"
        },
        "typing-extensions": {
            "hashes": [
                "sha256:1a9462dcc3347a79b1f1c0271fbe79e844580bb598bafa1ed208b94da3cdcd42",
                "sha256:21c85e0fe4b9a155d0799430b0ad741cdce7e359660ccbd8b530613e8df88ce2"
            ],
            "markers": "python_version >= '3.6'",
            "version": "==4.1.1"
        }
    },
    "develop": {}
}

Moving the lockfile out of the way and then regenerating it was very quick and downloaded nothing, I am thinking because it's still in a cache somewhere. I haven't had great luck finding these large files on my Windows file system, however.

$ mv Pipfile.lock Pipfile.lock.very_good

matte@LAPTOP-N5VSGIBD MINGW64 ~/Projects/pipenv-triage/pipenv-4963
$ pipenv lock
Locking [dev-packages] dependencies...
Locking [packages] dependencies...
           Building requirements...
Resolving dependencies...
Success!
Updated Pipfile.lock (ab4c14)!

matte@LAPTOP-N5VSGIBD MINGW64 ~/Projects/pipenv-triage/pipenv-4963
$ cat Pipfile.lock
{
    "_meta": {
        "hash": {
            "sha256": "e14ddc38e9eb9a8643bfc234e743e5c8fecc48f05a5bfd15199893df96ab4c14"
        },
        "pipfile-spec": 6,
        "requires": {
            "python_version": "3.10"
        },
        "sources": [
            {
                "name": "pypi",
                "url": "https://pypi.org/simple",
                "verify_ssl": true
            },
            {
                "name": "downloadpytorch",
                "url": "https://download.pytorch.org/whl/cu113/",
                "verify_ssl": true
            }
        ]
    },
    "default": {
        "torch": {
            "hashes": [
                "sha256:7fd4751bbf39bbb04ec6116c7243ce6528aded4197afcf380537340e1eebd19a",
                "sha256:a68c33657a546131eb9bc44e2a98d2fa704aafae861460b051b82813852ccb44",
                "sha256:b6a799bdb6ee3d914e5e62bddb4276d4a10248c1af4f2d217738e5f9ee27485b",
                "sha256:ddc57495195aa2456e78bfc7d8d3f45dabbb8b7b268b3b5dbed4f0e4db492f33",
                "sha256:e4bb14d953db9aad5bdb945a328410638721d77e3e622d0a8d77063c01daf40b",
                "sha256:e9126b0a5d95704bee40a9d0ef1cbd82d8dc7863e4638a376bef702dfd659370",
                "sha256:e9df65c1fb2d80283b276114878fd38f411b70880e0b406c451d000e6159f451",
                "sha256:f56333470daea3c97078b37607e0035cccf72fc5c36fd84546e1a4b8d9944f2b"
            ],
            "index": "downloadpytorch",
            "version": "==1.11.0+cu113"
        },
        "typing-extensions": {
            "hashes": [
                "sha256:1a9462dcc3347a79b1f1c0271fbe79e844580bb598bafa1ed208b94da3cdcd42",
                "sha256:21c85e0fe4b9a155d0799430b0ad741cdce7e359660ccbd8b530613e8df88ce2"
            ],
            "markers": "python_version >= '3.6'",
            "version": "==4.1.1"
        }
    },
    "develop": {}
}

Removing the virtualenv with pipenv --rm and re-running pipenv lock was also very quick to generate the lockfile again, likely because it's still in the caches.

Here is what my generated Pipfile looks like from the initial command: pipenv install --index https://download.pytorch.org/whl/cu113/ "torch==1.11.0+cu113" -v

$ cat Pipfile
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[[source]]
url = "https://download.pytorch.org/whl/cu113/"
verify_ssl = true
name = "downloadpytorch"

[packages]
torch = {version = "==1.11.0+cu113", index = "downloadpytorch"}

[dev-packages]

[requires]
python_version = "3.10"

If you are interested in trying out this branch to see if it has improvements for your issue, I have pushed it out here; it's called issue-4637-pip-22.0.4.
The thing to note too is that you want to use --index over --extra-index-url, because --extra-index-url just adds to the indexes it can pick from, whereas --index restricts it. Looking forward to your feedback!

matteius avatar Mar 15 '22 06:03 matteius

Also noting that I did a followup where I had hoped adding the markers for python_version would restrict the hash search like you are saying, but it does not. torch = {version = "==1.11.0+cu113", index = "downloadpytorch", markers = "python_version == '3.10'"} yields just the same number of hashes even though only 2 of them should match the markers:

...
        "torch": {
            "hashes": [
                "sha256:7fd4751bbf39bbb04ec6116c7243ce6528aded4197afcf380537340e1eebd19a",
                "sha256:a68c33657a546131eb9bc44e2a98d2fa704aafae861460b051b82813852ccb44",
                "sha256:b6a799bdb6ee3d914e5e62bddb4276d4a10248c1af4f2d217738e5f9ee27485b",
                "sha256:ddc57495195aa2456e78bfc7d8d3f45dabbb8b7b268b3b5dbed4f0e4db492f33",
                "sha256:e4bb14d953db9aad5bdb945a328410638721d77e3e622d0a8d77063c01daf40b",
                "sha256:e9126b0a5d95704bee40a9d0ef1cbd82d8dc7863e4638a376bef702dfd659370",
                "sha256:e9df65c1fb2d80283b276114878fd38f411b70880e0b406c451d000e6159f451",
                "sha256:f56333470daea3c97078b37607e0035cccf72fc5c36fd84546e1a4b8d9944f2b"
            ],
            "index": "downloadpytorch",
            "markers": "python_version == '3.10'",
            "version": "==1.11.0+cu113"
        },
...

matteius avatar Mar 15 '22 06:03 matteius

@Bananaman Also noting that it may be a reasonable workaround to target a very specific wheel file in this case. For example, I tried: pipenv install https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp310-cp310-win_amd64.whl which generated this fresh Pipfile:

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
torch = {file = "https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp310-cp310-win_amd64.whl"}

[dev-packages]

[requires]
python_version = "3.10"

Then the Pipfile.lock contains just the hash for the wheel I installed:

{
    "_meta": {
        "hash": {
            "sha256": "e0b940becd2658a3e49854df92d24ace51eee0e291ae863f1182978e1524444f"
        },
        "pipfile-spec": 6,
        "requires": {
            "python_version": "3.10"
        },
        "sources": [
            {
                "name": "pypi",
                "url": "https://pypi.org/simple",
                "verify_ssl": true
            }
        ]
    },
    "default": {
        "torch": {
            "file": "https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp310-cp310-win_amd64.whl",
            "hashes": [
                "sha256:ddc57495195aa2456e78bfc7d8d3f45dabbb8b7b268b3b5dbed4f0e4db492f33"
            ],
            "index": "pypi",
            "version": "==1.11.0+cu113"
        },
        "typing-extensions": {
            "hashes": [
                "sha256:1a9462dcc3347a79b1f1c0271fbe79e844580bb598bafa1ed208b94da3cdcd42",
                "sha256:21c85e0fe4b9a155d0799430b0ad741cdce7e359660ccbd8b530613e8df88ce2"
            ],
            "markers": "python_version >= '3.6'",
            "version": "==4.1.1"
        }
    },
    "develop": {}
}

I am not sure what the level of effort would be to get the markers (python_version and system) to restrict which wheels, but I suspect the level of effort is high. The net outcome would be worth it, but without more analysis of the code, I am not sure if this is another case where patching pip resolver itself would be required.

matteius avatar Mar 15 '22 06:03 matteius

https://github.com/pytorch/pytorch/issues/76557

matteius avatar Sep 08 '22 05:09 matteius

Two updates on this front: 1.) The release today re-uses cached wheels from prior lock or install cycles, so it's much faster to relock now. 2.) If Meta ever supports the S3 bucket changes, the pytorch index will have the sha256 hashes in the URL once this PR gets merged: https://github.com/pytorch/builder/pull/1433

As a result, I think we can close this as completed.
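For context on "hashes in the URL": PEP 503 lets an index append a fragment of the form #<hashname>=<hashvalue> to each file link, so a client can record the digest without downloading the file. The typing_extensions link in the resolver log earlier in this thread carries such a fragment, while the torch link from the cu113 index did not at the time. A minimal sketch of reading it, where hash_from_link is a name made up for this example rather than a pip or Pipenv function:

```python
from urllib.parse import urlsplit


def hash_from_link(url):
    """Return '<hashname>:<hexdigest>' from a PEP 503 style fragment, or None."""
    fragment = urlsplit(url).fragment
    name, sep, value = fragment.partition("=")
    if sep and name and value:
        return f"{name}:{value}"
    return None


if __name__ == "__main__":
    with_hash = (
        "https://files.pythonhosted.org/packages/45/6b/"
        "44f7f8f1e110027cf88956b59f2fad776cca7e1704396d043f89effd3a0e/"
        "typing_extensions-4.1.1-py3-none-any.whl"
        "#sha256=21c85e0fe4b9a155d0799430b0ad741cdce7e359660ccbd8b530613e8df88ce2"
    )
    without_hash = (
        "https://download.pytorch.org/whl/cu113/"
        "torch-1.11.0%2Bcu113-cp310-cp310-win_amd64.whl"
    )
    print(hash_from_link(with_hash))
    print(hash_from_link(without_hash))
```

When the fragment is present, locking only needs the index page, which fits the quick lock observed against PyPI at the start of this thread.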

matteius avatar Aug 26 '23 05:08 matteius

Happy to report that I worked on a PR to assist pytorch in adding hashes to the pytorch indexes, and they finished the back population today. Locking pytorch is now much faster on the latest pipenv versions.

matteius avatar Oct 06 '23 23:10 matteius