
Improving pipenv lock performance

Open fbertola opened this issue 6 years ago • 3 comments

Hi! I was recently bitten by the infamous slowness of the lock command. I was thinking that I could improve performance by pre-calculating the hashes and dependencies of all our company's projects and modifying the underlying implementation a bit. Scaling this up to the whole project, I think it would be overkill to do that a priori for the whole PyPI database, but maybe we could add a mechanism that uploads your local cache to a central location every now and then. It wouldn't be that hard to develop and maintain (provided there is a place to store the data) and, with time, it would cover pretty much all the Python packages currently in use. I could surely help in this regard, as I'm doing something similar in my spare time.
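To make the pre-calculation idea concrete, here is a minimal sketch, not pipenv's actual implementation. It hashes every artifact in a local directory and merges the result into a shared JSON file. The function names, the directory layout, and the use of a plain JSON file as the "central location" are all assumptions for illustration.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_hash_index(artifact_dir: Path) -> dict:
    """Map each artifact filename to a 'sha256:<hex>' string."""
    return {
        p.name: f"sha256:{sha256_of(p)}"
        for p in sorted(artifact_dir.iterdir())
        if p.is_file()
    }


def merge_into_shared_index(local: dict, shared_path: Path) -> dict:
    """Merge a local index into a shared JSON file.

    The shared file is a hypothetical stand-in for a central store.
    """
    shared = json.loads(shared_path.read_text()) if shared_path.exists() else {}
    shared.update(local)
    shared_path.write_text(json.dumps(shared, indent=2, sort_keys=True))
    return shared
```

A real service would also need to decide whom it trusts to upload hashes, which this sketch does not address.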

What do you think?

fbertola avatar Dec 31 '18 15:12 fbertola

It would be nice if PyPI did that hash calculation, and even the dependency resolution. :D

But it doesn't, so we can build our own. It would be so cool if we could offer dependency resolution as a service, and I'm definitely +1 for it!

jxltom avatar Jan 03 '19 01:01 jxltom

In addition to caching hashes from PyPI (pip already caches downloaded packages), it's probably worth caching the set of dependencies of local editable package installs. We have a project with a load of editable packages installed in a venv, which is nice to work in. However, pipenv lock takes forever, and part of this is that it queries each editable package multiple times (which builds a wheel, etc.). On our pretty beefy workstations each iteration takes ~1s, which adds up very quickly with many editable packages and many pipenv passes as it attempts to find a graph of packages that work together.
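A rough sketch of what such a cache could look like, assuming the dependency set only changes when a project's metadata files change. The `resolve` callable is a hypothetical stand-in for the expensive wheel-building query. The class name and the list of metadata files are also assumptions, not anything pipenv exposes.

```python
from pathlib import Path
from typing import Callable, Dict, List, Tuple

# Files whose modification usually signals changed dependency metadata (an assumption).
METADATA_FILES = ("pyproject.toml", "setup.py", "setup.cfg")


def fingerprint(project_dir: Path) -> Tuple[Tuple[str, int], ...]:
    """Return (filename, mtime_ns) pairs for the metadata files that exist."""
    return tuple(
        (name, (project_dir / name).stat().st_mtime_ns)
        for name in METADATA_FILES
        if (project_dir / name).exists()
    )


class EditableDependencyCache:
    """Memoize an expensive dependency lookup per editable project."""

    def __init__(self, resolve: Callable[[Path], List[str]]) -> None:
        self._resolve = resolve  # hypothetical stand-in for the wheel-building query
        self._entries: Dict[Path, Tuple[tuple, List[str]]] = {}

    def dependencies(self, project_dir: Path) -> List[str]:
        key = project_dir.resolve()
        current = fingerprint(key)
        cached = self._entries.get(key)
        if cached is not None and cached[0] == current:
            return cached[1]  # metadata unchanged: skip the expensive call
        deps = self._resolve(key)
        self._entries[key] = (current, deps)
        return deps
```

With something like this, repeated resolver passes over the same editable package would pay the ~1s cost once per metadata change rather than once per query.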

stewartmiles avatar Aug 18 '22 21:08 stewartmiles

I think there are two primary things we can do to improve performance overall:
1.) Batch up the installs in batch_install instead of writing a temp file per requirement and passing that into a separate pip subprocess -- there is already a working prototype PR for this (the implication is we drop the status bar): https://github.com/pypa/pipenv/pull/5301
2.) Allow more index servers than just pypi.org to support the same style of JSON API that lets us quickly retrieve package hashes. I worked on a prototype of this as well, but it also involved converting the test runner to the pypiserver project; that is also available for review: https://github.com/pypa/pipenv/pull/5284
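For context on the second point, here is a small sketch of reading hashes from a release document shaped like the one pypi.org serves at `/pypi/<name>/<version>/json`, where each entry under `urls` carries a `digests` mapping. This is not the code from the PR. The helper names are made up for illustration, and whether another index server exposes the same path and shape is an assumption.

```python
import json
from typing import List
from urllib.request import urlopen


def release_url(index_base: str, name: str, version: str) -> str:
    """Build the per-release JSON URL in the style used by pypi.org."""
    return f"{index_base.rstrip('/')}/pypi/{name}/{version}/json"


def hashes_from_payload(payload: dict) -> List[str]:
    """Collect 'sha256:<hex>' strings for every file listed in a release payload."""
    return sorted(
        f"sha256:{entry['digests']['sha256']}"
        for entry in payload.get("urls", [])
        if "sha256" in entry.get("digests", {})
    )


def fetch_hashes(index_base: str, name: str, version: str) -> List[str]:
    """Fetch one release document and return its file hashes (needs network access)."""
    with urlopen(release_url(index_base, name, version), timeout=10) as resp:
        return hashes_from_payload(json.load(resp))
```

One request per release replaces downloading and hashing each file locally, which is why supporting this API on more servers would help.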

There may be other things, but I see those as the big two, and I have performance numbers on the #1 PR that are quite promising. Open to feedback, so let me know what you think.

matteius avatar Aug 28 '22 01:08 matteius

image https://lincolnloop.github.io/python-package-manager-shootout/

matteius avatar Mar 04 '23 04:03 matteius