Parallelize removing deps in `poetry install --sync`

Open adriangb opened this issue 1 year ago • 8 comments

Issue Kind

Change in current behaviour

Description

Currently `poetry install --sync` removes dependencies one by one. Each removal is also somewhat slow (~1s; I'd think it should be ms). Uninstalling should be done in parallel and hopefully can be sped up.
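To make the request concrete, here is a minimal sketch of what "in parallel" could look like. It assumes each removal is independent and safe to run concurrently, which may not hold for the current implementation. `uninstall_all` and `uninstall_one` are hypothetical names for illustration, not Poetry APIs.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable


def uninstall_all(
    packages: Iterable[str],
    uninstall_one: Callable[[str], None],  # hypothetical per-package remover
    max_workers: int = 4,
) -> dict[str, BaseException | None]:
    """Run uninstall_one for every package concurrently.

    Returns a mapping of package name to the exception it raised,
    or None if the removal succeeded.
    """
    results: dict[str, BaseException | None] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all removals up front so they can overlap.
        futures = {pool.submit(uninstall_one, name): name for name in packages}
        for future in as_completed(futures):
            # exception() returns None when the call finished cleanly.
            results[futures[future]] = future.exception()
    return results
```

Collecting errors per package instead of failing fast means one broken removal does not block the rest of the sync.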

Impact

This enables a workflow where you cache all of the dependencies for a monorepo and then uninstall the ones you don't need in each service's Docker layer.

Workarounds

None that I know of. Just don't use the pattern.

adriangb avatar Apr 03 '24 05:04 adriangb

I see now that the source of this is https://github.com/python-poetry/poetry/issues/2658, but it seems like that was never fully investigated and a lot has changed in the internals of poetry since 2020 (e.g. not calling pip via subprocess, I think?). It seems a prime candidate to re-evaluate.

adriangb avatar Apr 03 '24 06:04 adriangb

This was certainly not an accident. https://github.com/python-poetry/poetry/blob/bd029ac0564b7822efbc903320e24ad021567a48/src/poetry/installation/executor.py#L190-L192

Uninstall is one of the few places where poetry does still call pip via a subprocess. I expect a pull request to reimplement uninstallation more directly would be desirable, but that is more work than just "don't act serially".

dimbleby avatar Apr 03 '24 06:04 dimbleby

Yeah I saw those lines, I was just hoping that "reimplement uninstallation more directly" had already been done so this would be easy. I guess not. Seems like that's what this issue is asking for then.

adriangb avatar Apr 03 '24 06:04 adriangb

@dimbleby if I wanted to start working on this, could you give me some pointers as to where to begin? I tried looking at how install does things but it's a lot of code. Some more general guidance around what to replace pip with would be helpful; I know at some point the internals of pip were broken out into libraries and I assume that's what's being used elsewhere but I don't know much more than that.

adriangb avatar May 27 '24 14:05 adriangb

I have started some PoC work on this by taking the uninstaller out of pip (there is no separate library for that yet) and putting it here. I haven't had much time to come back to it recently, but it might be a good start. I am still thinking that it could be a separate library one day, though.

Secrus avatar May 27 '24 14:05 Secrus

I guess conceptually the answer is: look at the package's dist-info directory in the venv, and remove the items named in the RECORD file. Though perhaps there are edge cases where things are more complicated than that, e.g. packages installed from git, maybe? Not sure.
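As a rough illustration of that idea only (this is not Poetry's or pip's code): `remove_distribution` is a hypothetical helper that reads RECORD as CSV, deletes each listed file, then prunes directories left empty. It does not try to handle the edge cases, such as editable installs or files that are not listed in RECORD.

```python
import csv
import os
from pathlib import Path


def remove_distribution(dist_info: Path) -> list[Path]:
    """Delete the files listed in <dist_info>/RECORD and return what was removed.

    Hypothetical helper for illustration; real uninstallers handle more cases.
    """
    site_packages = dist_info.parent.resolve()

    # RECORD rows are "path,hash,size"; paths are relative to the directory
    # that contains the .dist-info directory.
    with (dist_info / "RECORD").open(newline="", encoding="utf-8") as f:
        entries = [row[0] for row in csv.reader(f) if row]

    removed: list[Path] = []
    for entry in entries:
        # normpath collapses ".." segments (e.g. scripts installed into bin/).
        target = Path(os.path.normpath(site_packages / entry))
        if target.is_file() or target.is_symlink():
            target.unlink()
            removed.append(target)

    # Prune now-empty directories, deepest first, never above site-packages.
    parents = sorted({p.parent for p in removed}, key=lambda p: len(p.parts), reverse=True)
    for directory in parents:
        while site_packages in directory.parents:
            try:
                directory.rmdir()  # only succeeds if the directory is empty
            except OSError:
                break
            directory = directory.parent
    return removed
```

Since RECORD normally lists itself and the other dist-info files, the dist-info directory goes away as part of the same pass.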

Practically I expect that Secrus is on the right lines in looking at how other projects do their uninstalling. Others that I might consider taking ideas from would include pdm - here I think - and uv - here, I suppose

I agree it would be nice if this were pulled out into a nice library that we all could use

dimbleby avatar May 27 '24 15:05 dimbleby

Thank you folks. @Secrus do you plan on continuing work?

For context I'd like to use this for better caching in a monorepo setup. The idea is that you build a base or builder image with all of the deps and cache that (and you can tag it with the hash of poetry.lock or similar), then for each service you copy over the .venv and do a sync to minimize image size. At some point I'd like to present this at a conference or group.

adriangb avatar May 27 '24 18:05 adriangb

@adriangb Yes, but can't give any real timeline right now.

Secrus avatar May 27 '24 19:05 Secrus