Poetry is extremely slow when resolving the dependencies
- [x] I am on the latest Poetry version.
- [x] I have searched the issues of this repo and believe that this is not a duplicate.
- [x] If an exception occurs when executing a command, I executed it again in debug mode (`-vvv` option).
- OS version and name: CentOS 7
- Poetry version: 1.0.0
- Link of a Gist with the contents of your pyproject.toml file: https://gist.github.com/qiuwei/a0c7eee89e5e8d75edb477858213c30b
Issue
I created an empty project and ran `poetry add allennlp`. It takes ages to resolve the dependencies.
Could this be due to downloading packages from pypi to inspect their dependencies, when not properly specified?
> Could this be due to downloading packages from pypi to inspect their dependencies, when not properly specified?
It seems so. I checked the detailed log: Poetry kept retrying to resolve the dependency for botocore without success. So I assume the dependency would eventually be resolved, given enough time.
However, is there any way to get around this?
BTW, I also think it would be better to emit a warning if some dependencies are not properly specified and could not be resolved after a number of attempts.
Hi,
I'm encountering a similar problem on macOS. The Python version used is 3.7.6, Poetry is 1.0.5. I just created a new project with no dependencies so far in pyproject.toml, just pytest initially. It takes ages until the new virtualenv is set up with all 11 packages installed.
Running it with -vvv does not bring any new findings.
Regards, Thomas
Yes, I'm running into the same problem. Resolving dependencies takes forever. I tried using a VPN to get through the GFW; nevertheless, it is still not working. I also tried changing the pip source and pointing to a local source in the toml file; neither works. It's driving me nuts.
same here...😱
Same here. I just created an empty project, then ran `poetry install`, and it takes so much time to resolve dependencies.
I'm currently using this workaround:

```shell
poetry export -f requirements.txt > requirements.txt
python -m pip install -r requirements.txt
poetry install
```

Installation takes far less time since all deps are already installed locally. Make sure to run `poetry shell` first, so you access the created virtual environment and install into it instead of the user/global path.
Poetry being slow to resolve dependencies seems to be a recurring issue:
- #476 - Poetry resolving dependencies is amazingly slow
- #819 - Resolving dependencies are slow
- #832 - Poetry update never finishes resolve and Poetry show --outdated hangs
- #1047 - Poetry doesn't resolve on MacOs Mojave
Maybe there is a dependency conflict.
No conflict. Poetry is slow as hell.
First of all, I want to say there is ongoing work to improve the dependency resolution.
However, there is only so much Poetry can do given the current state of the Python ecosystem. I invite you to read https://python-poetry.org/docs/faq/#why-is-the-dependency-resolution-process-slow to learn a little more about why dependency resolution can be slow.
If you report that Poetry is slow, we would appreciate a pyproject.toml
that reproduces the issue so we can debug what's going on and if it's on Poetry's end or just the expected behavior.
@gagarine Could you provide the pyproject.toml
file you are using?
It takes about 2 minutes to resolve dependencies after adding newspaper3k to a fresh project. Connection: 40 ms ping and 10 Mb/s down.
pyproject.toml:

```toml
[tool.poetry]
name = "datafox"
version = "0.1.0"
description = ""
authors = ["Me <[email protected]>"]

[tool.poetry.dependencies]
python = "^3.8"
newspaper3k = "^0.2.8"

[tool.poetry.dev-dependencies]
pytest = "^5.2"

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"
```
Hey dudes - as Sebastian implied, the root cause is the Python ecosystem's inconsistent/incomplete way of specifying dependencies and package metadata. Unfortunately, the PyPI team is treating this as a won't-fix.
In particular, with the PyPI JSON endpoint, an empty dependency list could either mean "no dependencies" or "dependencies not specified". The PyPI team doesn't want to differentiate between these two cases, for reasons I don't follow.
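The ambiguity can be sketched with fabricated JSON payloads shaped like PyPI's `/pypi/<name>/json` responses (package names here are made up): `requires_dist` looks identical whether the package truly has no dependencies or simply never declared them.

```python
import json

# Two fabricated responses with identical-looking metadata: one package
# genuinely has no dependencies, the other just never declared any.
no_deps = json.loads('{"info": {"name": "truly-standalone", "requires_dist": null}}')
unknown_deps = json.loads('{"info": {"name": "legacy-sdist-only", "requires_dist": null}}')

def declared_dependencies(payload):
    """Return the declared deps, or None when the metadata is silent."""
    return payload["info"]["requires_dist"]

# Both cases come back as None, so a resolver cannot tell them apart
# and must download and inspect the package itself to be sure.
print(declared_dependencies(no_deps))       # None
print(declared_dependencies(unknown_deps))  # None
```

This is exactly why a resolver that wants correct answers ends up fetching and inspecting distributions instead of trusting the index.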
The solution is to work around this by maintaining a separate cache from PyPI that properly handles this distinction, and perhaps refusing to use packages that don't properly specify deps. However, this latter aspect may be tough, due to long dependency chains.
Python's grown a lot over the decades, and it has much remaining from its early days. There's a culture of no-breaking-changes at any cost.
Having to run arbitrary python code to find dependencies is fucked, but .... we can do this for each noncompliant package, and save it.
First, it's capitalized PyPI.
Second, there is no way for PyPI to know dependencies for all packages without executing arbitrary code -- which is difficult to do safely and expensive (computationally and financially). PyPI is run on donated infrastructure from sponsors, maintained by volunteers and does not have millions of dollars of funding like many other language ecosystems' package indexes.
For anyone interested in further reading, here's an article written by a PyPI admin on this topic: https://dustingram.com/articles/2018/03/05/why-pypi-doesnt-know-dependencies/
It's not as tough as you imply.
You accept some risk by running the arbitrary code, but accepting things as they are isn't the right approach. We're already forcing this on anyone who installs Python packages; it's what triggers the delays cited in this thread.
I have the above repo running on a $10/month Heroku plan, and it works well.
I've made the assumption that if dependencies are specified, they're specified correctly, so only check the ones that show as having no deps. This won't work every time, but does in a large majority of cases.
Related: Projects like Poetry are already taking a swing at preventing this in the future: Specifying deps in pyproject.toml
, Pipfile
etc.
A personal Heroku app is not going to be as valuable a target as PyPI would be. Neither is a $10/month Heroku app going to be able to support the millions of API requests that PyPI gets every day. The problem isn't in writing a script to run a setup.py file in a sandbox, but in the logistics and challenges of providing it for the entire ecosystem.
"It works 90% of the time" is not an approach that can be taken by the canonical package index (which has to be used by everyone) but can be taken by specific tools (which users opt into using). Similar to how poetry
can use an AST parser for setup.py files which works >90% of the time, to avoid the overhead of a subprocess call, but pip shouldn't.
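As a rough illustration of the AST approach mentioned above (this is a simplified sketch, not Poetry's actual implementation), the stdlib `ast` module can statically pull `install_requires` out of a `setup()` call without executing the file; anything computed dynamically falls through to `None`, signalling a fallback to running setup.py:

```python
import ast

# A hypothetical setup.py with statically declared requirements.
SETUP_PY = """
from setuptools import setup
setup(
    name="example",
    install_requires=["requests>=2.0", "click"],
)
"""

def static_install_requires(source):
    """Extract install_requires from a setup() call without executing it."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Look for a call whose function is the bare name `setup`.
        if isinstance(node, ast.Call) and getattr(node.func, "id", "") == "setup":
            for kw in node.keywords:
                if kw.arg == "install_requires" and isinstance(kw.value, ast.List):
                    return [ast.literal_eval(elt) for elt in kw.value.elts]
    return None  # dynamically computed -> would need to execute setup.py

print(static_install_requires(SETUP_PY))  # ['requests>=2.0', 'click']
```

The >90% figure makes sense under this scheme: most setup.py files declare requirements as literal lists, and the rest build them at runtime, which no static parser can recover.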
Anyway, I wanted to call out that "just blame PyPI folks because they don't care/are lazy" is straight up wrong IMO -- there are reasons that things are the way they are. That doesn't mean we shouldn't improve them, but it's important to understand why we're where we are. I'm going to step away now.
Before you step away - can you think of a reason PyPI shouldn't differentiate between no dependencies and missing dependency data?
If going through existing releases is too bold, what about for new ones?
I'm new to (more serious) Python and don't understand the big drama. Yet setup.py
seems like a powerful and very bad idea. Is dependency management terrible in Python because of setup.py?
Can someone post a couple of examples where a text file is not enough and setup.py
was absolutely necessary?
"is a feature that has enabled better compatibility across an increasingly broad spectrum of platforms."
Cargo does it like this: https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#platform-specific-dependencies - is this not enough for Python?
Why doesn't Poetry create its own package repository, avoiding setup.py and using its own dependency declaration? It could take time... but a bot could automate the pull requests on most Python modules based on the kind of techniques used in https://github.com/David-OConnor/pydeps
I think the root cause is that Python's been around for a while and tries to maintain backwards compatibility. I agree - setup.py
isn't an elegant way to do things, and a file that declares dependencies and metadata is a better system. The wheel format specifies dependencies in a METADATA
file, but there are still many older packages that don't use this format.
As a new language, Rust
benefited from learning from the successes and failures of existing ones, i.e. it has nice tools like Cargo, docs, clippy, fmt, etc. It's possible to implement tools/defaults like this for Python, but it involves big changes and potential backwards incompatibility. There are equivalents for many of these (pyproject.toml
, black
, etc.), but they're not officially supported or widely adopted. Look at how long it took Python 3 to be widely adopted for a taste of the challenge.
> Can someone post a couple of examples where a txt file is not enough and setup.py was absolutely necessary?
Not absolutely necessary, but helpful in the following scenario:
- A package has extra A and B
- Extra B needs extra A
With setup.py, you can follow the DRY principle:

```python
requires_a = ('some', 'thing')
requires_b = requires_a + ('foo', 'bar')
```
For requirements.txt, on the one hand I'm not sure how you would denote extras at all, and even if you can, you would need to repeat the requirements of A within the requirements of B. This is prone to human error.
However, while building the package, the package builder could output a text file containing those requirements.
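The two-extras scenario above can be sketched with setuptools' `extras_require` mapping (the package and extra names here are made up): extra "b" reuses extra "a"'s requirement list, so each requirement is declared only once.

```python
# Requirements for each extra, declared once and composed (DRY).
requires_a = ["some", "thing"]
requires_b = requires_a + ["foo", "bar"]  # extra B needs everything in extra A

# The mapping that would be passed as extras_require= to setuptools.setup().
extras = {"a": requires_a, "b": requires_b}

# `pip install package[b]` would then pull in all four requirements.
print(extras["b"])  # ['some', 'thing', 'foo', 'bar']
```

With a flat requirements.txt there is no place to express that "b" is a superset of "a"; the composition has to happen in code at build time.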
> Why poetry does not create their own package repository
You mean replacing PyPI? Good luck with that. I analyzed the packages on PyPI in January (PyPI Analysis 2020):
- 208,492 packages in total
- 2,957 had a pyproject.toml
- 1,511 specified poetry as a tool
I also gave a course about packaging in Python this year to PhD students. They simply want to share their work with a broad audience. I only mentioned poetry briefly because it is such a niche right now.
Changing a big, working system is hard. It took Python 2 -> 3 about 12 years and it is still not completely finished.
Hi,
I would like to invite everyone interested in how dependencies should be declared to this discussion on python.org
fin swimmer
@finswimmer I checked the discussion. Seems like they are reinventing the wheel instead of copying something that works (Composer, Cargo, ...).
> For requirements.txt, I'm on the one hand not sure how you denote extras at all and even if you can, you would need to repeat the requirements of a within the requirements of b. This is prone to human error.
For sure requirements.txt is not good.
> You mean replacing PyPI? Good luck with that.
Yes. But why make Poetry if not to replace PyPI and requirements.txt?
If Poetry is compatible with PyPI, there is no incentive to add a pyproject.toml. Perhaps I don't even know I should add one. Now if, every time I try to install a package that has no pyproject.toml, the command line proposed opening an issue on that project with a ready-to-use template, this could speed things up.
> Can you think of a reason PyPI shouldn't differentiate between no dependencies, and missing dependency data?
It'd be more productive to file an issue on https://github.com/pypa/warehouse, to ask this. There's either a good reason, or PyPI would be open to adding this functionality. In the latter case, depending on how the details work out, it might need to be standardized like pyproject.toml was before poetry adopted it, so that the entire ecosystem can depend on and utilize it.
> Yes. But why making poetry if it's not to replace PyPI and requirements.txt?
You seem to confuse multiple parts of the ecosystem. I would distinguish those entities:
- The software which people want to share
- Software Repository: The platform on which people want to share it (e.g. PyPI)
- Package Format: The format in which they want to share it (e.g. wheels)
- Package Builder: The software people want to use to build the package (setuptools / poetry)
- Package Uploader: The software people want to use to upload it (twine / poetry)
- Package Manager: The software people want to use to install the package (pip / poetry) and its dependencies.
- Environment Manager: The software people use to encapsulate (pipenv / poetry)
Under the hood, I think, poetry uses a couple of those base tools. It is just meant to show a more consistent interface to the user.
I realise that now, as I mention in #2338. I'm therefore not that interested in poetry at the moment. I thought it was like Composer and https://packagist.org, but it looks mostly like a wrapper around different legacy tools.
> [poetry] looks mostly like a wrapper around differents legacy tools
That is not the case. All the tools I've mentioned are widespread, used by a majority of Python developers, and under active development. Yes, some of the tools are old - pip, for example, is 9 years old. But old is not the same as legacy. The hammer is an old tool, and still people use it. Why? Because it does the job it was designed for.
I don't know PHP well enough to be sure, but I think packagist.org is for PHP what pypi.org is for Python. Composer seems to be a package manager and thus comparable to pip. As composer also supports dependency management during project development, it fills a similar niche as poetry does.
I figured I would add more to this issue. It's taking more than 20 minutes for me:
```
gcoakes@workstation ~/s/sys-expect (master) [1]> time poetry add --dev 'pytest-asyncio'
The currently activated Python version 3.7.7 is not supported by the project (^3.8).
Trying to find and use a compatible version.
Using python3.8 (3.8.2)
Using version ^0.12.0 for pytest-asyncio

Updating dependencies
Resolving dependencies... (655.1s)

Writing lock file

Package operations: 1 install, 0 updates, 0 removals

  - Installing pytest-asyncio (0.12.0)

________________________________________________________
Executed in   20.98 mins    fish           external
   usr time    4.96 secs    0.00 micros    4.96 secs
   sys time    0.35 secs  560.00 micros    0.35 secs
```
This is the pyproject.toml:
```toml
[tool.poetry]
name = "sys-expect"
version = "0.1.0"
description = ""
readme = "README.md"
include = [
    "sys_expect/**/*.html",
    "sys_expect/**/*.js",
]

[tool.poetry.dependencies]
python = "^3.8"
pyyaml = "^5.3.1"
serde = "^0.8.0"
aiohttp = "^3.6.2"
async_lru = "^1.0.2"
astunparse = "^1.6.3"
coloredlogs = "^14.0"
aiofiles = "^0.5.0"

[tool.poetry.dev-dependencies]
pytest = "^5.4"
black = "^19.10b0"
isort = { version = "^4.3.21", extras = ["pyproject"] }
flakehell = "^0.3.3"
flake8-bugbear = "^20.1"
flake8-mypy = "^17.8"
flake8-builtins = "^1.5"
coverage = "^5.1"
pytest-asyncio = "^0.12.0"

[tool.poetry.scripts]
sys-expect = 'sys_expect.cli:run'

[tool.isort]
multi_line_output = 3
include_trailing_comma = true
force_grid_wrap = 0
use_parentheses = true
line_length = 88

[tool.flakehell.plugins]
pyflakes = ["+*"]
flake8-bugbear = ["+*"]
flake8-mypy = ["+*"]
flake8-builtins = ["+*"]

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"
```
Question: What exactly does `poetry` do extra here that makes it so much slower than `pip`'s dependency resolution? Does it actually put in a lot of extra effort to figure out dependencies in situations that `pip` doesn't?
Edit: it doesn't -
> As it is now, pip doesn't have true dependency resolution, but instead simply uses the first specification it finds for a project.
https://pip.pypa.io/en/stable/user_guide/#requirements-files
Pip doesn't have dependency resolution.