pex
Avoid Resolving Dependencies When Building in Intransitive Mode
I am building huge PEX files with more than 100 dependencies. The process takes around 30 minutes. It seems like most of the time is spent resolving dependencies.
Even when building with --intransitive and providing all necessary dependencies with --requirement, a lot of the time seems to be spent resolving these.
How can we improve this situation? With a few pointers I can contribute to this.
This is an example of what I am running:
service-bin: project-image
	@./bin/dockerized \
		mkdir -p dist && \
		poetry export --format requirements.txt --without-hashes --output dist/requirements.txt --extras service && \
		poetry build --format wheel && \
		pex \
			-v \
			--intransitive \
			--no-pypi \
			--index "https://${PYPI_USER}:${PYPI_TOKEN}@${PYPI_HOST}" \
			--requirement dist/requirements.txt \
			--entry-point ${PACKAGE_NAME}.service.main \
			--output-file dist/server.pex \
			--pex-root /root/.cache/pex \
			"dist/${PACKAGE_NAME_VERSIONED}.whl"
The exported requirements.txt includes transitive dependencies with fully defined versions.
Thanks for filing this @adeandrade.
To clarify the terminology, typically we talk of two different files: requirements.txt, which contains the requirements your code directly depends on, and then the optional constraints.txt (sometimes referred to as a lockfile), which contains not just those requirements but also all their transitive requirements.
Requirements may be loose (e.g., foo>=2.5.1), but constraints must be pinned (e.g., foo==2.5.1).
Pip uses the constraints file to pick versions of dependencies as it resolves the requirements (so if you specify constraints you must also specify the underlying direct requirements).
Requirement files are passed to pex using --requirement and constraint files are passed using --constraints. These correspond to the pip --requirement and --constraint flags (the extra 's' at the end of pex's --constraints flag is an unfortunate slip-up).
It sounds like your requirements.txt is actually a constraints file? Is it comprehensive, and are all requirements in it pinned to a single version? Could you post it here if it's not secret? Or a redacted version at least?
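One quick way to answer the "are all requirements pinned?" question is to scan the exported file. The following is a minimal sketch, not part of pex or pip: the function name is hypothetical, and the parsing is deliberately simplified (it skips option lines such as --index-url, strips comments and environment markers, and does not handle URL or path requirements).

```python
import re

# Matches "name==version" with optional extras, e.g. "foo[bar]==2.5.1".
# Wildcard pins such as "foo==1.*" are intentionally rejected.
_PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s=*<>!~,;]+$"
)


def unpinned_requirements(text):
    """Return the requirement specs in `text` that are not pinned with `==`."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and option lines
            continue
        spec = line.split(";", 1)[0].strip()  # drop environment markers
        if not _PINNED.match(spec):
            unpinned.append(spec)
    return unpinned
```

Feeding it the contents of dist/requirements.txt and getting back an empty list would suggest the file is pinned throughout; it says nothing about whether the file is comprehensive.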
> then the optional constraints.txt (sometimes referred to as a lockfile) which contains not just those requirements but also all their transitive requirements.
Nuance: they can contain all the transitive requirements, but need not. A constraints.txt file could technically have only one entry, for example, which means everything else is unconstrained.
This nuance is important. It impacts whether --intransitive will work or not.
> but constraints must be pinned
This isn't true. Constraints can be any value normally in requirements.txt, e.g. >=3.5. All the constraints file does is substitute the requirement string normally used with the constraint value.
Thank you @benjyw and @Eric-Arellano for the responses. I guess my requirements.txt file is also a constraints file since it is derived from a lock file. All dependencies are expressed with equalities.
By specifying --constraints (as well as --requirement, since I still have the --intransitive flag) building time was reduced by 25%. That's good news.
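For anyone wanting to reproduce this, the invocation described above can be sketched as a small helper that assembles the argument list. This is only an illustration: the helper name and the example paths are hypothetical, and the flags are limited to ones already mentioned in this thread.

```python
def pex_build_args(wheel, requirements, constraints, entry_point, output):
    """Assemble a pex command line that passes both requirements and constraints."""
    return [
        "pex",
        "--intransitive",
        "--requirement", requirements,
        "--constraints", constraints,
        "--entry-point", entry_point,
        "--output-file", output,
        wheel,
    ]


# Hypothetical example: the same fully pinned export serves as both files.
args = pex_build_args(
    wheel="dist/mypkg-1.0-py3-none-any.whl",
    requirements="dist/requirements.txt",
    constraints="dist/requirements.txt",
    entry_point="mypkg.service.main",
    output="dist/server.pex",
)
```

The resulting list can be handed to subprocess.run(args, check=True) in an environment where pex is installed.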
I still see some resolving going on in the logs though. It still takes more than 10 mins on my project (when dependencies are cached). Can we do better?
I guess it depends what exactly is happening in the time attributed to resolving. For example, downloading the dists can take time, and if they are sdists then pip has to run setup() on them, and that can take a while - for example, in some cases a lot of native code compilation has to happen. These results can be cached by pip, but it's possible that cache isn't being preserved across runs. E.g., CI machines typically present you with a clean container on every run, and you have to configure specific directories to be saved and restored between runs.
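To get a rough sense of how many dependencies arrive as sdists (and so need a build step), one option is to classify the downloaded files by extension. This is a sketch under assumptions: the function name is hypothetical, and the file names would come from wherever the dists were downloaded, for example a directory filled by pip download.

```python
# File suffixes commonly used for source distributions.
SDIST_SUFFIXES = (".tar.gz", ".tar.bz2", ".tgz", ".zip")


def split_dists(filenames):
    """Split distribution file names into (wheels, sdists); other files are ignored."""
    wheels, sdists = [], []
    for name in filenames:
        if name.endswith(".whl"):
            wheels.append(name)
        elif name.endswith(SDIST_SUFFIXES):
            sdists.append(name)
    return wheels, sdists
```

A long sdists list would point at build time rather than resolution as the main cost.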
Can you post some snippets of those logs? And are you seeing this phenomenon on developer laptops, or CI machines, or both?
This issue is related to #1086, which discusses the differences between a constraints file and a lockfile as well. Consuming an unchanged lockfile implementation would allow for a zero-resolve "fetch and validate the fingerprint of these precise wheels" step to reproduce the output of the resolve without running it.
The focus here has been on resolving, but I think the ~OP shows that's a detour around the block: https://github.com/pex-tool/pex/issues/1093#issuecomment-720192890
IIUC @adeandrade just wants to make a PEX from their already working Poetry setup: roughly - turn a Poetry venv into a PEX. If I'm right there, that totally sidesteps any normal concept of resolving, Pex just needs to be able to slurp up a venv into a PEX file. That idea is tracked by #1361. It's been far too long, but if you are able to chime in on this assessment @adeandrade, I'd be grateful.