Move from pkg_resources to importlib.metadata

Open jsirois opened this issue 5 years ago • 4 comments

As pointed out by @Eric-Arellano, the stdlib now supports reading .dist-info distribution metadata via importlib.metadata. That metadata contains the bits PEX needs at build time and runtime and currently gets via pkg_resources.

We will still need to vendor setuptools, since our vendored pip needs it to build legacy dists (most of the universe of dists). We would also need to vendor a backport in the pex distribution and in the .bootstrap of pexes that support 2.7 at runtime, so this isn't a clean win until pex drops support for 2.7.

See:

  • https://docs.python.org/3/library/importlib.metadata.html
  • https://pypi.org/project/importlib-metadata/
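
As a rough illustration of the kind of lookup involved (not Pex's actual code), the sketch below reads the name, version and requirements of a distribution from its .dist-info metadata using the stdlib importlib.metadata. It assumes Python 3.8+; on older interpreters the importlib-metadata backport linked above would be needed instead. The helper names summarize, find and demo, and the example-dist metadata, are made up for the example.

```python
import tempfile
from importlib import metadata  # stdlib since Python 3.8
from pathlib import Path


def summarize(dist):
    """Collect a few fields from a Distribution's METADATA file."""
    return {
        "name": dist.metadata["Name"],
        "version": dist.version,
        # `requires` is None when no requirements are declared.
        "requires": list(dist.requires or []),
    }


def find(name):
    """Look up an installed distribution by name; None if it is absent."""
    try:
        return summarize(metadata.distribution(name))
    except metadata.PackageNotFoundError:
        return None


def demo():
    """Write a throwaway .dist-info directory and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        dist_info = Path(tmp) / "example_dist-1.0.dist-info"
        dist_info.mkdir()
        (dist_info / "METADATA").write_text(
            "Metadata-Version: 2.1\n"
            "Name: example-dist\n"
            "Version: 1.0\n"
            "Requires-Dist: requests>=2\n"
        )
        # Read the metadata while the temporary directory still exists.
        return summarize(metadata.Distribution.at(dist_info))


if __name__ == "__main__":
    print(demo())
    print(find("no-such-distribution-xyz-123"))
```

The demo deliberately reads a synthetic .dist-info directory so it does not depend on what happens to be installed in the current environment.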

jsirois, Jan 05 '20 19:01

Any update on this?

ofek, Apr 03 '22 00:04

Pex still supports Python 2.7 users, so no. What is your particular interest @ofek? This should mostly be an implementation detail with ~no impact to Pex end users but I'd be interested to know how it does impact your use of Pex.

jsirois, Apr 03 '22 04:04

I am mostly curious if https://shiv.readthedocs.io/en/latest/history.html is still accurate regarding performance

ofek, Apr 03 '22 17:04

Not so much. For over a year now, Pex has supported a --venv mode that makes Pex almost as fast as Shiv. It provides the same short sys.path / standard venv layout as Shiv and, after the 1st run, the same lack of pkg_resources use.

For example:

Using latest of both tools:

$ shiv --version
shiv, version 1.0.1
$ pex -V
2.1.75

Building a moderately large zipapp with cold caches:

$ rm -rf ~/.cache/pip/ ~/.shiv && time shiv jupyter -c jupyter -o jupyter.shiv -q

real	0m18.742s
user	0m8.510s
sys	0m0.467s

$ rm -rf ~/.cache/pip ~/.pex && time pex jupyter -c jupyter -o jupyter.pex --venv

real	0m19.655s
user	0m31.852s
sys	0m2.575s

$ ls -l jupyter.*
-rwxr-xr-x 1 jsirois jsirois 25779520 Apr  4 08:50 jupyter.pex
-rwxr-xr-x 1 jsirois jsirois 24340007 Apr  4 08:51 jupyter.shiv

# 65 contained distributions:
$ pex-tools jupyter.pex info | jq -r '.distributions | keys[]' | wc -l
65

1st run with cold caches on the machine the shiv and PEX are shipped to:

$ rm -rf ~/.pex && time ./jupyter.pex --version
/home/jsirois/.pex/unzipped_pexes/cda2816363d1b0f40a6ce9be8d4ace5571b117cd/.bootstrap/pex/venv/pex.py:160: PEXWarning: Encountered collision building venv at /home/jsirois/.pex/venvs/s/a1e76ad1/venv from /home/jsirois/.pex/unzipped_pexes/cda2816363d1b0f40a6ce9be8d4ace5571b117cd:
1. /home/jsirois/.pex/venvs/cda2816363d1b0f40a6ce9be8d4ace5571b117cd/0c2af63c3815d1d03077ee9c1f2cbc64e6c7925d.f184664b904c42fea049ba765f5de90f/lib/python3.10/site-packages/jupyter.py was provided by:
	sha1:a51894839c7f787ae0683aaf42df53e4e301cb6b -> /home/jsirois/.pex/installed_wheels/f92a7b77b12537720b7a96bda9556dab84b0d5bc82023371a1abcd0dd292d002/jupyter-1.0.0-py2.py3-none-any.whl/jupyter.py
	sha1:c794757a5bb0726e1534369c212507214a826c48 -> /home/jsirois/.pex/installed_wheels/74d27484bb6a5dc9ae118bbbd2cc6a34ad7a6c2f12f6fb4154f8dd272ec4ffa5/jupyter_core-4.9.2-py3-none-any.whl/jupyter.py
  pex_warnings.warn(message)
Selected Jupyter core packages...
IPython          : 8.2.0
ipykernel        : 6.12.1
ipywidgets       : 7.7.0
jupyter_client   : 7.2.1
jupyter_core     : 4.9.2
jupyter_server   : not installed
jupyterlab       : not installed
nbclient         : 0.5.13
nbconvert        : 6.4.5
nbformat         : 5.3.0
notebook         : 6.4.10
qtconsole        : 5.3.0
traitlets        : 5.1.1

real	0m2.664s
user	0m2.395s
sys	0m0.255s

$ rm -rf ~/.shiv && time ./jupyter.shiv --version
Selected Jupyter core packages...
IPython          : 8.2.0
ipykernel        : 6.12.1
ipywidgets       : 7.7.0
jupyter_client   : 7.2.1
jupyter_core     : 4.9.2
jupyter_server   : not installed
jupyterlab       : not installed
nbclient         : 0.5.13
nbconvert        : 6.4.5
nbformat         : 5.3.0
notebook         : 6.4.10
qtconsole        : 5.3.0
traitlets        : 5.1.1

real	0m1.764s
user	0m1.577s
sys	0m0.167s

Subsequent runs on that remote machine:

$ hyperfine './jupyter.shiv --version' './jupyter.pex --version'
Benchmark 1: ./jupyter.shiv --version
  Time (mean ± σ):     423.7 ms ±   5.8 ms    [User: 392.9 ms, System: 31.1 ms]
  Range (min … max):   414.0 ms … 432.7 ms    10 runs
 
Benchmark 2: ./jupyter.pex --version
  Time (mean ± σ):     480.8 ms ±   9.0 ms    [User: 443.2 ms, System: 37.6 ms]
  Range (min … max):   470.2 ms … 503.5 ms    10 runs
 
Summary
  './jupyter.shiv --version' ran
    1.13 ± 0.03 times faster than './jupyter.pex --version'

So PEXes are roughly 13% slower now (1.13x per the summary above) when run via the PEX file for runs 2+. You can improve this with a 1-time install step on the remote machine, though:

$ PEX_TOOLS=1 ./jupyter.pex venv /tmp/install/my/pex/as/a/venv/here --collisions-ok
/home/jsirois/.pex/unzipped_pexes/cda2816363d1b0f40a6ce9be8d4ace5571b117cd/.bootstrap/pex/venv/pex.py:160: PEXWarning: Encountered collision building venv at /tmp/install/my/pex/as/a/venv/here from /home/jsirois/dev/pantsbuild/pex/./jupyter.pex:
1. /tmp/install/my/pex/as/a/venv/here/lib/python3.10/site-packages/jupyter.py was provided by:
	sha1:a51894839c7f787ae0683aaf42df53e4e301cb6b -> /home/jsirois/.pex/installed_wheels/f92a7b77b12537720b7a96bda9556dab84b0d5bc82023371a1abcd0dd292d002/jupyter-1.0.0-py2.py3-none-any.whl/jupyter.py
	sha1:c794757a5bb0726e1534369c212507214a826c48 -> /home/jsirois/.pex/installed_wheels/74d27484bb6a5dc9ae118bbbd2cc6a34ad7a6c2f12f6fb4154f8dd272ec4ffa5/jupyter_core-4.9.2-py3-none-any.whl/jupyter.py
  pex_warnings.warn(message)

$ hyperfine './jupyter.shiv --version' '/tmp/install/my/pex/as/a/venv/here/pex --version'
Benchmark 1: ./jupyter.shiv --version
  Time (mean ± σ):     427.1 ms ±   5.7 ms    [User: 399.6 ms, System: 27.9 ms]
  Range (min … max):   416.1 ms … 436.4 ms    10 runs
 
Benchmark 2: /tmp/install/my/pex/as/a/venv/here/pex --version
  Time (mean ± σ):     355.1 ms ±   3.6 ms    [User: 332.1 ms, System: 22.1 ms]
  Range (min … max):   348.5 ms … 360.7 ms    10 runs
 
Summary
  '/tmp/install/my/pex/as/a/venv/here/pex --version' ran
    1.20 ± 0.02 times faster than './jupyter.shiv --version'
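
The "times faster" figures in the two summaries can be re-derived from the reported means and standard deviations. The sketch below assumes the uncertainty comes from standard first-order error propagation for a ratio of independent measurements; that assumption reproduces the printed values here, but it is not taken from this thread. The function name relative_speed is made up for the example.

```python
import math


def relative_speed(slow_mean, slow_sd, fast_mean, fast_sd):
    """Return (ratio, uncertainty) for how much faster the fast run is.

    The uncertainty uses first-order error propagation for a ratio,
    assuming the two measurements are independent.
    """
    ratio = slow_mean / fast_mean
    rel_err = math.sqrt((slow_sd / slow_mean) ** 2 + (fast_sd / fast_mean) ** 2)
    return ratio, ratio * rel_err


# Means and standard deviations (ms) copied from the benchmark output above.
pex_file_vs_shiv = relative_speed(480.8, 9.0, 423.7, 5.8)
shiv_vs_pex_venv = relative_speed(427.1, 5.7, 355.1, 3.6)

if __name__ == "__main__":
    for label, (ratio, err) in [
        ("shiv vs. jupyter.pex", pex_file_vs_shiv),
        ("installed pex venv vs. shiv", shiv_vs_pex_venv),
    ]:
        print(f"{label}: {ratio:.2f} ± {err:.2f} times faster")
```
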

So, depending on your needs, Pex may be more compelling now (though Shiv is almost certainly the better choice for AWS Lambdas, which run once). Note that Pex retains its traditional features alongside these new performance gains & tools; namely:

  • It supports multi-interpreter / multi-platform PEX files (1 file that works with Python 3.7-3.10 and on macOS & Linux - for example).
  • It strictly isolates your PEXed app from the environment. There will be 0 extra scripts or distributions installed in or visible to the PEX --venv unless you use other Pex features to let these leak in.

jsirois, Apr 04 '22 16:04

Pex stopped using pkg_resources in #1768. It still vendors setuptools for reasons mentioned above, but it never imports from setuptools and only imports from pkg_resources when a user-specified dependency in a PEX file requires pkg_resources but did not declare the dependency. In that case the PEX imports its vendored pkg_resources for the user on their behalf.
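
One hedged way to check from inside a running application whether pkg_resources ended up being imported is to inspect sys.modules. The sketch below only looks for a top-level module named pkg_resources; how Pex exposes its vendored copy is not described in this thread, so treat that name match as an assumption. The function name pkg_resources_status is made up for the example.

```python
import sys


def pkg_resources_status(modules=None):
    """Report whether a top-level pkg_resources module has been imported.

    `modules` defaults to sys.modules; a plain dict can be passed instead,
    which keeps the check easy to exercise in isolation.
    """
    modules = sys.modules if modules is None else modules
    loaded = sorted(
        name
        for name in modules
        if name == "pkg_resources" or name.startswith("pkg_resources.")
    )
    return {"imported": bool(loaded), "modules": loaded}


if __name__ == "__main__":
    print(pkg_resources_status())
```

Calling this at the end of an application's startup gives a quick signal of whether any dependency pulled pkg_resources in.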

jsirois, Aug 14 '24 20:08

Nice!

ofek, Aug 14 '24 20:08