Feature Idea: Support generating a single merged pex from a PEX_PATH

Open jsirois opened this issue 7 years ago • 5 comments

I.e., support something like:

  • When creating a pex: pex --pex-path=a.pex:b.pex --merged-output-file=merged.pex
  • Using an existing pex that contains a PexInfo.pex_path: PEX_MERGED_OUTPUT_FILE=mymergedpex ./mypex

Here --merged-output-file is used instead of -o or --output-file, and PEX_MERGED_OUTPUT_FILE is the environment-variable form.

jsirois, Mar 14 '18 20:03

If/when this is added, it would probably be a good idea to note in the help for -o that it does not load the PEX_PATH.

cosmicexplorer, Mar 14 '18 21:03

How exactly to word all this is a bit fraught for sure. Current relevant help:

pex -h
...
Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -o PEX_NAME, --output-file=PEX_NAME
                        The name of the generated .pex file: Omiting this will
                        run PEX immediately and not save it to a file.
...
Resolver options:
    Tailor how to find, resolve and translate the packages that get put
    into the PEX environment.

    --pypi, --no-pypi, --no-index
                        Whether to use pypi to resolve dependencies; Default:
                        use pypi
    --pex-path=PEX_PATH
                        A colon separated list of other pex files to merge
                        into the runtime environment.
...

A complicating phrase here is "PEX environment", and a complicating concept is that --pex-path sits under the Resolver options heading. That in itself is perhaps fine, but it is confusing when combined with the knowledge that non-pex requirements are resolved into the built pex file, unlike PEX_PATH requirements, which are only ever adjoined to the runtime sys.path.

jsirois, Mar 14 '18 21:03

what's the use case? just incremental binary goal build perf for pants?

it's worth noting that PEX_PATH remains a "best effort" thing given the proclivity for conflicts between various pex files' 3rdparty requirements. IIRC, we effectively just bootstrap each pex in order and layer on the sys.path mutations. this seems fine for controlled runtime cases, but for sealed binary builds maybe less so?
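To make the "layer on the sys.path mutations" point concrete, here is a minimal sketch of how in-order layering lets the first pex silently win a conflict. This is a toy model, not PEX's actual bootstrap code: the function names, the path strings, and the `contents` mapping are all hypothetical and exist only for illustration.

```python
from typing import Dict, List, Optional


def layer_sys_path(base_path: List[str], pex_entries: List[List[str]]) -> List[str]:
    """Append each pex's path entries in order, skipping entries already present."""
    layered = list(base_path)
    for entries in pex_entries:
        for entry in entries:
            if entry not in layered:
                layered.append(entry)
    return layered


def first_provider(
    layered: List[str], contents: Dict[str, List[str]], module: str
) -> Optional[str]:
    """Return the first path entry providing `module`, mimicking import order."""
    for entry in layered:
        if module in contents.get(entry, []):
            return entry
    return None


# Hypothetical layout: two pex files that each bundle a different `requests`.
a_entries = ["a.pex/.deps/requests-2.18"]
b_entries = ["b.pex/.deps/requests-2.9", "b.pex/.deps/six-1.11"]
contents = {
    "a.pex/.deps/requests-2.18": ["requests"],
    "b.pex/.deps/requests-2.9": ["requests"],
    "b.pex/.deps/six-1.11": ["six"],
}

layered = layer_sys_path(["app"], [a_entries, b_entries])
winner = first_provider(layered, contents, "requests")
print(layered)
print(winner)
```

In this model both copies of `requests` end up on the path, and whichever pex was layered first shadows the other without any error being raised. That is tolerable when the runtime is controlled, but it is the kind of ambiguity a sealed binary would ideally rule out at build time.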

ultimately, it's not clear to me how to best combine 1..N pex files into a net new pex without running another top level resolve against the new transitive closure of deps - which I thought was most of the runtime cost for the incremental build case.
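As a rough illustration of why a naive merge is not enough, the sketch below takes the pinned requirements of several pex files and reports projects pinned to more than one version. It assumes each pex can be summarized as a hypothetical project-to-version mapping; it does not read real PexInfo metadata and it does not perform a resolve. It only shows the check that would force one.

```python
from typing import Dict, Set


def find_pin_conflicts(pex_pins: Dict[str, Dict[str, str]]) -> Dict[str, Set[str]]:
    """Map each project pinned to more than one version to its set of versions."""
    versions: Dict[str, Set[str]] = {}
    for pins in pex_pins.values():
        for project, version in pins.items():
            # Compare project names case-insensitively.
            versions.setdefault(project.lower(), set()).add(version)
    return {project: seen for project, seen in versions.items() if len(seen) > 1}


# Hypothetical pins for two pex files on a PEX_PATH.
pex_pins = {
    "a.pex": {"requests": "2.18.4", "six": "1.11.0"},
    "b.pex": {"Requests": "2.9.1", "six": "1.11.0"},
}

conflicts = find_pin_conflicts(pex_pins)
print(conflicts)
```

Any non-empty result means the union of the inputs is not a consistent environment, so producing a single merged pex would require picking versions again across the combined transitive closure, which is the top-level resolve described above.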

kwlzn, Mar 21 '18 02:03

> what's the use case? just incremental binary goal build perf for pants?

Yeah, I was looking into pex resolve issues for an internal pants-plugin I was playing with (which no longer exists) and got sidetracked and learned some more about how pants does python.

> ultimately, it's not clear to me how to best combine 1..N pex files into a net new pex without running another top level resolve against the new transitive closure of deps - which I thought was most of the runtime cost for the incremental build case.

I actually wasn't quite clear on the mechanism being described here -- I think I can totally see how just extending the PEX_PATH could make that resolution process much longer (especially if it's already most of the runtime cost of incremental build, that's a fun fact).

> it's worth noting that PEX_PATH remains a "best effort" thing given the proclivity for conflicts between various pex files' 3rdparty requirements. IIRC, we effectively just bootstrap each pex in order and layer on the sys.path mutations. this seems fine for controlled runtime cases, but for sealed binary builds maybe less so?

I was originally thinking of "just" doing some of that "resolution process" at pex build time if possible -- but if the "resolution process" is just layering on sys.path mutations (I hadn't dived into the code yet and it's unclear what I assumed was happening instead), I definitely can't immediately see how to do that runtime resolve process at build time like I was thinking originally.

cosmicexplorer, Mar 21 '18 02:03

pantsbuild/pants#9516 and related work may allow us to close this issue soon in favor of allowing pants to do the job of merging PEX files, instead of baking that into PEX itself. Super exciting!

cosmicexplorer, Apr 11 '20 06:04

I reject this feature idea for all the reasons mentioned by @kwlzn. In essence, PEX_PATH is an incredibly sharp-edged hack. It doesn't need its edge honed further.

jsirois, Sep 28 '24 18:09