pip editable mode broken

Open fhoering opened this issue 5 years ago • 4 comments

Moving to flit and deleting setup.py in this commit https://github.com/pantsbuild/pex/commit/1c869f759a8202e45b54d98e0f3a242144be38d1 has broken pip editable mode. I suppose this is by design, but juggling PYTHONPATH is complicated when debugging issues across multiple consuming projects.

One detail that doesn't make things easier is that pex now launches pip as a subprocess and invokes the embedded (vendored) setuptools version with a different PYTHONPATH set.
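To illustrate why the subprocess makes debugging harder, here is a minimal sketch (not Pex's actual code) of a child interpreter started with its own PYTHONPATH. The helper name `child_sys_path` is hypothetical; it only shows that the child resolves imports from a `sys.path` that can differ from the parent's.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(python_path):
    """Return sys.path as seen by a child interpreter started with PYTHONPATH=python_path.

    Hypothetical helper for illustration only; Pex's real subprocess setup differs.
    """
    env = dict(os.environ)
    env["PYTHONPATH"] = python_path
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    extra = tempfile.mkdtemp()
    parent = {os.path.realpath(p) for p in sys.path}
    # Entries the child sees that the parent process does not.
    for entry in child_sys_path(extra):
        if os.path.realpath(entry) not in parent:
            print("child only:", entry)
```

Running the same comparison against the environment a tool hands to its subprocess is one way to see which copy of a package the child will actually import.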

Any chance to bring back pip editable mode?

Any guidance on how to debug the whole thing now? (I spent a day debugging an issue with pex 2.0, where I invoke pex from a repo that has a setup.py, without really understanding the issue.)

fhoering, Dec 23 '19 19:12

The transition away from a legacy setup.py build was definitely by design. I've never used pip editable mode so I probably can't provide great guidance. I know when debugging PEX as used by Pants, I'll often hack on the PEX sources in Pants' venv.

jsirois, Dec 23 '19 21:12

As an aside, having to use editable mode implies a tight API coupling. So far, PEX has generally assumed its only close API consumer was Pants. I'd be interested to know how your project(s) consume PEX APIs.

jsirois, Dec 23 '19 21:12

We have now isolated all the pex-related stuff in a separate repo: https://github.com/criteo/cluster-pack. This is still WIP for the moment.

We consume this with https://github.com/criteo/tf-yarn and pyspark (currently not open source).

I care about editable mode because it is a really cool feature when uploading stuff to 100 nodes in the cluster. The basic idea is to generate one pex file with all the dependencies and upload it to distributed storage, while everything that is being "developed" is uploaded every time. Basically this just acts like caching, but generating a 500 MB pex file takes some time. This also works (worked) with 3rd party dependencies like mlflow, skein, pex.
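The caching idea can be sketched roughly as follows. This is not how cluster-pack is implemented; `pex_cache_key` and `needs_rebuild` are hypothetical names, and the assumption is simply that the large pex only has to be rebuilt and re-uploaded when the set of pinned dependencies changes.

```python
import hashlib


def pex_cache_key(pinned_requirements):
    """Derive a stable file name from the pinned dependency set (hypothetical scheme)."""
    # Sort and strip so that ordering and stray whitespace do not change the key.
    normalized = sorted(r.strip() for r in pinned_requirements if r.strip())
    digest = hashlib.sha256("\n".join(normalized).encode("utf-8")).hexdigest()
    return f"env-{digest[:16]}.pex"


def needs_rebuild(pinned_requirements, already_uploaded):
    """True when no pex for this exact dependency set is in distributed storage yet."""
    return pex_cache_key(pinned_requirements) not in already_uploaded


if __name__ == "__main__":
    reqs = ["mlflow==1.5.0", "skein==0.8.0"]
    uploaded = {pex_cache_key(reqs)}
    print(needs_rebuild(reqs, uploaded))                     # unchanged dependencies
    print(needs_rebuild(reqs + ["pex==2.0.3"], uploaded))    # a dependency was added
```

Packages under development would stay out of the key and be shipped separately on every run, which is where editable installs come in.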

Directly hacking the virtual environment is possible, of course, but a clean solution would be nicer. Hacking is also very difficult when the whole thing is shipped to the cluster and the pipeline is built around it. We have already had issues that were only reproducible on the cluster. Finding the issue quickly, and not in days, is very important here.

Most of the ideas are written here: https://medium.com/criteo-labs/packaging-code-with-pex-a-pyspark-example-9057f9f144f3

We also plan to use this to bring pex support to mlflow projects, for example, which currently only support conda.

Currently most data science tooling uses conda, but we think that Python's default tooling (PyPI, wheels, self-contained executables) is good enough to provide the same features. The self-contained executable zip idea is also much better than the conda approach, where one has to zip the whole env manually, unzip it on the cluster, and then call the embedded interpreter there.

My first try at updating pex is here: https://github.com/fhoering/cluster-pack/tree/update_pex. I need to add integration/unit tests and investigate further to make sure there are no issues, because we use this in prod now and can't update easily anymore.

fhoering, Dec 27 '19 09:12

It is nevertheless sad to see that many projects bring improvements on one side while other stuff stays broken: https://github.com/pypa/pip/issues/6605

fhoering, Dec 27 '19 09:12

This has been fixed by the passage of time and the resolution of the Pip issue you pointed to. Pex currently uses hatch as its build system, but that doesn't really matter with modern Pip:

:; mkdir /tmp/test
:; cd /tmp/test
:; python -mvenv .venv
:; source .venv/bin/activate

# Install modern Pip:
:; pip install -U pip
Requirement already satisfied: pip in ./.venv/lib/python3.11/site-packages (24.0)
Collecting pip
  Using cached pip-24.2-py3-none-any.whl.metadata (3.6 kB)
Using cached pip-24.2-py3-none-any.whl (1.8 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 24.0
    Uninstalling pip-24.0:
      Successfully uninstalled pip-24.0
Successfully installed pip-24.2

# Succeed with editable mode install of non-setuptools Pex project:
:; pip install -e ~/dev/pex-tool/pex
Obtaining file:///home/jsirois/dev/pex-tool/pex
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Installing backend dependencies ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: pex
  Building editable for pex (pyproject.toml) ... done
  Created wheel for pex: filename=pex-2.19.0-py2.py3-none-any.whl size=8122 sha256=7563fd322e58f9a970b128ee7080dcab5e31d3ac49e2b5104e7d872d2f92445c
  Stored in directory: /tmp/pip-ephem-wheel-cache-2jt7pw9k/wheels/4f/db/c6/0aa36a0a84278af0eefc560c621961a202b407ddd5f69887c0
Successfully built pex
Installing collected packages: pex
Successfully installed pex-2.19.0

# Now edit the Pex sources:
:; vi ~/dev/pex-tool/pex/pex/version.py
:; GIT_WORK_TREE=~/dev/pex-tool/pex GIT_DIR=~/dev/pex-tool/pex/.git git diff
diff --git a/pex/version.py b/pex/version.py
index 95550c4e..f6712e6a 100644
--- a/pex/version.py
+++ b/pex/version.py
@@ -1,4 +1,4 @@
 # Copyright 2015 Pex project contributors.
 # Licensed under the Apache License, Version 2.0 (see LICENSE).

-__version__ = "2.19.0"
+__version__ = "2.19.0-hello!"

# And profit:
:; pex -V
2.19.0-hello!
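
For consuming projects that want to check programmatically whether a distribution was installed in editable mode, here is a small sketch. It assumes the installer recorded a PEP 610 `direct_url.json` file, which modern Pip does for editable installs; the function names are hypothetical.

```python
import json
from importlib import metadata


def direct_url_is_editable(raw):
    """Interpret the text of a PEP 610 direct_url.json file (None if the file is absent)."""
    if raw is None:
        return False
    info = json.loads(raw)
    return bool(info.get("dir_info", {}).get("editable", False))


def is_editable(dist_name):
    """Return True/False for an installed distribution, or None if it is not installed."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return direct_url_is_editable(dist.read_text("direct_url.json"))


if __name__ == "__main__":
    # The result depends on how (and whether) pex is installed in this environment.
    print("pex editable:", is_editable("pex"))
```

Installs made by tools that do not write `direct_url.json` are reported as not editable, so treat a `False` here as a hint rather than proof.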

jsirois, Sep 15 '24 06:09