meson-python icon indicating copy to clipboard operation
meson-python copied to clipboard

ENH: add support for editable wheels

Open FFY00 opened this issue 3 years ago • 23 comments

~~This is just an initial approach, that is known to not work on all use-cases. It simply uses .pth files to inject search locations for the default import finder, for this reason generated files that should be installed inside modules will not be available.~~

Signed-off-by: Filipe Laíns [email protected]

FFY00 avatar Jun 22 '22 17:06 FFY00

This is just an initial implementation, it will likely not be merged.

FFY00 avatar Jun 22 '22 17:06 FFY00

This new implementation uses import hooks, which is what I wanted to do initially. Since the complexity required to make the old approach work turned out to be a bit big, it makes more sense to do it like this from the start.

The idea is that we built on top of the Traversable protocol (EditablePath) and only trigger a rebuild when the data is accessed, this will delay rebuilds until the very last moment and prevent unnecessary rebuilds by just importing the module.

FFY00 avatar Nov 25 '22 08:11 FFY00

This is an interesting approach. Although I wonder how ergonomic it is. The main use of editable installations is during development. During development with compiled languages (which I think is the main target of meson-python) it is not unusual to do a recompilation just to check that the written code at least compiles. How would this work with approach? Also, does this still uses ephemeral build directories as the regular builds or does it default to a build directory inside the project folder? An example usage scenario would be great to have. Thank you!

dnicolodi avatar Nov 25 '22 08:11 dnicolodi

During development with compiled languages (which I think is the main target of meson-python)

I'd say it is "Python projects containing compiled languages" - typically the majority of contributors (at least for large projects like NumPy, SciPy, Pandas) touch docs, tests, and pure Python code only, or at least way more often than native code.

I agree that when you're touching C/C++/Cython/Fortran, editable builds are less relevant. But I've had regular requests from collaborators for editable installs, in particular from those who work with IDEs.

rgommers avatar Nov 25 '22 09:11 rgommers

The way we've set it up for SciPy now to work with Meson is that running the test suite always first rebuilds (necessary to ensure the out-of-place install is not out of date). This takes a fraction of a second, so avoiding rebuilds on imports isn't essential I'd say. (not sure it matters regarding code complexity here).

rgommers avatar Nov 25 '22 09:11 rgommers

The idea is that we built on top of the Traversable protocol (EditablePath) and only trigger a rebuild when the data is accessed, this will delay rebuilds until the very last moment and prevent unnecessary rebuilds by just importing the module.

The Python stdlib docs are extremely sparse as usual. I have to admit I don't immediately grasp how this all works, a sketch of what actually happens with this implementation when one runs pip install -e . and then import pkg would be useful.

rgommers avatar Nov 25 '22 09:11 rgommers

I agree that when you're touching C/C++/Cython/Fortran, editable builds are less relevant.

I don't think that's the case. I do a lot of cross-project development where I need to test the changes in one project within another one. In this case editable installations are useful.

dnicolodi avatar Nov 25 '22 10:11 dnicolodi

Wow, this is pretty cool. Thanks for putting up the PR @FFY00. I just have a couple quick clarifying questions.

For this to work, would every project would need to inject this hook(MesonpyFinder) into sys.meta_path in their init.py?

lithomas1 avatar Nov 28 '22 16:11 lithomas1

After discussing with @rgommers, we thought it would be better to simplify the initial approach. Instead of creating our own loader and reader, we will just intercept the import call with a custom finder, which won't change any of the import machinery and will simply trigger a rebuild + install to a dummy directory with we will make sure we add to sys.path. This means that the module would still be imported via the normal finder and loader. I think this will be easier to understand after you see the code.

FFY00 avatar Nov 28 '22 19:11 FFY00

For this to work, would every project would need to inject this hook(MesonpyFinder) into sys.meta_path in their init.py?

Nop! Our editable wheel will install a .pth file that will do that. This code is still missing from the PR.

Also, the custom finder will need to be in sys.meta_path before the module is imported, as its goal is to hook into the import process, and the module __init__.py gets run on import, so that wouldn't work :stuck_out_tongue:

FFY00 avatar Nov 28 '22 19:11 FFY00

Alright, I haven't tested it yet, but it's getting pretty late, so I should go. The code should now be complete, but I expect stuff to be broken -- as I said, I didn't do any testing.

FFY00 avatar Nov 30 '22 03:11 FFY00

Alright, this should now be finally working :tada:

The only thing is that it currently does not detect top-level native modules.

@lithomas1 want to give it a try?

FFY00 avatar Nov 30 '22 19:11 FFY00

Alright, this should now be finally working 🎉

The only thing is that currently does not detect top-level native modules.

@lithomas1 want to give it a try?

Nice this is pretty awesome 🚀 (It is especially cool that you can just edit .pyx files and have the changes sync up too).

The only issue I have is slow rebuilds. I attached time.time() calls to measure the speed of rebuilds and currently its' taking ~3-4 seconds for a regular rebuild, and ~34 seconds(!!) when I modified a .pyx file.

It's not a deal-breaker for me, but its probably something to think about in the future.

lithomas1 avatar Nov 30 '22 21:11 lithomas1

How does that compare to just running ninja directly? It shouldn't be taking much longer than that, and this is something we can't really do much about. Using Traversables we should be able to avoid the extra install step, but that shouldn't take that long anyway. We could also try to optimize the import hook code, but I feel that will be insignificant compared to the build step.

Perhaps we should have a way to disable swallowing the Meson output, so that developers can see the log when Python starts up?

FFY00 avatar Nov 30 '22 22:11 FFY00

Where is that time being spent? If ninja itself is taking that long, then which commands? Is that a cython problem or what?

eli-schwartz avatar Nov 30 '22 22:11 eli-schwartz

How does that compare to just running ninja directly? It shouldn't be taking much longer than that, and this is something we can't really do much about. Using Traversables we should be able to avoid the extra install step, but that shouldn't take that long anyway. We could also try to optimize the import hook code, but I feel that will be insignificant compared to the build step.

Perhaps we should have a way to disable swallowing the Meson output, so that developers can see the log when Python starts up?

Installs seem to be the bottleneck. It seems to take ~ 3 secs. We are installing a lot of files, though (.mesonpy/editable/install is around 68 MB).

I think disabling meson output is the right step, otherwise, someone might control-c thinking python is hanging, when we're actually rebuilding, which could potentially mess up future builds.

lithomas1 avatar Nov 30 '22 23:11 lithomas1

The tests failure on macOS are due to this https://github.com/mesonbuild/meson-python/blob/6845f905747030813314e78a752b62c35134d44c/mesonpy/_tags.py#L96-L111 Probably triggered by the recent release of a Python 3.7 minor version. It was meant to happen. I'll look into adjusting the tests. And I just realized the comment is not entirely correct either. I'll fix that too.

dnicolodi avatar Dec 08 '22 10:12 dnicolodi

Something is seriously wrong here. I tried this with SciPy (on Arch Linux, in my regular scipy-dev conda env), and here is what happened:

  1. Ran pip install --no-build-isolation -e ., which completed fine
  2. cd ../.. and then open ipython
  3. import scipy, which took maybe 2 seconds, then looked at some functions (all fine)
  4. inspect scipy.__file__, which points to /dir/to/repo/.mesonpy/editable/install/.../scipy/__init__.py (looks good, also on disk)
  5. exit ipython, make a one line edit in the repo to scipy/__init__.py
  6. restart ipython and type import scipy - this crashed all open applications on my machine!

rgommers avatar Dec 08 '22 11:12 rgommers

:astonished:

I was able to reproduce, my CPU and RAM get pinned to 100%. My best guess is that during the rebuild, a python interpreter gets spawned, which somehow triggers a new rebuild? SciPy is a complex build, something there might be importing a scipy module.

FFY00 avatar Dec 08 '22 20:12 FFY00

something there might be importing a scipy module.

Yes, I think that does happen. In NumPy as well. Probably in other projects too - it's fairly common, and tends to work fine (it's a bit hacky of course, but it's hard not to do this when you have for example codegen components in your package that you also need at build time).

rgommers avatar Dec 08 '22 20:12 rgommers

Still missing the tests.

FFY00 avatar Dec 09 '22 01:12 FFY00

I have the missing test now. While adding it, I discovered an issue with the top_level_modules implementation, which should be fixed now.

FFY00 avatar Dec 10 '22 04:12 FFY00

Ah, how nice... tests are failing on Debian because the site-packages path is mangled. I need to test that to see what bit of the code exactly is throwing it off, I will do that monday.

FFY00 avatar Dec 10 '22 04:12 FFY00

Okay, I have now fixed all the last conflicts from the new release, this should be ready to go in.

FFY00 avatar Dec 22 '22 14:12 FFY00

I'm with @eli-schwartz and @rgommers here. We should strive for minimal dependencies. If we can make this work with an API available on older Python versions, I would go that way.

I am planning to submit a followup PR for that, but right now I just wanted to get this merged, so that it is easier to make other changes.

FFY00 avatar Dec 22 '22 15:12 FFY00

This needs docs before landing, though, so I am going to block this.

What exactly are you looking for in the docs? The next big item to work on is reworking the docs, as discussed in #197, so I avoided adding things just to then replace them right away, also I really wanted to get this out as soon as possible so that we can get people testing it.

If you have a minimum set o requirements for what needs to be documented, I can add that, but I'd consider full extensive documentation regarding how this feature works to be out of scope for the PR right now.

FFY00 avatar Dec 22 '22 16:12 FFY00

This needs docs before landing, though, so I am going to block this.

What exactly are you looking for in the docs? The next big item to work on is reworking the docs, as discussed in #197, so I avoided adding things just to then replace them right away, also I really wanted to get this out as soon as possible so that we can get people testing it.

Its hard to test something without having documentation. Also, I think this is basically going to get zero visibility without any docs, so we shouldn't rush on releasing this, IMO.

As to what I'm looking for, I am mainly looking for how to configure MesonpyFinder( e.g. how to use the verbose option, what does "_MESONPY_EDITABLE_SKIP" do). I already know the answers to these just by reading the code, but for anyone not already familiar with meson-python this is going to be challenging.

lithomas1 avatar Dec 22 '22 16:12 lithomas1

I don't have time to test this this week, so I'm just going to assume that if it works for scipy, its probably going to work for pandas.

lithomas1 avatar Dec 22 '22 16:12 lithomas1

Its hard to test something without having documentation. Also, I think this is basically going to get zero visibility without any docs, so we shouldn't rush on releasing this, IMO.

I added some simple instructions, which I think should be enough for now.

As to what I'm looking for, I am mainly looking for how to configure MesonpyFinder( e.g. how to use the verbose option, what does "_MESONPY_EDITABLE_SKIP" do). I already know the answers to these just by reading the code, but for anyone not already familiar with meson-python this is going to be challenging.

I mentioned the verbose option in the docs.

_MESONPY_EDITABLE_SKIP is private, it should not be used by users. We need it to use it internally to prevent rebuild loops.

FFY00 avatar Dec 22 '22 17:12 FFY00

One issue I just ran into is that I did an editable build of SciPy (which failed), then a git clean -xdf, then a non-editable build, and the regular build is now broken because scipy-editable-hook.pth got left behind. Questions:

  • Should the .pth file only be installed right at the end, once we're sure that the build has succeeded?
  • Does pip uninstall scipy clean up the .pth file? (I understand git clean -xdf won't)
  • For git clean -xdf type things, is it possible to give a very clear error message when the .pth file gets left behind? So the user at least gets told to manually delete that file to get things working again.

rgommers avatar Dec 22 '22 18:12 rgommers