meson-python
meson-python copied to clipboard
ENH: add support for editable wheels
~~This is just an initial approach, that is known to not work on all use-cases. It simply uses .pth files to inject search locations for the default import finder, for this reason generated files that should be installed inside modules will not be available.~~
Signed-off-by: Filipe Laíns [email protected]
This is just an initial implementation, it will likely not be merged.
This new implementation uses import hooks, which is what I wanted to do initially. Since the complexity required to make the old approach work turned out to be a bit big, it makes more sense to do it like this from the start.
The idea is that we built on top of the Traversable protocol (EditablePath) and only trigger a rebuild when the data is accessed, this will delay rebuilds until the very last moment and prevent unnecessary rebuilds by just importing the module.
This is an interesting approach. Although I wonder how ergonomic it is. The main use of editable installations is during development. During development with compiled languages (which I think is the main target of meson-python) it is not unusual to do a recompilation just to check that the written code at least compiles. How would this work with approach? Also, does this still uses ephemeral build directories as the regular builds or does it default to a build directory inside the project folder? An example usage scenario would be great to have. Thank you!
During development with compiled languages (which I think is the main target of meson-python)
I'd say it is "Python projects containing compiled languages" - typically the majority of contributors (at least for large projects like NumPy, SciPy, Pandas) touch docs, tests, and pure Python code only, or at least way more often than native code.
I agree that when you're touching C/C++/Cython/Fortran, editable builds are less relevant. But I've had regular requests from collaborators for editable installs, in particular from those who work with IDEs.
The way we've set it up for SciPy now to work with Meson is that running the test suite always first rebuilds (necessary to ensure the out-of-place install is not out of date). This takes a fraction of a second, so avoiding rebuilds on imports isn't essential I'd say. (not sure it matters regarding code complexity here).
The idea is that we built on top of the Traversable protocol (
EditablePath) and only trigger a rebuild when the data is accessed, this will delay rebuilds until the very last moment and prevent unnecessary rebuilds by just importing the module.
The Python stdlib docs are extremely sparse as usual. I have to admit I don't immediately grasp how this all works, a sketch of what actually happens with this implementation when one runs pip install -e . and then import pkg would be useful.
I agree that when you're touching C/C++/Cython/Fortran, editable builds are less relevant.
I don't think that's the case. I do a lot of cross-project development where I need to test the changes in one project within another one. In this case editable installations are useful.
Wow, this is pretty cool. Thanks for putting up the PR @FFY00. I just have a couple quick clarifying questions.
For this to work, would every project would need to inject this hook(MesonpyFinder) into sys.meta_path in their init.py?
After discussing with @rgommers, we thought it would be better to simplify the initial approach. Instead of creating our own loader and reader, we will just intercept the import call with a custom finder, which won't change any of the import machinery and will simply trigger a rebuild + install to a dummy directory with we will make sure we add to sys.path. This means that the module would still be imported via the normal finder and loader. I think this will be easier to understand after you see the code.
For this to work, would every project would need to inject this hook(MesonpyFinder) into sys.meta_path in their init.py?
Nop! Our editable wheel will install a .pth file that will do that. This code is still missing from the PR.
Also, the custom finder will need to be in sys.meta_path before the module is imported, as its goal is to hook into the import process, and the module __init__.py gets run on import, so that wouldn't work :stuck_out_tongue:
Alright, I haven't tested it yet, but it's getting pretty late, so I should go. The code should now be complete, but I expect stuff to be broken -- as I said, I didn't do any testing.
Alright, this should now be finally working :tada:
The only thing is that it currently does not detect top-level native modules.
@lithomas1 want to give it a try?
Alright, this should now be finally working 🎉
The only thing is that currently does not detect top-level native modules.
@lithomas1 want to give it a try?
Nice this is pretty awesome 🚀 (It is especially cool that you can just edit .pyx files and have the changes sync up too).
The only issue I have is slow rebuilds. I attached time.time() calls to measure the speed of rebuilds and currently its' taking ~3-4 seconds for a regular rebuild, and ~34 seconds(!!) when I modified a .pyx file.
It's not a deal-breaker for me, but its probably something to think about in the future.
How does that compare to just running ninja directly? It shouldn't be taking much longer than that, and this is something we can't really do much about.
Using Traversables we should be able to avoid the extra install step, but that shouldn't take that long anyway. We could also try to optimize the import hook code, but I feel that will be insignificant compared to the build step.
Perhaps we should have a way to disable swallowing the Meson output, so that developers can see the log when Python starts up?
Where is that time being spent? If ninja itself is taking that long, then which commands? Is that a cython problem or what?
How does that compare to just running
ninjadirectly? It shouldn't be taking much longer than that, and this is something we can't really do much about. UsingTraversables we should be able to avoid the extra install step, but that shouldn't take that long anyway. We could also try to optimize the import hook code, but I feel that will be insignificant compared to the build step.Perhaps we should have a way to disable swallowing the Meson output, so that developers can see the log when Python starts up?
Installs seem to be the bottleneck. It seems to take ~ 3 secs. We are installing a lot of files, though (.mesonpy/editable/install is around 68 MB).
I think disabling meson output is the right step, otherwise, someone might control-c thinking python is hanging, when we're actually rebuilding, which could potentially mess up future builds.
The tests failure on macOS are due to this https://github.com/mesonbuild/meson-python/blob/6845f905747030813314e78a752b62c35134d44c/mesonpy/_tags.py#L96-L111 Probably triggered by the recent release of a Python 3.7 minor version. It was meant to happen. I'll look into adjusting the tests. And I just realized the comment is not entirely correct either. I'll fix that too.
Something is seriously wrong here. I tried this with SciPy (on Arch Linux, in my regular scipy-dev conda env), and here is what happened:
- Ran
pip install --no-build-isolation -e ., which completed fine cd ../..and then openipythonimport scipy, which took maybe 2 seconds, then looked at some functions (all fine)- inspect
scipy.__file__, which points to/dir/to/repo/.mesonpy/editable/install/.../scipy/__init__.py(looks good, also on disk) - exit
ipython, make a one line edit in the repo toscipy/__init__.py - restart
ipythonand typeimport scipy- this crashed all open applications on my machine!
:astonished:
I was able to reproduce, my CPU and RAM get pinned to 100%. My best guess is that during the rebuild, a python interpreter gets spawned, which somehow triggers a new rebuild? SciPy is a complex build, something there might be importing a scipy module.
something there might be importing a
scipymodule.
Yes, I think that does happen. In NumPy as well. Probably in other projects too - it's fairly common, and tends to work fine (it's a bit hacky of course, but it's hard not to do this when you have for example codegen components in your package that you also need at build time).
Still missing the tests.
I have the missing test now. While adding it, I discovered an issue with the top_level_modules implementation, which should be fixed now.
Ah, how nice... tests are failing on Debian because the site-packages path is mangled. I need to test that to see what bit of the code exactly is throwing it off, I will do that monday.
Okay, I have now fixed all the last conflicts from the new release, this should be ready to go in.
I'm with @eli-schwartz and @rgommers here. We should strive for minimal dependencies. If we can make this work with an API available on older Python versions, I would go that way.
I am planning to submit a followup PR for that, but right now I just wanted to get this merged, so that it is easier to make other changes.
This needs docs before landing, though, so I am going to block this.
What exactly are you looking for in the docs? The next big item to work on is reworking the docs, as discussed in #197, so I avoided adding things just to then replace them right away, also I really wanted to get this out as soon as possible so that we can get people testing it.
If you have a minimum set o requirements for what needs to be documented, I can add that, but I'd consider full extensive documentation regarding how this feature works to be out of scope for the PR right now.
This needs docs before landing, though, so I am going to block this.
What exactly are you looking for in the docs? The next big item to work on is reworking the docs, as discussed in #197, so I avoided adding things just to then replace them right away, also I really wanted to get this out as soon as possible so that we can get people testing it.
Its hard to test something without having documentation. Also, I think this is basically going to get zero visibility without any docs, so we shouldn't rush on releasing this, IMO.
As to what I'm looking for, I am mainly looking for how to configure MesonpyFinder( e.g. how to use the verbose option, what does "_MESONPY_EDITABLE_SKIP" do). I already know the answers to these just by reading the code, but for anyone not already familiar with meson-python this is going to be challenging.
I don't have time to test this this week, so I'm just going to assume that if it works for scipy, its probably going to work for pandas.
Its hard to test something without having documentation. Also, I think this is basically going to get zero visibility without any docs, so we shouldn't rush on releasing this, IMO.
I added some simple instructions, which I think should be enough for now.
As to what I'm looking for, I am mainly looking for how to configure
MesonpyFinder( e.g. how to use the verbose option, what does "_MESONPY_EDITABLE_SKIP" do). I already know the answers to these just by reading the code, but for anyone not already familiar with meson-python this is going to be challenging.
I mentioned the verbose option in the docs.
_MESONPY_EDITABLE_SKIP is private, it should not be used by users. We need it to use it internally to prevent rebuild loops.
One issue I just ran into is that I did an editable build of SciPy (which failed), then a git clean -xdf, then a non-editable build, and the regular build is now broken because scipy-editable-hook.pth got left behind. Questions:
- Should the
.pthfile only be installed right at the end, once we're sure that the build has succeeded? - Does
pip uninstall scipyclean up the.pthfile? (I understandgit clean -xdfwon't) - For
git clean -xdftype things, is it possible to give a very clear error message when the.pthfile gets left behind? So the user at least gets told to manually delete that file to get things working again.