flux-core
Python: flux bindings with multiple Python versions
A few users, including @davidbloss, are interested in using the python bindings for Flux from their own Python installations, rather than the Python that Flux was built with.
From a long flux-discussion-internal email conversation:
FWIW: We considered cython as one of the original options for setting up integration. It would have worked, but would not have allowed us to autogenerate nearly so much of the interface as the python integration currently does.
Doing multiple python versions from our existing setup is awkward, but shouldn’t actually require any meaningful changes for the flux python module. If someone goes in and configures, builds and installs flux with one python, then re-configures for a different python and builds and installs, the module produced the second time should be compatible with the first. We don’t actually build anything into flux itself that requires a specific python, BUT we build into the flux python command the paths to the chosen interpreter and similar to ensure that the python you get is the one we built a module for.
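To make the "built for a specific python" point concrete, here is a minimal sketch of checking whether a compiled extension file carries a suffix the running interpreter will import. The file name `_example` is hypothetical and is not the name of any flux module; an untagged `.so` passes the check because it carries no information either way.

```python
import importlib.machinery
import os
import sys


def built_for_this_python(extension_path: str) -> bool:
    """Return True if the file's suffix is one this interpreter accepts.

    Everything after the first dot in the file name is treated as the
    suffix, e.g. ".cpython-311-x86_64-linux-gnu.so" or ".abi3.so".
    """
    filename = os.path.basename(extension_path)
    _, dot, rest = filename.partition(".")
    return bool(dot) and (dot + rest) in importlib.machinery.EXTENSION_SUFFIXES


# Show which interpreter is running and which suffixes it will import.
print(sys.executable)
print(importlib.machinery.EXTENSION_SUFFIXES)
# Hypothetical module name, tagged for a python that does not exist.
print(built_for_this_python("_example.cpython-00-none.so"))
```

Running this under each candidate interpreter shows quickly whether a module built by one python is even eligible to be imported by another.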
The last time Stephen and I discussed how to handle this better, we came down to basically these options:
- Separate the python module build from the rest of flux, either:
  - Make its auto tools build independent so we can re-configure just that component and rebuild it, this is probably the least work to make this easier, but it’s also by far the least useful.
  - Finish setting up the setup.py build so that pip and friends can build a flux python module. This is much more useful in that it means users can build or download a tree from pip or similar and pop it in a virtualenv, it’s by far the better option for ergonomics, but it’s a larger rework. It also means that either the flux build process will need to export a special pseudo-header so that the module build can pick it up, or we have to ship enough with the build to be able to generate it from installed headers (this is not trivial).
- Re-work the python support to use something other than the base cffi stuff (this is what cython would take, don’t do this unless someone has at least two man-months to spend on it and nothing else)
I seriously think option 1.2 is the right one, but it’s not a completely trivial transition. If someone with some python chops has some cycles to spend on it, I could probably hand over some half-finished work on it with some pointers in the short term, not sure when I would be able to make time to actually do that whole job.
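For readers unfamiliar with the "pseudo-header" idea: cffi-style declaration parsing generally cannot digest raw preprocessor directives, so a header has to be reduced to plain declarations first. The sketch below only illustrates the shape of that step and is not the flux build's actual procedure. The function name and the sample header are hypothetical, and a real header would also need include resolution, macro expansion and conditional compilation handled, which is the non-trivial part.

```python
def make_pseudo_header(header_text: str) -> str:
    """Drop preprocessor directives (and their continuation lines)."""
    kept = []
    in_directive = False
    for line in header_text.splitlines():
        stripped = line.strip()
        if in_directive or stripped.startswith("#"):
            # A trailing backslash continues the directive onto the next line.
            in_directive = stripped.endswith("\\")
            continue
        if stripped:
            kept.append(line)
    return "\n".join(kept)


# Hypothetical header, not taken from flux.
SAMPLE_HEADER = """\
#ifndef EXAMPLE_H
#define EXAMPLE_H
#include <stdint.h>
#define EXAMPLE_NEXT(a) \\
    ((a) + 1)
int example_add (int a, int b);
#endif
"""

print(make_pseudo_header(SAMPLE_HEADER))
```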
The problem is if someone wants to run flux inside the vendored version of python from visit, or more realistically the in-tree version they vendor in ATS for some reason.
Is this because the vendored version of python doesn’t have all of the python module dependencies of Flux? If we document our python dependencies, will they be able to install them into their vendored python to satisfy these dependencies?
It’s not that really, it’s that we can’t set anything up to reasonably do that for them. If an end user has to find a pre-built copy of the flux source tree, re-configure it, and build a specific subcomponent to install it, we’ve lost.
BTW, one significant problem I saw was with conda, as it is not only a python package manager but also a binary package manager.
A conda virtual environment can easily bring in our dependent binary packages (e.g., ZeroMQ libraries) that are not ABI compatible with what Flux was built against, and that broke Flux.
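One way to see whether an environment has substituted its own copy of a library is to list the shared objects actually mapped into the process. This is a rough, Linux-only sketch that reads `/proc/self/maps`. The helper name is made up for illustration, and `"zmq"` is just an example fragment to search for.

```python
from pathlib import Path
from typing import List


def loaded_libraries(name_fragment: str, maps_path: str = "/proc/self/maps") -> List[str]:
    """Return sorted paths of mapped files whose file name contains name_fragment."""
    try:
        text = Path(maps_path).read_text()
    except OSError:
        # Not Linux, or /proc is unavailable: nothing to report.
        return []
    paths = set()
    for line in text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional pathname.
        fields = line.split(maxsplit=5)
        if len(fields) == 6 and name_fragment in Path(fields[5]).name:
            paths.add(fields[5])
    return sorted(paths)


# After importing the bindings, a path under the conda prefix here would
# suggest the environment's library was picked up instead of the system one.
print(loaded_libraries("zmq"))
```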
Yeah, it can, and it tends to want to pull in everything from its own packages so it’s going to break stuff. So is pip to some extent, though less so since we could make it a source package and not link it to binaries for flux outside the system. In spack it could be set up to build flux and everything by default, but with a py-flux package to build the python module based on flux, supporting an external build of flux through the externals mechanism. That might be a comparatively painless way to make this easier come to think of it.
If the flux versions are ABI compatible (flux’s ABI has on average been very stable, kudos to all on that by the way), the python module should actually work with multiple flux installs as long as it loads the right libflux.so and is built for the right python. There are tricks to make this more true too actually, for example: https://github.com/pypa/manylinux . The manylinux project in particular makes it possible to build a binary wheel that pretty much targets all linux distros on a given architecture, and makes it easy to build that wheel for all meaningful versions of python. It’s possible, if annoying, to build the module portably for every python that matters and make it available.
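For context on what "build that wheel for all meaningful versions of python" means in practice: each binary wheel encodes its python, ABI and platform tags in its file name, so one file per python version is needed unless the stable ABI is used. Here is a small sketch of reading those tags, following the standard wheel file name layout. The file name below is a hypothetical example, not a published flux artifact.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    python: str
    abi: str
    platform: str


def wheel_tags(filename: str) -> WheelTags:
    """Extract the python, ABI and platform tags from a wheel file name.

    Layout: {name}-{version}[-{build}]-{python}-{abi}-{platform}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return WheelTags(*parts[-3:])


# Hypothetical file name for illustration only.
EXAMPLE = "example_pkg-1.0.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
tags = wheel_tags(EXAMPLE)
# A platform tag may list several compatible platforms separated by dots.
print(tags.python, tags.abi, tags.platform.split("."))
```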
As to your question grondo, they avoid the system version because none of the modules they need are installed in the system python’s environment. It’s all in the TCE versions, or a version vendored with something else. If TOSS and TCE would work together to get rid of some of that separation, that problem could go away.
As to version compatibility, the answer has changed over the years. For the most part no, they aren’t necessarily compatible, even within a given python version it’s easy to build a version of python that’s binary incompatible with another (RHEL’s version is built so as to be incompatible with almost everything, so integrating with it and a TCE one at the same time is almost impossible, thanks RedHat!). That said, they’re working on it: https://peps.python.org/pep-0387/ . Listing of current stable C API surface that could allow flux itself to embed python without bonding to a specific version: https://docs.python.org/3/c-api/stable.html .
A more limited but related issue is #2883. See also https://github.com/flux-framework/flux-core/issues/3929#issuecomment-1080926795
@jameshcorbett when you say:
> I seriously think option 1.2 is the right one, but it’s not a completely trivial transition. If someone with some python chops has some cycles to spend on it, I could probably hand over some half-finished work on it with some pointers in the short term, not sure when I would be able to make time to actually do that whole job.
And 1.2 is in reference to:
> Make its auto tools build independent so we can re-configure just that component and rebuild it, this is probably the least work to make this easier, but it’s also by far the least useful.
"Independent" as in - in its own repository? It would be nice to be able to more easily discover the source code, and package python-specific assets with it. This also seems like a challenge:
> BUT we build into the flux python command the paths to the chosen interpreter and similar to ensure that the python you get is the one we built a module for.
I have Python chops but not as much C++ chops, so possibly this is something I could help with, given pointers. My 0.02 is that maintaining the flux python package might be easier if it's done cleanly in its own repository, the reason being that whenever I come looking for it I can't find it easily. I also think a lot of people aren't going to use / want to use spack, so while that could help some subset of users it won't be the ideal solution. I'm willing to try some things if folks want to continue discussion and narrow down a plan.
> My 0.02 is that maintaining the flux python package might be easier if it's done cleanly in its own repository, the reason being that whenever I come looking for it I can't find it easily. I also think a lot of people aren't going to use / want to use spack, so while that could help some subset of users it won't be the ideal solution. I'm willing to try some things if folks want to continue discussion and narrow down a plan.
Many core commands and utilities are built using Python, so for better or worse there is a circular dependency between the Python bindings and flux-core. This will be easiest to maintain if we keep them in the same repo.
However, it would seem feasible to have a project that can build python bindings for any Python/flux-core version combination by downloading the targeted flux-core release, configuring and building the whole thing, but only doing make install from the src/bindings/python directory. That may not be ideal (and perhaps it won't work at all for things like pip? Beyond my ken). Sorry if these thoughts are unhelpful.
Yep - I was thinking a submodule or just clone would bring them together again, but I don't understand how it works well enough to say for sure.
@vsoch those weren't my words, I was quoting from and summarizing a variety of emails, and the formatting got a bit messed up. Options 1-4 should have been options 1, 1a, 1b and 2, and "1.2" should have been "1b". Sorry. I'll forward you the email thread so you can see.
In short though 1.2 would be in reference to:
> Finish setting up the setup.py build so that pip and friends can build a flux python module. This is much more useful in that it means users can build or download a tree from pip or similar and pop it in a virtualenv, it’s by far the better option for ergonomics, but it’s a larger rework. It also means that either the flux build process will need to export a special pseudo-header so that the module build can pick it up, or we have to ship enough with the build to be able to generate it from installed headers (this is not trivial).
@jameshcorbett to follow up here, I think this should work now with our bindings: https://pypi.org/project/flux-python/. As long as they choose the right version of the bindings to match the version of flux installed, they can install into a separate environment.
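A quick sanity check for "the right version of the bindings" could look like the sketch below. It assumes that matching means the leading `major.minor.patch` numbers are equal, which is my reading of the comment above and not a documented rule. The function names and version strings are made up for illustration.

```python
import re
from typing import Tuple


def release_tuple(version: str) -> Tuple[int, int, int]:
    """Parse the leading major.minor.patch numbers of a version string."""
    match = re.match(r"\s*v?(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def bindings_match(flux_core_version: str, bindings_version: str) -> bool:
    """Assumption: bindings and flux-core match when their releases are equal."""
    return release_tuple(flux_core_version) == release_tuple(bindings_version)


# Illustrative version strings only.
print(bindings_match("0.48.0", "0.48.0"))
print(bindings_match("0.48.0", "0.47.0"))
```

The installed bindings version can be read with `importlib.metadata.version("flux-python")`. The flux-core version would have to come from the flux installation itself.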
Sounds excellent!
Closing as completed, I guess.