[WIP]: Updates for Pyodide builds after `pyodide-build` was unvendored
[!Tip] This PR is currently a work in progress and is not yet ready for review. However, feedback is welcome. The text below is a stub and will be updated with code changes.
This PR updates the Pyodide build procedure (see #1456) that is enabled with CIBW_PLATFORM: "pyodide" (or with the --platform pyodide CLI equivalent) post the changes in https://github.com/pyodide/pyodide/pull/4882, where pyodide/pyodide-build was unvendored from the main Pyodide repository to accommodate faster updates and fixes.
This means that the Pyodide version and pyodide-build are not going to be in sync going forward, and that the Pyodide xbuildenv to install must be inferred by the versions available to install by pyodide-build through a recently added pyodide xbuildenv search command, which prints out this table:
Starting new HTTPS connection (1): raw.githubusercontent.com:443
https://raw.githubusercontent.com:443 "GET /pyodide/pyodide/main/pyodide-cross-build-environments.json HTTP/11" 200 917
Version    Python     Emscripten pyodide-build             Compatible
---------- ---------- ---------- ------------------------- ----------
0.27.0a2   3.12.1     3.1.58     0.26.0 -                  Yes
0.26.2     3.12.1     3.1.58     0.26.0 -                  Yes
0.26.1     3.12.1     3.1.58     0.26.0 -                  Yes
0.26.0     3.12.1     3.1.58     0.26.0 -                  Yes
Alternatively, one may use pyodide xbuildenv search --all to return both compatible and non-compatible versions. This would, however, be better received as JSON (please see https://github.com/pyodide/pyodide-build/pull/28).
Additionally, in this PR, I would like to implement support for installing arbitrary Pyodide versions, or, more specifically, arbitrary Pyodide xbuildenv versions – though, only the ones that are supported for a given pyodide-build version. This could be done through an environment variable PYODIDE_XBUILDENV_VERSION (or, PYODIDE_VERSION because it's shorter) and an associated configuration variable in the schema. It would therefore be great to get the above table in machine-readable code to validate it inside cibuildwheel/pyodide.py – for which I opened https://github.com/pyodide/pyodide-build/issues/26 and I'm working on that right now. The rationale behind this is that WebAssembly/Pyodide builds are already experimental anyway, and it would be useful to not tie the available Pyodide version to the cibuildwheel version – this would be helpful for downstream projects (https://github.com/statsmodels/statsmodels/pull/9343, https://github.com/scikit-image/scikit-image/pull/7525, https://github.com/scikit-learn/scikit-learn/pull/29791/, and so on) to allow testing against Pyodide alpha releases and/or for the case of reproducibility against Pyodide's older releases.
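For illustration, here is a minimal sketch of the validation step described above. It assumes that the JSON form of `pyodide xbuildenv search` is an object with an `environments` list whose entries carry `version`, `python`, `emscripten` and `compatible` keys; that schema is an assumption here, not a confirmed pyodide-build format. The sample data is copied from the table above, and `validate_xbuildenv_version` is a hypothetical helper name, not existing cibuildwheel code.

```python
import json

# Sample payload mirroring the table above. The key names are assumed.
SAMPLE_SEARCH_OUTPUT = json.dumps(
    {
        "environments": [
            {"version": "0.27.0a2", "python": "3.12.1", "emscripten": "3.1.58", "compatible": True},
            {"version": "0.26.2", "python": "3.12.1", "emscripten": "3.1.58", "compatible": True},
            {"version": "0.26.1", "python": "3.12.1", "emscripten": "3.1.58", "compatible": True},
            {"version": "0.26.0", "python": "3.12.1", "emscripten": "3.1.58", "compatible": True},
        ]
    }
)


def validate_xbuildenv_version(search_json: str, requested: str) -> dict:
    """Return the entry for `requested`; raise ValueError if unknown or incompatible."""
    environments = json.loads(search_json)["environments"]
    by_version = {env["version"]: env for env in environments}
    if requested not in by_version:
        known = ", ".join(sorted(by_version))
        raise ValueError(f"Unknown Pyodide xbuildenv version {requested!r}; available: {known}")
    env = by_version[requested]
    if not env["compatible"]:
        raise ValueError(
            f"Pyodide xbuildenv {requested!r} is not compatible with the installed pyodide-build"
        )
    return env
```

A caller could then pass the value of the proposed environment variable, e.g. `validate_xbuildenv_version(output, "0.26.2")`, and fail early with a readable message instead of letting the xbuildenv install fail later.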
cc: @hoodmane and @ryanking13 for visibility
The Windows test failures are unrelated. I'll try to fix them later in the day, but happy to step back if someone else does it before me, or wishes to.
The failing CircleCI-Linux-Python312 test should be unrelated.
Generally looks good to me. I'd like to see:
- pyodide_version override handled like all the other options
- an override for pyodide_build_version
I'll check 1) in a bit, but for 2), we can do this with the current structure and use a newer pyodide-build version via the CIBW_DEPENDENCY_VERSIONS option (or the more specific CIBW_DEPENDENCY_VERSIONS_PYODIDE) by supplying a custom constraints file. Is this what you intended?
Also, I think I should add a test workflow (or another job through a matrix) that builds Pyodide wheels with the new CIBW_PYODIDE_VERSION variable set so that the functionality remains intact.
Wild idea (probably), but what about allowing the version to be part of the platform? So --platform pyodide would be the default version, but --platform pyodide-0.27.0a2 would be allowed, too. This would tie the pyodide-specific setting to the pyodide platform only. But this would not allow you to require a specific version inside pyproject.toml, I guess. Thoughts?
RE: an override for pyodide_build_version
Can't this be done like overriding any other constrained package? You could set the versions to latest, or even pip install a specific version in before-build?
Wild idea (probably), but what about allowing the version to be part of the platform? So --platform pyodide would be the default version, but --platform pyodide-0.27.0a2 would be allowed, too. This would tie the pyodide-specific setting to the pyodide platform only. But this would not allow you to require a specific version inside pyproject.toml, I guess. Thoughts?
I would be fine with implementing this, but in principle (and to be slightly pedantic), "pyodide" would generally be the platform and "0.27.0a2" would be the version, so mixing the platform name and the version into one string sounds a bit odd to me – given that we don't have a "linux-cp313" of sorts in https://cibuildwheel.pypa.io/en/stable/options/#platform, right?
RE: an override for pyodide_build_version
Can't this be done like overriding any other constrained package? You could set the versions to latest, or even pip install a specific version in before-build?
Yes, I had the same suggestion in https://github.com/pypa/cibuildwheel/pull/2002#issuecomment-2491146041, but TIL that before-build can also override it; that is probably neater for those who would like to explore newer versions without pinning other constraints.
If linux-cp313 were valid then pyodide-cp313 would be valid too, which isn't the same thing, nor helpful; I don't think that's a correct comparison. Maybe linux-manylinux2014 would be a better comparison. (Which we don't allow)
Unless someone really likes the idea, let's keep going with pyodide-version. My only thought is that it's a general config setting only applicable to a specific platform. Though we do have a few of those (the manylinux settings, for example).
RE: an override for pyodide_build_version
Can't this be done like overriding any other constrained package? You could set the versions to latest, or even pip install a specific version in before-build?
Ah, you raise a good point! If we have an option ~CIBW_PYODIDE_VERSION~ CIBW_PYODIDE_BUILD_VERSION and the user customises it, well we still have constraints in cibuildwheel/resources/constraints-pyodide312.txt that are for a different version. That's inconsistent/inefficient at best, error-prone at worst.
So, yes, the other approach would be to read the contents of CIBW_DEPENDENCY_VERSIONS to get the pyodide-build version. We already do something similar here, it's fairly easy using packaging.requirements.Requirement
https://github.com/henryiii/cibuildwheel/blob/6f9ad0b8989d9c476e7bb42b1b52536a19a5328b/cibuildwheel/util.py#L648-L692
The other thing that would help here is a way to specify CIBW_DEPENDENCY_VERSIONS inline, without the extra file. Something like:
CIBW_DEPENDENCY_VERSIONS_PYODIDE: "requirements: pyodide-build==0.29.1"
(I don't love the requirements: marker but can't see a way to reliably distinguish from a file otherwise)
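A rough sketch of how the inline form proposed above might be told apart from a file path. Everything here is hypothetical: the `requirements:` marker is only the proposal from this comment, the one-requirement-per-line convention is an assumption, and the `==` pin lookup is a deliberate simplification (real code would more likely use packaging.requirements.Requirement, as mentioned earlier).

```python
from pathlib import Path
from typing import List, Optional, Tuple, Union

INLINE_MARKER = "requirements:"  # proposed marker, not an existing cibuildwheel feature


def parse_dependency_versions(value: str) -> Tuple[str, Union[str, List[str], Path]]:
    """Classify a dependency-versions value as a keyword, inline requirements, or a file."""
    value = value.strip()
    if value in ("pinned", "latest"):
        return ("keyword", value)
    if value.startswith(INLINE_MARKER):
        body = value[len(INLINE_MARKER):]
        # Assumed convention: one requirement per line, like a requirements file.
        requirements = [line.strip() for line in body.splitlines() if line.strip()]
        return ("inline", requirements)
    return ("file", Path(value))


def pinned_version(requirements: List[str], package: str) -> Optional[str]:
    """Return the version pinned with '==' for `package`, or None if it is not pinned."""
    for requirement in requirements:
        name, separator, version = requirement.partition("==")
        if separator and name.strip().lower() == package.lower():
            return version.strip()
    return None
```

With this shape, `parse_dependency_versions("requirements: pyodide-build==0.29.1")` yields the inline requirement list, from which `pinned_version(..., "pyodide-build")` can recover the pyodide-build version without a separate constraints file.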
edit: I think I was a little confused about the difference between the PYODIDE_VERSION and the pyodide-build version above.
To correct my above comment - I was confused about the version of pyodide-build versus pyodide itself - it seems that the option CIBW_PYODIDE_VERSION is still necessary, but that the ability to configure the package pyodide-build might also be useful, though is better done through CIBW_DEPENDENCY_VERSIONS.
It looks like the next step here is to make CIBW_PYODIDE_VERSION a proper option, with documentation. Would you like assistance with that @agriyakhetarpal ? I can certainly help with the options spec/parsing bit, though I think you might be better placed to do the documentation bit, there might be nuance/guidance I'm unaware of.
I've pushed a change to make the option read properly through options.py, and it removes some hardcoding of pyodide-build and emscripten - pyodide-build is already spec'd in dependency-version constraints, and an emscripten version can be read from pyodide-build's output at runtime.
Still to do -
- [x] options TOML schema update
- [x] tests
- [x] docs for new option pyodide-version.
Added docs and tests. Ideas welcome for a way to test pyodide-version itself! Ideally we'd assert that the package is built with the specific version. That said I'm happy enough with the current coverage.
@agriyakhetarpal, I hope you don't mind me taking a run at this! I was reviewing it again and found myself forming opinions about how the versions are pinned, so I wanted to see if it worked.
Another thing that I noticed is that pyodide version updates aren't automated yet. pyodide xbuildenv search --json should be a fairly easy way to automate that in bin/update_pythons.py.
Hi @joerick, thank you so much! Also, apologies for the radio silence here – I couldn't take a look in January, but I'm happy to see it through! #2122 looks like it was a beneficial improvement.
I have a few comments that we should take a look at. I haven't gone through the new code changes you pushed fully, so I apologise if these have already been resolved in some form. Here are the primary blockers on this PR that I had previously noticed when I was working on it more actively late last year:
- pyodide-build has a strict requirement on the Python version being used. For example, the cibuildwheel action sets up Python 3.12 to be compatible with Pyodide 0.27, so it's fine to use CIBW_PYODIDE_VERSION with a newer Pyodide version which also has the same CPython version (we usually try to update to a new CPython version after eighteen months – it's usually @hoodmane who takes it up). However, this wouldn't work with Pyodide 0.28, which will ship with CPython 3.13 (see https://github.com/pyodide/pyodide/pull/5498). Thus, setting CIBW_PYODIDE_VERSION to a newer one will break cibuildwheel. There are two ways to resolve this:
  - either make cibuildwheel somehow aware of which Python version setup-python should set up for Pyodide, based on a precomputed list of Pyodide vs CPython versions that can be either maintained on the cibuildwheel side or on the Pyodide side (perhaps we can include a Python version key in https://github.com/pyodide/pyodide/blob/main/pyodide-cross-build-environments.json that cibuildwheel can just read?);
  - or, make it possible on the Pyodide tooling side to cross-compile from one CPython version to another Pyodide's CPython version, i.e., relax pyodide-build's Python version requirement and make it support compiling to cp312-pyodide_wasm32 when it's installed in, say, CPython 3.13. This is the same limitation as in https://github.com/benfogle/crossenv and is more challenging to resolve.
- I wonder if we should allow building for multiple Pyodide versions at a time, similar to PyPy. One of the goals I've been working on over the past year is using the current Pyodide support in cibuildwheel to build nightly wheels for use in interactive documentation deployments. So, if there's a JupyterLite deployment that uses a specific Pyodide version, say 0.27, and the package maintainers update cibuildwheel to one that supports Pyodide 0.28, nightly/dev docs deployments would break because Pyodide 0.28 would build wheels for a new Pyodide ABI (which wouldn't be compatible with Pyodide 0.27 deployed in the docs job). If cibuildwheel were to build wheels for both Pyodide 0.27 and Pyodide 0.28 (and keep adding new Pyodide versions as they release) and allow skipping a particular Pyodide version using its identifier through CIBW_BUILD and CIBW_SKIP options, that would make interactive docs more reliable. We could document this behaviour more notably, noting that building multiple Pyodide wheels is a Pyodide-specific case and users should explicitly set what version(s) to build/skip. I've also discussed this aspect here: https://github.com/scikit-learn/scikit-learn/pull/29791#issuecomment-2750983225
pyodide-build has a strict requirement on the Python version being used
I think @ryanking13's plan is that we will relax this, in the sense that we will continue supporting Python 3.12 in pyodide-build even after we upgrade to using Python 3.13. But what does always need to be guaranteed is that target Python version == build Python version. So if we're building a wheel with abi tag pyodide_2025_0 for Pyodide 0.28, the build machine needs to use Python 3.13. If we're building a wheel with abi tag pyodide_2024_0 for Pyodide 0.27 and 0.26, then the build machine needs to use Python 3.12.
pyodide-build has a strict requirement on the Python version being used
Ah, interesting. I didn't know that! This isn't too tricky, I think. The way I'd suggest to approach this is to read the python version we need from pyodide xbuildenv search --json (the python version is already listed there, as well as pyodide-cross-build-environments.json) and install it from astral-sh/python-build-standalone. Q: I assume we don't need to worry about patch versions here?
Previously we've avoided using 3rd-party distributions of CPython, for fear of producing binaries with poor compatibility, but in this case we only need it to run the build, there's no implicit linking going on, right?
I wonder if we should allow building for multiple Pyodide versions at a time, similar to PyPy
In cibuildwheel lingo, this would amount to putting the Pyodide version into the build identifier. Aside: we don't actually do this for PyPy, we're only building the latest PyPy version per Python minor version, even if there are multiple PyPy ABIs within each minor. That doesn't mean we couldn't do it for Pyodide.
if there's a JupyterLite deployment that uses a specific Pyodide version, say 0.27, and the package maintainers update cibuildwheel to one that supports Pyodide 0.28, nightly/dev docs deployments would break because Pyodide 0.28 would build wheels for a new Pyodide ABI (which wouldn't be compatible with Pyodide 0.27 deployed in the docs job).
I've been skimming @hoodmane's draft PEP 776 re. emscripten. Wouldn't a pyodide_2025_0 wheel be forward compatible with a version of Pyodide that is released later? I found this in the draft PEP:
In order to balance the ABI stability needs of package maintainers with the ABI flexibility to allow the platform to move forward, Pyodide plans to adopt a new ABI for each feature release of Python.
If that's the case, (i.e. a 1:1 mapping between Python minor version and wheel ABI) I think keeping the build identifier tied to the Python minor version should suffice. Please correct me if I'm missing something though!
Q: I assume we don't need to worry about patch versions here?
That's correct, we do not need to worry about patch python versions.
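The "target Python version == build Python version" rule, with patch versions ignored as confirmed above, can be sketched roughly as follows. The helper names are hypothetical, and the target version string is assumed to come from the Python column of the xbuildenv listing (e.g. "3.12.1").

```python
import sys
from typing import Optional, Tuple


def minor_version(version: str) -> Tuple[int, int]:
    """Reduce a version string such as '3.12.1' to its (major, minor) pair."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def host_matches_target(target_python: str, host: Optional[Tuple[int, int]] = None) -> bool:
    """Check that the build (host) Python has the same major.minor as the target Python.

    `host` defaults to the running interpreter; the patch version is ignored.
    """
    if host is None:
        host = (sys.version_info.major, sys.version_info.minor)
    return minor_version(target_python) == tuple(host[:2])
```

For example, `host_matches_target("3.12.1", host=(3, 12))` holds, while a 3.13 target on a 3.12 host does not, which is the situation that would require installing a different build Python.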
I think @ryanking13's plan is that we will relax this, in the sense that we will continue supporting Python 3.12 in pyodide-build even after we upgrade to using Python 3.13. But what does always need to be guaranteed is that target Python version == build Python version.
Yes, exactly.
I think keeping the build identifier tied to the Python minor version should suffice. Please correct me if I'm missing something though!
If a package uses numpy or scipy at build time, it may be sensitive to the specific Pyodide version and not just the Python minor version. But only insofar as it depends on a specific numpy/scipy version, and this dependency should be clear from its Requires-Dist information. So I agree that putting the Python minor version in the build identifier will suffice.
Previously we've avoided using 3rd-party distributions of CPython, for fear of producing binaries with poor compatibility, but in this case we only need it to run the build, there's no implicit linking going on, right?
That's right, if pyodide-build is functioning correctly we shouldn't be using any headers or libs from the build Python.
Wouldn't a pyodide_2025_0 wheel be forward compatible with a version of Pyodide that is released later?
Yes, assuming that we first determine the pyodide_2025_0 ABI and implement it in pyodide-build and then release pyodide-build. The pyodide_2025_0 isn't stable yet though so currently it's not a good idea to distribute wheels with that platform tag except for experiments.
Thanks for the responses @hoodmane and Pyodide folks!
So I think the next thing to do would be to remove the implicit reliance on the host Python version, perhaps with python-build-standalone. That can be a follow-up PR, no need to add that here.
The pyodide_2025_0 isn't stable yet though so currently it's not a good idea to distribute wheels with that platform tag except for experiments.
That's cool, I was speaking hypothetically, as in, "once the ABI is stable".
If a package uses numpy or scipy at build time, it may be sensitive to the specific Pyodide version and not just the Python minor version.
Just so I understand this- is that because pyodide bundles these libraries? And is this just a build-time concern or would that also limit the compatibility of the built wheels?
is that because pyodide bundles these libraries?
Yes.
And is this just a build-time concern or would that also limit the compatibility of the built wheels?
I don't think it should limit compatibility of the built wheels beyond what they already say in their Requires-Dist. If the wheel says it wants scipy >= 1.7 for instance then I think that is an assertion by the wheel that it works the same with scipy 1.7 and scipy 1.8 and can be built with either unless it has a more specific build requirement. If the wheel built against scipy 1.7 isn't compatible with scipy 1.8, then I think it's on the wheel to pin scipy==1.7, which would make it only compatible with Pyodide versions that bundle scipy 1.7. I don't think Pyodide specifically introduces any new limitations or special considerations here.
Based on these recent discussions, here's what I understand and propose:
- We will continue to have the requirement/limitation of the xbuildenv/host Python version being the same as the Pyodide Python version for pyodide-build to operate.
  - so, would the idea be that we'll download a Python binary from python-build-standalone in cibuildwheel/platforms/pyodide.py, install it, install pyodide-build in a virtualenv with it as the creator (similar to how macOS downloads CPython binaries), and compile the requested package to WASM – and we can get what Python version we need to install for whatever is supplied to CIBW_PYODIDE_VERSION using pyodide xbuildenv search --json --all?
  - the idea is that the PR wouldn't be usable without that, as the cibuildwheel GitHub Action won't be able to build against 0.28 when it lands, or even against any nightly xbuildenv of Pyodide, should we implement grabbing it, as we've since updated to a much later Emscripten than v3.1.58 and bumped to Python 3.13 a few moments ago as well. This is a bit unfortunate, considering how convenient the GitHub Action is. However, it should be usable if someone were to do python3.13 -m pip install cibuildwheel && cibuildwheel --platform pyodide, so maybe we should document this case – i.e., don't use the action or any other appropriate note?
  - or, should we add the pyodide xbuildenv search --json --all logic to the GitHub Action instead (perhaps through a pipx step) so that it picks up the Python version needed for the requested Pyodide version (if a build for Pyodide is requested, that is, otherwise not) and then passes that along as an input to setup-python? It makes the action a bit more complex, but none of it is exposed that much to the user anyway and is probably minimal enough to incorporate.
- Please feel free to push back on this thought, however, IMO, it's more elegant to do this:

  steps:
    - uses: pypa/cibuildwheel@<version>
      env:
        CIBW_PLATFORM: pyodide
        CIBW_BUILD: "pyodide_2024_0 pyodide_2025_0"
        CIBW_TEST_REQUIRES_PYODIDE: "<...>" # and so on
  ...

  rather than to do this:

  steps:
    - uses: pypa/cibuildwheel@<version>
      env:
        CIBW_PLATFORM: pyodide
        CIBW_TEST_REQUIRES_PYODIDE: "<...>" # and so on
    - uses: pypa/cibuildwheel@<version>
      env:
        CIBW_PLATFORM: pyodide
        CIBW_PYODIDE_VERSION: "0.XY"
        CIBW_TEST_REQUIRES_PYODIDE: "<...>" # and so on
  ...
- so, would the idea be that we'll download a Python binary from python-build-standalone in cibuildwheel/platforms/pyodide.py, install it, install pyodide-build in a virtualenv with it as the creator (similar to how macOS downloads CPython binaries)
Yeah, that's my proposal. That'll fix the issue with the implicit reliance on the version of the host python.
Please feel free to push back on this thought, however, IMO, it's more elegant to do this:

steps:
  - uses: pypa/cibuildwheel@<version>
    env:
      CIBW_PLATFORM: pyodide
      CIBW_BUILD: "pyodide_2024_0 pyodide_2025_0"
      CIBW_TEST_REQUIRES_PYODIDE: "<...>" # and so on
...

rather than to do this:

steps:
  - uses: pypa/cibuildwheel@<version>
    env:
      CIBW_PLATFORM: pyodide
      CIBW_TEST_REQUIRES_PYODIDE: "<...>" # and so on
  - uses: pypa/cibuildwheel@<version>
    env:
      CIBW_PLATFORM: pyodide
      CIBW_PYODIDE_VERSION: "0.XY"
      CIBW_TEST_REQUIRES_PYODIDE: "<...>" # and so on
...
Not sure I'm on board with the build identifiers above, but I'd agree that we shouldn't have to run cibuildwheel twice in normal setups. But I don't see when such a setup would be required.
My understanding is that we'd create a new build identifier with each minor version of Python, because each ABI would be accompanied with a bump to the minor version of Python at the same time. See this conversation between myself and @hoodmane above:
PEP 776 draft: In order to balance the ABI stability needs of package maintainers with the ABI flexibility to allow the platform to move forward, Pyodide plans to adopt a new ABI for each feature release of Python.
@joerick: If that's the case, (i.e. a 1:1 mapping between Python minor version and wheel ABI) I think keeping the build identifier tied to the Python minor version should suffice. Please correct me if I'm missing something though!
@hoodmane: If a package uses numpy or scipy at build time, it may be sensitive to the specific Pyodide version and not just the Python minor version. But only insofar as it depends on a specific numpy/scipy version, and this dependency should be clear from its Requires-Dist information. So I agree that putting the Python minor version in the build identifier will suffice.
As such, your example above would look something like:
steps:
  - uses: pypa/cibuildwheel@<version>
    env:
      CIBW_PLATFORM: pyodide
      CIBW_BUILD: "cp312-pyodide_wasm32 cp313-pyodide_wasm32"
      CIBW_TEST_REQUIRES_PYODIDE: "<...>" # and so on
...
That would be functionally equivalent to your initial example because pyodide would not change ABI within a Python minor version.
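To make the build-identifier idea concrete, here is a simplified sketch of how space-separated build/skip selectors could pick between the two identifiers in the example above. It uses plain glob matching from the standard library; cibuildwheel's real selector handling is richer, and `select_identifiers` is a hypothetical name.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

# Identifiers taken from the CIBW_BUILD example above.
PYODIDE_IDENTIFIERS = ["cp312-pyodide_wasm32", "cp313-pyodide_wasm32"]


def select_identifiers(identifiers: Iterable[str], build: str = "*", skip: str = "") -> List[str]:
    """Keep identifiers matching any build pattern and no skip pattern."""
    build_patterns = build.split()
    skip_patterns = skip.split()
    return [
        identifier
        for identifier in identifiers
        if any(fnmatchcase(identifier, pattern) for pattern in build_patterns)
        and not any(fnmatchcase(identifier, pattern) for pattern in skip_patterns)
    ]
```

Under this scheme, a docs deployment pinned to one Pyodide ABI could set the build selector to `cp312-*` (or skip `cp313-*`) and keep getting compatible wheels after cibuildwheel gains a newer identifier.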
Unfortunately, the current failure is a known bug: https://github.com/pyodide/pyodide-build/issues/143
Unfortunately, the python-build-standalone issue with symlinks appears to be cropping up again, this time inside the pyodide venv used for testing.
Testing wheel...
+ pyodide venv /private/var/folders/ld/k24nt7054698bctspqwrjq1r0000gn/T/cibw-run-7aphwbz2/cp312-pyodide_wasm32/venv-test
Creating Pyodide virtualenv at
/private/var/folders/ld/k24nt7054698bctspqwrjq1r0000gn/T/cibw-run-7aphwbz2/cp312
-pyodide_wasm32/venv-test
... Configuring virtualenv
... Installing standard library
Successfully created Pyodide virtual environment!
+ which python
/private/var/folders/ld/k24nt7054698bctspqwrjq1r0000gn/T/cibw-run-7aphwbz2/cp312-pyodide_wasm32/venv-test/bin/python
+ pip install /private/var/folders/ld/k24nt7054698bctspqwrjq1r0000gn/T/cibw-run-7aphwbz2/cp312-pyodide_wasm32/repaired_wheel/spam-0.1.0-cp312-cp312-pyodide_2024_0_wasm32.whl
--------------------------------------------------------- Captured stderr call ---------------------------------------------------------
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Python path configuration:
PYTHONHOME = (not set)
PYTHONPATH = (not set)
program name = '/private/var/folders/ld/k24nt7054698bctspqwrjq1r0000gn/T/cibw-run-7aphwbz2/cp312-pyodide_wasm32/venv-test/bin/python3.12-host'
isolated = 0
environment = 1
user site = 0
safe_path = 0
import site = 1
is in build tree = 0
stdlib dir = '/install/lib/python3.12'
sys._base_executable = '/private/var/folders/ld/k24nt7054698bctspqwrjq1r0000gn/T/cibw-run-7aphwbz2/cp312-pyodide_wasm32/build/base/pbs-20250317-3.12/python/bin/python3.12'
sys.base_prefix = '/install'
sys.base_exec_prefix = '/install'
sys.platlibdir = 'lib'
sys.executable = '/private/var/folders/ld/k24nt7054698bctspqwrjq1r0000gn/T/cibw-run-7aphwbz2/cp312-pyodide_wasm32/venv-test/bin/python3.12-host'
sys.prefix = '/install'
sys.exec_prefix = '/install'
sys.path = [
'/install/lib/python312.zip',
'/install/lib/python3.12',
'/install/lib/python3.12/lib-dynload',
]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'
Current thread 0x00000001ff00c840 (most recent call first):
<no Python frame>
The issue appears to be related to https://github.com/astral-sh/python-build-standalone/issues/380 - previously (#2328) fixed by resolving the symlink before calling the binary. I'm not sure that would be possible in this case though, as pyodide venv is making the symlinks to create the cross-env, and I think that resolving the symlink would cause pip to install into the wrong virtualenv (?).
EDIT- As @agriyakhetarpal notes, https://github.com/pyodide/pyodide-build/issues/143 is the best reference.
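For reference, the earlier fix mentioned above (resolving the symlink before calling the binary) amounts to something like this sketch. The helper names and the temporary paths are hypothetical, and the placeholder file only stands in for an interpreter; as noted, the same trick may not carry over to the symlinks that pyodide venv creates.

```python
import tempfile
from pathlib import Path


def resolve_interpreter(python: Path) -> Path:
    """Follow any symlinks so that the real binary's location is what gets executed."""
    return python.resolve()


def demo() -> bool:
    """Create a placeholder 'interpreter' plus a symlink to it and check the resolution."""
    with tempfile.TemporaryDirectory() as tmp:
        real = Path(tmp) / "install" / "bin" / "python3.12"
        real.parent.mkdir(parents=True)
        real.write_text("placeholder, not a real interpreter\n")
        link = Path(tmp) / "venv-python"
        link.symlink_to(real)
        # The resolved path points at the real file rather than the symlink.
        return resolve_interpreter(link) == real.resolve()
```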
If the approach that @hoodmane has to fix this in pyodide venv works, I see no reason for us not to immediately put out a new pyodide-build 0.31 release. :D
Okay, I'll actually make that PR if it's a blocker here.
I had a play with your proposed workaround here, @hoodmane - it kinda worked, but I had to do a couple extra things
- the python3.12-host binary also needed updating (I just symlinked it to python-host)
- the pip binary has python3.12-host as a shebang interpreter, which doesn't work (at least on macOS), because it's a script, not a binary file. The workaround is to call it using /usr/bin/env. I also removed the -s, as it's in the python-host script now.
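The shebang part of the workaround above could look roughly like this. It is only a sketch: `rewrite_shebang` is a hypothetical helper, the interpreter path in the usage note is illustrative, and dropping the trailing flags mirrors the removal of -s described in the list.

```python
def rewrite_shebang(script_text: str, interpreter: str) -> str:
    """Replace a script's shebang with '#!/usr/bin/env <interpreter>', dropping old flags.

    Text without a shebang on its first line is returned unchanged.
    """
    first_line, newline, rest = script_text.partition("\n")
    if not first_line.startswith("#!"):
        return script_text
    return f"#!/usr/bin/env {interpreter}{newline}{rest}"
```

For instance, a script starting with `#!/venv/bin/python3.12-host -s` would come out starting with `#!/usr/bin/env /venv/bin/python3.12-host`, so the script-based interpreter is launched through a real binary.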
However, I now see that, although the pip executable worked, now the pytest executable doesn't! It gets the same error as before. Now I'm confused! Because shouldn't this be running in node, not in Python-land? Perhaps the PYTHONHOME variable was set wrong in python-host?