vaex
Support CPython 3.11, 3.12, and aarch64 processors
Hi 👋
linux-aarch64 accounts for almost 10% of all platforms (ref https://github.com/giampaolo/psutil/pull/2103).
aarch64 has already surpassed Windows in terms of downloads for this package. Oracle, Amazon, Google, and Microsoft all offer aarch64 cloud instances at a very competitive price point compared to AMD/Intel, so the demand will most likely only grow.
- this PR is adapted from https://github.com/MagicStack/asyncpg/pull/954
- uses QEMU emulation for linux arm64 wheels: manylinux takes around 2.5hrs per wheel and alpine arm64 up to 4 hrs 😅
- manylinux2014 wheels are built with GCC 10, which I think does not guarantee proper functioning of pybind11 (docs).
  - so with this PR, linux wheels are built with GCC 12 (manylinux_2_28).
  - pip will only install these wheels on linux operating systems with glibc >= 2.28 (most 2020+ linux distributions like debian 10 buster, ubuntu 20.04 focal, almalinux/rhel 8, ...).
the wheels from this PR can be installed with:
# comma separated list for --find-links
export PIP_FIND_LINKS=https://github.com/ddelange/vaex/releases/expanded_assets/core-v4.17.1.post4
pip install --force-reinstall vaex
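For anyone wondering whether their machine will pick up the manylinux_2_28 wheels, here is a rough Python sketch of the glibc check. It is a simplification and not pip's actual implementation (pip also checks the CPU architecture, among other things); the helper names `manylinux_glibc_requirement` and `glibc_satisfies` are made up for illustration.

```python
import platform
import re
from typing import Optional, Tuple


def manylinux_glibc_requirement(platform_tag: str) -> Optional[Tuple[int, int]]:
    """Return the (major, minor) glibc version encoded in a PEP 600 style tag.

    Returns None for tags that do not follow the manylinux_X_Y_arch scheme,
    e.g. the legacy alias manylinux2014_x86_64.
    """
    match = re.fullmatch(r"manylinux_(\d+)_(\d+)_(\w+)", platform_tag)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def glibc_satisfies(libc_name: str, libc_version: str, required: Tuple[int, int]) -> bool:
    """Check a (name, version) pair, as returned by platform.libc_ver()."""
    if libc_name != "glibc" or not libc_version:
        # musl based systems (alpine) and non-linux platforms end up here
        return False
    parts = libc_version.split(".")
    found = (int(parts[0]), int(parts[1]) if len(parts) > 1 else 0)
    return found >= required


if __name__ == "__main__":
    required = manylinux_glibc_requirement("manylinux_2_28_aarch64")
    name, version = platform.libc_ver()
    print(f"libc: {name or 'unknown'} {version}, "
          f"manylinux_2_28 usable: {glibc_satisfies(name, version, required)}")
```

On a distribution with an older glibc, pip would skip these wheels and fall back to an sdist build if one is available.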
fixes #2366, fixes #2368, fixes #2397
Hi 👋
exciting, will take a look early next week!
- manylinux takes around 2.5hrs per wheel and alpine arm64 up to 4 hrs
that worries me a bit.. :)
Regards,
Maarten
here are all timings: https://github.com/ddelange/vaex/actions/runs/3965720337/usage
depending on how often a month you release vaex, this could eat into the 2k free minutes of GH...
as the parallelization is maximised and they're pushed to PyPI as soon as they're built, most of the wheels will be available soon upon release regardless
here are all the wheels: distributions.zip
interestingly, that was 8260 minutes ^
apparently that's OK? then I don't understand their explanation 🤔 https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions#included-storage-and-minutes
ah there is a fair amount of duplication in that usage table for whatever reason 🤯
a diff of current PyPI vs the zip above:
vaex_core-4.16.1-cp310-cp310-macosx_10_9_x86_64.whl
vaex_core-4.16.1-cp310-cp310-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp310-cp310-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp310-cp310-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp310-cp310-musllinux_1_1_aarch64.whl
vaex_core-4.16.1-cp310-cp310-musllinux_1_1_x86_64.whl
vaex_core-4.16.1-cp310-cp310-win_amd64.whl
+vaex_core-4.16.1-cp311-cp311-macosx_10_9_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-macosx_11_0_arm64.whl
+vaex_core-4.16.1-cp311-cp311-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp311-cp311-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-musllinux_1_1_aarch64.whl
+vaex_core-4.16.1-cp311-cp311-musllinux_1_1_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-win_amd64.whl
vaex_core-4.16.1-cp36-cp36m-macosx_10_9_x86_64.whl
-vaex_core-4.16.1-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp36-cp36m-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp36-cp36m-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp36-cp36m-musllinux_1_1_aarch64.whl
vaex_core-4.16.1-cp36-cp36m-musllinux_1_1_x86_64.whl
vaex_core-4.16.1-cp36-cp36m-win_amd64.whl
vaex_core-4.16.1-cp37-cp37m-macosx_10_9_x86_64.whl
-vaex_core-4.16.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp37-cp37m-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp37-cp37m-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp37-cp37m-musllinux_1_1_aarch64.whl
vaex_core-4.16.1-cp37-cp37m-musllinux_1_1_x86_64.whl
vaex_core-4.16.1-cp37-cp37m-win_amd64.whl
vaex_core-4.16.1-cp38-cp38-macosx_10_9_x86_64.whl
vaex_core-4.16.1-cp38-cp38-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp38-cp38-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp38-cp38-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp38-cp38-musllinux_1_1_aarch64.whl
vaex_core-4.16.1-cp38-cp38-musllinux_1_1_x86_64.whl
vaex_core-4.16.1-cp38-cp38-win_amd64.whl
vaex_core-4.16.1-cp39-cp39-macosx_10_9_x86_64.whl
vaex_core-4.16.1-cp39-cp39-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp39-cp39-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp39-cp39-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp39-cp39-musllinux_1_1_aarch64.whl
vaex_core-4.16.1-cp39-cp39-musllinux_1_1_x86_64.whl
vaex_core-4.16.1-cp39-cp39-win_amd64.whl
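In case someone wants to reproduce a diff like the one above, here is a minimal Python sketch. It assumes the standard wheel filename layout (distribution, version, optional build tag, python tag, abi tag, platform tag, separated by `-`); the names `WheelTags`, `parse_wheel_filename` and `diff_wheels` are hypothetical helpers, not part of any packaging tool.

```python
from typing import Iterable, List, NamedTuple, Tuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its dash separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 when an optional build tag is present
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


def diff_wheels(old: Iterable[str], new: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (removed, added) filenames, each sorted alphabetically."""
    old_set, new_set = set(old), set(new)
    return sorted(old_set - new_set), sorted(new_set - old_set)


if __name__ == "__main__":
    on_pypi = [
        "vaex_core-4.16.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
        "vaex_core-4.16.1-cp310-cp310-win_amd64.whl",
    ]
    in_zip = [
        "vaex_core-4.16.1-cp310-cp310-manylinux_2_28_aarch64.whl",
        "vaex_core-4.16.1-cp310-cp310-manylinux_2_28_x86_64.whl",
        "vaex_core-4.16.1-cp310-cp310-win_amd64.whl",
    ]
    removed, added = diff_wheels(on_pypi, in_zip)
    for name in removed:
        print("-" + name)
    for name in added:
        print("+" + name, parse_wheel_filename(name).platform_tag)
```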
I'm guessing this is blocked by https://github.com/vaexio/vaex/pull/2339
Just letting you know I'm very busy and had a vacation. Yes, I'll try to get https://github.com/vaexio/vaex/pull/2339 green first!
fwiw there are now third party free minutes on native arm64 machines, to get rid of the slow qemu builds
Could you try rebasing this?
@maartenbreddels already merged in master 👍
ERROR: Could not find a version that satisfies the requirement vaex-core<4.17,>=4.17.0 (from vaex)
ERROR: No matching distribution found for vaex-core<4.17,>=4.17.0
Yeah, a bug/artifact of our release script. Should be good now.
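For context on why pip gives up on that requirement: it asks for a vaex-core that is both below 4.17 and at least 4.17.0, and since 4.17 and 4.17.0 compare equal once trailing zeros are padded, no version can satisfy both. A small sketch in plain Python, deliberately simplified to dotted release numbers only (real tools use the `packaging` library; `release_tuple` and `satisfies_broken_pin` are hypothetical names):

```python
from typing import Tuple


def release_tuple(version: str, width: int = 3) -> Tuple[int, ...]:
    """Turn '4.17' into (4, 17, 0) so that 4.17 and 4.17.0 compare equal."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies_broken_pin(version: str) -> bool:
    """Check a version against the pin from the error: >=4.17.0 and <4.17."""
    v = release_tuple(version)
    return release_tuple("4.17.0") <= v < release_tuple("4.17")


if __name__ == "__main__":
    for candidate in ["4.16.1", "4.17.0", "4.17.1", "5.0.0"]:
        print(candidate, satisfies_broken_pin(candidate))
```

Every candidate comes back as not matching, because the lower and upper bound are the same version and the upper bound is exclusive.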
hi @maartenbreddels 👋
I pulled master and fixed merge conflicts, but it looks like CI is still not very happy. Seeing errors like hdf file missing on disk, and TypeError: train() got an unexpected keyword argument 'early_stopping_rounds'.
Do you think it might be related to this PR?
Just wondering here about the Python packaging. Python 3.6 and 3.7 are now deprecated; on the other hand, can we bump to 3.10 and 3.11?
Do we have any updates on this MR?
Hi @maartenbreddels 👋
Was your s3 account deleted by any chance?
vaex.open('s3://vaex/taxi/yellow_taxi_2009_2015_f32.hdf5?anon=true')
raises
FileNotFoundError: [Errno 2] Path does not exist 'vaex/taxi/yellow_taxi_2009_2015_f32.hdf5'. Detail: [errno 2] No such file or directory
As of October 2nd, Python 3.12 is in general availability. Might as well include it here? cibuildwheel should start building the wheels automatically now that cp312 is GA (it parses vaex's python_requires), so probably no additional action is needed. Some dependencies might lack 3.12 wheels as of now, so users would build them from source.
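To illustrate the idea of deriving the build targets from python_requires, here is a toy Python sketch. It is not cibuildwheel's code: the parser only understands simple `major.minor` clauses, the `>=3.6` value is an assumed example rather than vaex's actual setting, and `version_allowed` / `select_cpython_builds` are hypothetical names.

```python
import operator
import re
from typing import Iterable, List, Tuple

_CLAUSE = re.compile(r"\s*(>=|<=|==|!=|<|>)\s*(\d+)\.(\d+)\s*")
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


def version_allowed(python_requires: str, version: Tuple[int, int]) -> bool:
    """Check a (major, minor) version against comma separated clauses."""
    for clause in python_requires.split(","):
        match = _CLAUSE.fullmatch(clause)
        if match is None:
            raise ValueError(f"unsupported clause in this sketch: {clause!r}")
        bound = (int(match.group(2)), int(match.group(3)))
        if not _OPS[match.group(1)](version, bound):
            return False
    return True


def select_cpython_builds(python_requires: str,
                          candidates: Iterable[Tuple[int, int]]) -> List[str]:
    """Return build identifiers such as 'cp312' for every allowed version."""
    return [f"cp{major}{minor}" for major, minor in candidates
            if version_allowed(python_requires, (major, minor))]


if __name__ == "__main__":
    known = [(3, 6), (3, 7), (3, 8), (3, 9), (3, 10), (3, 11), (3, 12)]
    print(select_cpython_builds(">=3.6", known))
```

With an open-ended lower bound like this, a newly released interpreter version is included without touching the package metadata.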
Hey folks, what's the ETA on this one? I see it's been on and off for ~9 months now. Would be great to have Python 3.11 support.
@maartenbreddels we might have to drop support for cp36 and cp37
cp37-musllinux_aarch64.log.txt
edit: failures are only for cp37-musllinux_aarch64 and cp38-musllinux_aarch64.
meanwhile, I've added two commits above to upload wheels as github release assets on my fork.
so now, the wheels from this PR can be installed with:
pip install vaex --force-reinstall --find-links https://github.com/ddelange/vaex/releases/expanded_assets/core-v4.17.1.post4
please report issues back here :)
3.12 wheel build is not happy yet, and the traceback isn't really helpful here: cp312-manylinux_x86_64.txt
I guess it's the same problem as here: https://stackoverflow.com/questions/77274572/multiqc-modulenotfounderror-no-module-named-imp
Hi @Henkhogan 👋
Great catch! vaex still uses the imp module to load the version. imp has been deprecated for a long time and was removed in py3.12. Let me fix that :)
See commit above. Wheels are building now, let's see
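For reference, a common replacement for loading a version file with imp is importlib.util from the standard library. The sketch below shows that general pattern; it is not necessarily what the commit does, and the module name `_vaex_version_sketch` and the helper `load_version` are made up for illustration.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_version(version_file: str) -> str:
    """Execute a standalone .py file and return its __version__ attribute."""
    spec = importlib.util.spec_from_file_location("_vaex_version_sketch", version_file)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {version_file}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.__version__


if __name__ == "__main__":
    # Demo with a throwaway file, so the sketch runs without the vaex sources
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "_version.py"
        path.write_text('__version__ = "0.0.0"\n')
        print(load_version(str(path)))
```

This works on Python 3.5 and later, so it does not force dropping any of the older interpreters discussed above.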
fwiw @maartenbreddels the official (PyPA) way of using git tags is by switching from setup.py to pyproject.toml, adding setuptools_scm, and (in the case of this repository) using the tag_regex param to get the git tag of the corresponding subpackage (ref https://setuptools-scm.readthedocs.io/en/latest/config/#configuration-parameters).
Here's a reference PR, including dynamically populating the __version__ variable in __init__.py
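To make the tag_regex idea concrete: setuptools_scm expects a regex with a single group or a group named `version`. The Python sketch below tries such a pattern against a tag shaped like the `core-v4.17.1.post4` release on my fork. The pattern itself is an assumption for illustration, not something taken from this repository, and `version_from_tag` is a hypothetical helper.

```python
import re
from typing import Optional

# Assumed pattern for the vaex-core subpackage; other subpackages would
# need their own prefix instead of "core".
CORE_TAG_REGEX = r"^core-v(?P<version>\d+(?:\.\d+)*(?:\.post\d+)?)$"


def version_from_tag(tag: str, tag_regex: str = CORE_TAG_REGEX) -> Optional[str]:
    """Return the version part of a matching tag, or None for other tags."""
    match = re.match(tag_regex, tag)
    return match.group("version") if match else None


if __name__ == "__main__":
    print(version_from_tag("core-v4.17.1.post4"))
    print(version_from_tag("other-v1.0.0"))  # made-up tag of another subpackage
```

Tags belonging to other subpackages simply do not match, which is what lets each subpackage pick up only its own version in a monorepo.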
cp312 wheels coming online 🎉
updated the pip install link in my earlier comment
Thanks a lot for this effort! What's needed to merge this PR, get a new release tagged and Python 3.12 wheels uploaded to PyPI?
@EwoutH CI is still failing due to https://github.com/vaexio/vaex/pull/2331#issuecomment-1702344845
Can we (temporarily) host these files somewhere else? Maybe even here on GitHub, or as a gist?
@maartenbreddels do you have the CI files somewhere?
What's the path to getting this merged? There's a lot downstream being blocked here.