
Support CPython 3.11, 3.12, and aarch64 processors

Open ddelange opened this issue 1 year ago • 54 comments

Hoi 👋

linux-aarch64 accounts for almost 10% of all platforms, ref https://github.com/giampaolo/psutil/pull/2103

aarch64 has already surpassed Windows in terms of downloads for this package. Oracle, Amazon, Google, and Microsoft all offer aarch64 cloud instances at a very competitive price point compared to AMD/Intel, so the demand will only grow.

  • this PR is adapted from https://github.com/MagicStack/asyncpg/pull/954
  • uses QEMU emulation for linux arm64 wheels: manylinux takes around 2.5hrs per wheel and alpine arm64 up to 4 hrs 😅
  • manylinux2014 wheels are built with GCC 10, which I think does not guarantee proper functioning of pybind11 (docs).
    • so with this PR, linux wheels are built with GCC 12 (manylinux_2_28).
    • pip will only install these wheels on linux operating systems with glibc >= 2.28 (most 2020+ linux distributions, like debian 10 buster, ubuntu 20.04 focal, almalinux/rhel 8, ...).
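
To make the glibc requirement above concrete, here is a minimal sketch of how one could check whether a machine meets the manylinux_2_28 floor. It is not how pip decides (pip's real tag matching also considers the architecture and interpreter); the helper name `glibc_at_least` and its `libc_info` parameter are made up for illustration.

```python
import platform


def glibc_at_least(required=(2, 28), libc_info=None):
    """Return True if the interpreter reports glibc >= required.

    libc_info is a hypothetical override, (lib, version), used for testing;
    by default the value comes from platform.libc_ver().
    """
    # platform.libc_ver() returns e.g. ("glibc", "2.31"); on musl-based
    # systems such as Alpine, or on non-Linux platforms, lib is typically empty.
    lib, version = libc_info if libc_info is not None else platform.libc_ver()
    if lib != "glibc" or not version:
        return False
    # Compare only major.minor; assumes plain numeric components.
    parts = tuple(int(part) for part in version.split(".")[:2])
    return parts >= tuple(required)


if __name__ == "__main__":
    print("glibc >= 2.28:", glibc_at_least())
```

On a musllinux (Alpine) system this returns False, which matches the fact that those systems need the separate musllinux wheels.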

the wheels from this PR can be installed with:

# comma separated list for --find-links
export PIP_FIND_LINKS=https://github.com/ddelange/vaex/releases/expanded_assets/core-v4.17.1.post4
pip install --force-reinstall vaex

fixes #2366, fixes #2368, fixes #2397

ddelange avatar Jan 20 '23 14:01 ddelange

Hoi 👋

exciting, will take a look early next week!

  • manylinux takes around 2.5hrs per wheel and alpine arm64 up to 4 hrs

that worries me a bit.. :)

groeten,

Maarten

maartenbreddels avatar Jan 21 '23 08:01 maartenbreddels

here are all timings: https://github.com/ddelange/vaex/actions/runs/3965720337/usage

depending on how often a month you release vaex, this could eat into the 2k free minutes of GH...

as the parallelization is maximised and they're pushed to PyPI as soon as they're built, most of the wheels will be available soon upon release regardless

here are all the wheels: distributions.zip

ddelange avatar Jan 21 '23 09:01 ddelange

interestingly, that was 8260 minutes ^

apparently that's OK? then I don't understand their explanation 🤔 https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions#included-storage-and-minutes

ddelange avatar Jan 21 '23 10:01 ddelange

ah there is a fair amount of duplication in that usage table for whatever reason 🤯

ddelange avatar Jan 21 '23 10:01 ddelange

a diff of current PyPI vs the zip above:

 vaex_core-4.16.1-cp310-cp310-macosx_10_9_x86_64.whl
 vaex_core-4.16.1-cp310-cp310-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp310-cp310-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp310-cp310-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp310-cp310-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp310-cp310-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp310-cp310-win_amd64.whl
+vaex_core-4.16.1-cp311-cp311-macosx_10_9_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-macosx_11_0_arm64.whl
+vaex_core-4.16.1-cp311-cp311-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp311-cp311-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-musllinux_1_1_aarch64.whl
+vaex_core-4.16.1-cp311-cp311-musllinux_1_1_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-win_amd64.whl
 vaex_core-4.16.1-cp36-cp36m-macosx_10_9_x86_64.whl
-vaex_core-4.16.1-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp36-cp36m-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp36-cp36m-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp36-cp36m-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp36-cp36m-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp36-cp36m-win_amd64.whl
 vaex_core-4.16.1-cp37-cp37m-macosx_10_9_x86_64.whl
-vaex_core-4.16.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp37-cp37m-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp37-cp37m-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp37-cp37m-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp37-cp37m-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp37-cp37m-win_amd64.whl
 vaex_core-4.16.1-cp38-cp38-macosx_10_9_x86_64.whl
 vaex_core-4.16.1-cp38-cp38-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp38-cp38-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp38-cp38-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp38-cp38-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp38-cp38-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp38-cp38-win_amd64.whl
 vaex_core-4.16.1-cp39-cp39-macosx_10_9_x86_64.whl
 vaex_core-4.16.1-cp39-cp39-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp39-cp39-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp39-cp39-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp39-cp39-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp39-cp39-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp39-cp39-win_amd64.whl
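
A listing like the one above can be produced with a few lines of Python. This is only a sketch of the idea, using a three-file subset of the names above; the helper names `wheel_tags` and `added_and_removed` are made up, and the parser assumes wheel names without the optional build tag.

```python
def wheel_tags(filename):
    """Split a wheel filename into (python tag, abi tag, platform tag)."""
    # Wheel names follow name-version-pythontag-abitag-platformtag.whl;
    # no optional build tag is assumed here.
    stem = filename[: -len(".whl")]
    _name, _version, python_tag, abi_tag, platform_tag = stem.split("-")
    return python_tag, abi_tag, platform_tag


def added_and_removed(old, new):
    """Return (added, removed) wheel filenames, each sorted."""
    old_set, new_set = set(old), set(new)
    return sorted(new_set - old_set), sorted(old_set - new_set)


old = [
    "vaex_core-4.16.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "vaex_core-4.16.1-cp310-cp310-win_amd64.whl",
]
new = [
    "vaex_core-4.16.1-cp310-cp310-manylinux_2_28_aarch64.whl",
    "vaex_core-4.16.1-cp310-cp310-manylinux_2_28_x86_64.whl",
    "vaex_core-4.16.1-cp310-cp310-win_amd64.whl",
]

added, removed = added_and_removed(old, new)
for filename in removed:
    print("-" + filename)
for filename in added:
    print("+" + filename)
```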

ddelange avatar Jan 22 '23 10:01 ddelange

I'm guessing this is blocked by https://github.com/vaexio/vaex/pull/2339

ddelange avatar Feb 27 '23 07:02 ddelange

Just letting you know I'm very busy and had a vacation. Yes, I'll try to get https://github.com/vaexio/vaex/pull/2339 green first!

maartenbreddels avatar Mar 06 '23 13:03 maartenbreddels

fwiw there are now third party free minutes on native arm64 machines, to get rid of the slow qemu builds

ddelange avatar May 25 '23 23:05 ddelange

Could you try rebasing this?

maartenbreddels avatar Jul 10 '23 20:07 maartenbreddels

@maartenbreddels already merged in master 👍

ddelange avatar Jul 10 '23 20:07 ddelange

    ERROR: Could not find a version that satisfies the requirement vaex-core<4.17,>=4.17.0 (from vaex)
    ERROR: No matching distribution found for vaex-core<4.17,>=4.17.0
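
For readers wondering why pip gives up here: the two bounds contradict each other, since 4.17 and 4.17.0 are the same release. A simplified sketch follows; it handles plain numeric versions only and ignores the pre/post-release rules of PEP 440, and the helper names are made up.

```python
def release_tuple(version, width=3):
    """Turn '4.17' into (4, 17, 0); plain numeric releases only."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version):
    """Check a candidate against the constraint vaex-core<4.17,>=4.17.0."""
    candidate = release_tuple(version)
    return release_tuple("4.17.0") <= candidate < release_tuple("4.17")


candidates = ["4.16.1", "4.17.0", "4.17.1"]
print([c for c in candidates if satisfies(c)])  # prints []
```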

ddelange avatar Jul 11 '23 06:07 ddelange

Yeah, a bug/artifact of our release script. Should be good now.

maartenbreddels avatar Jul 11 '23 15:07 maartenbreddels

hoi @maartenbreddels 👋

I pulled master and fixed merge conflicts, but it looks like CI is still not very happy. Seeing errors like hdf file missing on disk, and TypeError: train() got an unexpected keyword argument 'early_stopping_rounds'.

Do you think it might be related to this PR?

ddelange avatar Aug 03 '23 20:08 ddelange

Just wondering here about the Python packaging: Python 3.6 and 3.7 are now deprecated, so can we drop those and bump to 3.10 and 3.11 instead?

franz101 avatar Aug 17 '23 20:08 franz101

Do we have any updates on this MR?

to-bee avatar Aug 28 '23 07:08 to-bee

Hi @maartenbreddels 👋

Was your s3 account deleted by any chance?

vaex.open('s3://vaex/taxi/yellow_taxi_2009_2015_f32.hdf5?anon=true')

raises

FileNotFoundError: [Errno 2] Path does not exist 'vaex/taxi/yellow_taxi_2009_2015_f32.hdf5'. Detail: [errno 2] No such file or directory

ddelange avatar Sep 01 '23 08:09 ddelange

As of October 2nd, Python 3.12 is in general availability. Might as well include it here? cibuildwheel should start building the wheels automatically now that cp312 is GA (it parses vaex's python_requires), so probably no additional action is needed. Some dependencies might still lack 3.12 wheels, in which case users would have to build them from source.

ddelange avatar Oct 07 '23 18:10 ddelange

Hey folks, what's the ETA on this one? I see it's been on and off for ~9 months now. Would be great to have Python 3.11 support.

setu4993 avatar Oct 12 '23 23:10 setu4993

@maartenbreddels we might have to drop support for cp36 and cp37, see cp37-musllinux_aarch64.log.txt

edit: failures are only for cp37-musllinux_aarch64 and cp38-musllinux_aarch64.

meanwhile, I've added two commits above to upload wheels as github release assets on my fork.

so now, the wheels from this PR can be installed with:

pip install vaex --force-reinstall --find-links https://github.com/ddelange/vaex/releases/expanded_assets/core-v4.17.1.post4

please report issues back here :)

ddelange avatar Oct 14 '23 09:10 ddelange

3.12 wheel build is not happy yet, and the traceback isn't really helpful here: cp312-manylinux_x86_64.txt

ddelange avatar Oct 19 '23 06:10 ddelange

3.12 wheel build is not happy yet, and the traceback isn't really helpful here: cp312-manylinux_x86_64.txt

I guess it's the same problem as here: https://stackoverflow.com/questions/77274572/multiqc-modulenotfounderror-no-module-named-imp

Henkhogan avatar Oct 20 '23 07:10 Henkhogan

Hi @Henkhogan 👋

Great catch! vaex still uses the imp module to load the version, and imp was removed in py3.12 (after being deprecated for a long time). Let me fix that :)
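
For anyone hitting the same error elsewhere, the usual stdlib replacement for loading a module from a file path is importlib. The snippet below is a generic sketch, not necessarily the exact change made in this PR; the function name `load_version` and the default module name are illustrative.

```python
import importlib.util


def load_version(path, module_name="_version"):
    """Execute the Python file at `path` and return its __version__ attribute.

    `path` is expected to end in .py so that importlib can infer a loader.
    """
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    # Runs the file's top-level code in the fresh module object.
    spec.loader.exec_module(module)
    return module.__version__
```

Unlike a regular import, this does not register the module in sys.modules, which is usually what you want inside a setup.py.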

ddelange avatar Oct 20 '23 08:10 ddelange

See commit above. Wheels are building now, let's see

ddelange avatar Oct 20 '23 09:10 ddelange

fwiw @maartenbreddels the official (PyPA) way of deriving the version from git tags is to switch from setup.py to pyproject.toml and add setuptools_scm. In the case of this repository, the tag_regex param can be used to pick up the git tag of the corresponding subpackage, ref https://setuptools-scm.readthedocs.io/en/latest/config/#configuration-parameters.

Here's a reference PR, including dynamically populating the __version__ variable in __init__.py
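
To illustrate the tag_regex idea with plain Python: the pattern below is a hypothetical example, not taken from the reference PR. It accepts tags of the form core-v<version> (the format of the core-v4.17.1.post4 tag used on my fork) and rejects everything else; the second tag in the demo is invented.

```python
import re

# Hypothetical pattern for the core subpackage only. setuptools_scm expects
# either a single capture group or a group named "version".
TAG_REGEX = r"^core-v(?P<version>\d+(?:\.\d+)*(?:\.post\d+)?)$"


def version_from_tag(tag):
    """Return the version part of a matching tag, or None otherwise."""
    match = re.match(TAG_REGEX, tag)
    return match.group("version") if match else None


print(version_from_tag("core-v4.17.1.post4"))  # prints 4.17.1.post4
print(version_from_tag("viz-v0.5.4"))  # invented tag of another subpackage; prints None
```

Each subpackage would then carry its own pattern in its own pyproject.toml.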

ddelange avatar Oct 20 '23 09:10 ddelange

cp312 wheels coming online 🎉

updated the pip install link in my earlier comment

ddelange avatar Oct 20 '23 10:10 ddelange

Thanks a lot for this effort! What's needed to merge this PR, get a new release tagged and Python 3.12 wheels uploaded to PyPI?

EwoutH avatar Oct 26 '23 09:10 EwoutH

@EwoutH CI is still failing due to https://github.com/vaexio/vaex/pull/2331#issuecomment-1702344845

ddelange avatar Oct 26 '23 09:10 ddelange

Can we (temporarily) host these files somewhere else? Maybe even here on GitHub, or as a gist?

EwoutH avatar Oct 26 '23 10:10 EwoutH

@maartenbreddels do you have the CI files somewhere?

ddelange avatar Oct 26 '23 10:10 ddelange

What's the path to getting this merged? There's a lot downstream being blocked here.

longmathemagician avatar Dec 08 '23 17:12 longmathemagician