ARROW-17487: [Python][Packaging][CI] Add support for Python 3.11

Open raulcd opened this issue 3 years ago • 42 comments

This PR adds jobs to build pyarrow wheels for Python 3.11.

raulcd avatar Oct 25 '22 11:10 raulcd

@github-actions crossbow submit cp311

raulcd avatar Oct 25 '22 11:10 raulcd

https://issues.apache.org/jira/browse/ARROW-17487

github-actions[bot] avatar Oct 25 '22 12:10 github-actions[bot]

:warning: Ticket has not been started in JIRA, please click 'Start Progress'.

github-actions[bot] avatar Oct 25 '22 12:10 github-actions[bot]

Revision: 8e2613ff74b28f04f6bc449e204eed451bfa61c2

Submitted crossbow builds: ursacomputing/crossbow @ actions-fd1ab80f49

Task Status
wheel-macos-big-sur-cp311-arm64 Github Actions
wheel-macos-big-sur-cp311-universal2 Github Actions
wheel-macos-mojave-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-arm64 Travis CI
wheel-windows-cp311-amd64 Github Actions

github-actions[bot] avatar Oct 25 '22 13:10 github-actions[bot]

@github-actions crossbow submit cp311

raulcd avatar Oct 25 '22 14:10 raulcd

Unable to match any tasks for `cp311`
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/3321608980

github-actions[bot] avatar Oct 25 '22 16:10 github-actions[bot]

@github-actions crossbow submit cp311

raulcd avatar Oct 25 '22 16:10 raulcd

Revision: 5dc9f31155413426b5d719bd8b1de3a6ef983afb

Submitted crossbow builds: ursacomputing/crossbow @ actions-3375acd0f2

Task Status
wheel-macos-big-sur-cp311-arm64 Github Actions
wheel-macos-big-sur-cp311-universal2 Github Actions
wheel-macos-mojave-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-arm64 Travis CI
wheel-windows-cp311-amd64 Github Actions

github-actions[bot] avatar Oct 25 '22 18:10 github-actions[bot]

@github-actions crossbow submit cp311

raulcd avatar Oct 26 '22 08:10 raulcd

Revision: 936164bee69dd42aee0d83d4b5e166709d821aac

Submitted crossbow builds: ursacomputing/crossbow @ actions-2255056a88

Task Status
wheel-macos-big-sur-cp311-arm64 Github Actions
wheel-macos-big-sur-cp311-universal2 Github Actions
wheel-macos-mojave-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-arm64 Travis CI
wheel-windows-cp311-amd64 Github Actions

github-actions[bot] avatar Oct 26 '22 11:10 github-actions[bot]

We at Apache Airflow are looking forward to this one being merged. PyArrow is one of the blockers for making Airflow work on Python 3.11, and I am trying to coordinate a concerted effort across all the OSS projects that we consider friends :) to get Python 3.11 support working, since Python 3.11 mainly brings huge performance improvements that our users are eager to start using!

We track it in https://github.com/apache/airflow/pull/27264

If any help is needed, I am happy to help, including by talking to some of your dependencies (which are likely also Airflow dependencies). Good luck with it :)

potiuk avatar Oct 26 '22 11:10 potiuk

@raulcd Perhaps try applying this patch?

diff --git a/python/pyproject.toml b/python/pyproject.toml
index edbc4ade6..a799dc761 100644
--- a/python/pyproject.toml
+++ b/python/pyproject.toml
@@ -18,7 +18,7 @@
 [build-system]
 requires = [
     "cython >= 0.29.22",
-    "oldest-supported-numpy>=0.14",
+    "oldest-supported-numpy>=2022.8.16",
     "setuptools_scm",
     "setuptools >= 40.1.0",
     "wheel"
diff --git a/python/requirements-build.txt b/python/requirements-build.txt
index 46eb288c5..927c50d73 100644
--- a/python/requirements-build.txt
+++ b/python/requirements-build.txt
@@ -1,4 +1,4 @@
 cython>=0.29
-oldest-supported-numpy>=0.14
+oldest-supported-numpy>=2022.8.16
 setuptools_scm
 setuptools>=38.6.0
diff --git a/python/requirements-wheel-build.txt b/python/requirements-wheel-build.txt
index 856164f09..a48b30d35 100644
--- a/python/requirements-wheel-build.txt
+++ b/python/requirements-wheel-build.txt
@@ -1,5 +1,5 @@
 cython>=0.29.11
-oldest-supported-numpy>=0.14
+oldest-supported-numpy>=2022.8.16
 setuptools_scm
 setuptools>=58
 wheel
diff --git a/python/requirements-wheel-test.txt b/python/requirements-wheel-test.txt
index 1644b2f8b..665b2ce77 100644
--- a/python/requirements-wheel-test.txt
+++ b/python/requirements-wheel-test.txt
@@ -2,26 +2,8 @@ cffi
 cython
 hypothesis
 pickle5; platform_system != "Windows" and python_version < "3.8"
+oldest-supported-numpy>=2022.8.16
 pytest
 pytest-lazy-fixture
 pytz
 tzdata; sys_platform == 'win32'
-
-numpy==1.19.5; platform_system == "Linux"   and platform_machine == "aarch64" and python_version <  "3.7"
-numpy==1.21.3; platform_system == "Linux"   and platform_machine == "aarch64" and python_version >= "3.7"
-numpy==1.19.5; platform_system == "Linux"   and platform_machine != "aarch64" and python_version <  "3.9"
-numpy==1.21.3; platform_system == "Linux"   and platform_machine != "aarch64" and python_version >= "3.9"
-numpy==1.21.3; platform_system == "Darwin"  and platform_machine == "arm64"
-numpy==1.19.5; platform_system == "Darwin"  and platform_machine != "arm64"   and python_version <  "3.9"
-numpy==1.21.3; platform_system == "Darwin"  and platform_machine != "arm64"   and python_version >= "3.9"
-numpy==1.19.5; platform_system == "Windows"                                   and python_version <  "3.9"
-numpy==1.21.3; platform_system == "Windows"                                   and python_version >= "3.9"
-
-pandas<1.1.0;  platform_system == "Linux"   and platform_machine != "aarch64" and python_version <  "3.8"
-pandas;        platform_system == "Linux"   and platform_machine != "aarch64" and python_version >= "3.8"
-pandas;        platform_system == "Linux"   and platform_machine == "aarch64"
-pandas<1.1.0;  platform_system == "Darwin"  and platform_machine != "arm64"   and python_version <  "3.8"
-pandas;        platform_system == "Darwin"  and platform_machine != "arm64"   and python_version >= "3.8"
-pandas;        platform_system == "Darwin"  and platform_machine == "arm64"
-pandas<1.1.0;  platform_system == "Windows"                                   and python_version <  "3.8"
-pandas;        platform_system == "Windows"                                   and python_version >= "3.8"

pitrou avatar Oct 26 '22 12:10 pitrou

@raulcd Perhaps try applying this patch?

I tested the patch locally, and while the images build successfully, I got a lot of test failures:

640 failed, 3430 passed, 348 skipped, 15 xfailed, 2 xpassed, 5 warnings, 8 errors in 103.69s (0:01:43)

This is how I reproduce locally:

# generate wheel
PYTHON=3.11 docker-compose build --no-cache --progress plain python-wheel-manylinux-2014
PYTHON=3.11 docker-compose run --rm python-wheel-manylinux-2014
# test wheel
PYTHON=3.11 docker-compose build --no-cache python-wheel-manylinux-test-unittests
PYTHON=3.11 docker-compose run --rm python-wheel-manylinux-test-unittests

raulcd avatar Oct 26 '22 13:10 raulcd

Wheels are built successfully at the moment. I am going to trigger the job again to validate the macOS ones, but the jobs are failing because 7 tests fail due to the changed behaviour of repr on the FileType enum; see https://github.com/python/cpython/issues/94763. Thanks @jorisvandenbossche! We can probably fix those in a follow-up PR.

raulcd avatar Oct 26 '22 13:10 raulcd

@github-actions crossbow submit cp311

raulcd avatar Oct 26 '22 13:10 raulcd

This patch should help fix the 3.11 enum issue:

diff --git a/python/pyarrow/_fs.pyx b/python/pyarrow/_fs.pyx
index e7b028a07..557c08149 100644
--- a/python/pyarrow/_fs.pyx
+++ b/python/pyarrow/_fs.pyx
@@ -78,6 +78,12 @@ cdef CFileType _unwrap_file_type(FileType ty) except *:
     assert 0
 
 
+def _file_type_to_string(ty):
+    # Python 3.11 changed str(IntEnum) to return the string representation
+    # of the integer value: https://github.com/python/cpython/issues/94763
+    return f"{ty.__class__.__name__}.{ty._name_}"
+
+
 cdef class FileInfo(_Weakrefable):
     """
     FileSystem entry info.
@@ -185,9 +191,10 @@ cdef class FileInfo(_Weakrefable):
             except ValueError:
                 return ''
 
-        s = '<FileInfo for {!r}: type={}'.format(self.path, str(self.type))
+        s = (f'<FileInfo for {self.path!r}: '
+             f'type={_file_type_to_string(self.type)}')
         if self.is_file:
-            s += ', size={}'.format(self.size)
+            s += f', size={self.size}'
         s += '>'
         return s
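A minimal, standalone sketch of the idea behind the patch above, assuming a stand-in `FileType` enum (the member values here are illustrative and not taken from pyarrow). It builds the label from the class and member names instead of calling `str()`, so the result should not depend on how a given Python version renders `IntEnum` members:

```python
import enum


class FileType(enum.IntEnum):
    # Hypothetical stand-in for the real enum; the values are assumptions.
    NotFound = 0
    Unknown = 1
    File = 2
    Directory = 3


def file_type_to_string(ty):
    # Compose "ClassName.MemberName" explicitly rather than relying on
    # str(ty), whose output for IntEnum members changed in Python 3.11.
    return f"{ty.__class__.__name__}.{ty.name}"


# The helper gives the same label on every Python version.
print(file_type_to_string(FileType.File))
# str() is version-dependent, so no expected output is claimed here.
print(str(FileType.File))
```

Comparing the two printed lines on Python 3.10 and 3.11 is a quick way to see the behaviour change described in the linked CPython issue.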
 

pitrou avatar Oct 26 '22 14:10 pitrou

Revision: d5adbac21b4bafda8c488b75a1fd122bdffc98e6

Submitted crossbow builds: ursacomputing/crossbow @ actions-f88a7ca39e

Task Status
wheel-macos-big-sur-cp311-arm64 Github Actions
wheel-macos-big-sur-cp311-universal2 Github Actions
wheel-macos-mojave-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-arm64 Travis CI
wheel-windows-cp311-amd64 Github Actions

github-actions[bot] avatar Oct 26 '22 15:10 github-actions[bot]

The tests are trying to compile grpcio, can we avoid that? https://github.com/ursacomputing/crossbow/actions/runs/3330588690/jobs/5509267855#step:11:116

Either install the GCS testbench on a different Python (with binary wheels), or don't test GCS at all on 3.11.

pitrou avatar Oct 26 '22 16:10 pitrou

@github-actions crossbow submit cp311

raulcd avatar Oct 26 '22 17:10 raulcd

Revision: fdac52b0490287e88978b694daf5083f3eaf40ac

Submitted crossbow builds: ursacomputing/crossbow @ actions-77a8b7ac66

Task Status
wheel-macos-big-sur-cp311-arm64 Github Actions
wheel-macos-big-sur-cp311-universal2 Github Actions
wheel-macos-mojave-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-amd64 Github Actions
wheel-manylinux2014-cp311-arm64 Travis CI
wheel-windows-cp311-amd64 Github Actions

github-actions[bot] avatar Oct 26 '22 20:10 github-actions[bot]

The only wheel job failing is wheel-macos-big-sur-cp311-universal2. There are issues resolving the numpy version from oldest-supported-numpy. I am trying to understand why this issue only happens on this specific job:

Collecting oldest-supported-numpy>=0.14
  Using cached oldest_supported_numpy-2022.8.16-py3-none-any.whl (3.9 kB)
Collecting setuptools_scm
  Using cached setuptools_scm-7.0.5-py3-none-any.whl (42 kB)
Collecting setuptools>=58
  Using cached setuptools-65.5.0-py3-none-any.whl (1.2 MB)
Collecting wheel
  Using cached wheel-0.37.1-py2.py3-none-any.whl (35 kB)
Collecting oldest-supported-numpy>=0.14
  Using cached oldest_supported_numpy-2022.5.28-py3-none-any.whl (3.9 kB)
  Using cached oldest_supported_numpy-2022.5.27-py3-none-any.whl (3.9 kB)
  Using cached oldest_supported_numpy-2022.4.18-py3-none-any.whl (3.9 kB)
  Using cached oldest_supported_numpy-2022.4.10-py3-none-any.whl (3.9 kB)
  Using cached oldest_supported_numpy-2022.4.8-py3-none-any.whl (3.9 kB)
  Using cached oldest_supported_numpy-2022.3.27-py3-none-any.whl (3.9 kB)
  Using cached oldest_supported_numpy-2022.1.30-py3-none-any.whl (3.9 kB)
  Using cached oldest_supported_numpy-0.15-py3-none-any.whl (3.8 kB)
  Using cached oldest_supported_numpy-0.14-py3-none-any.whl (3.8 kB)
INFO: pip is looking at multiple versions of <Python from Requires-Python> to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of cython to determine which version is compatible with other requirements. This could take a while.
Collecting cython>=0.29.11
  Using cached Cython-0.29.30-py2.py3-none-any.whl (985 kB)
ERROR: Cannot install -r /Users/voltrondata/github-actions-runner/_work/crossbow/crossbow/arrow/python/requirements-wheel-build.txt (line 2) because these package versions have conflicting dependencies.

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
The conflict is caused by:
    oldest-supported-numpy 2022.8.16 depends on numpy==1.23.2; python_version == "3.11" and platform_python_implementation != "PyPy"
    oldest-supported-numpy 2022.5.28 depends on numpy; python_version >= "3.11"
    oldest-supported-numpy 2022.5.27 depends on numpy; python_version >= "3.11"
    oldest-supported-numpy 2022.4.18 depends on numpy; python_version >= "3.11"
    oldest-supported-numpy 2022.4.10 depends on numpy; python_version >= "3.11"
    oldest-supported-numpy 2022.4.8 depends on numpy; python_version >= "3.11"
    oldest-supported-numpy 2022.3.27 depends on numpy; python_version >= "3.11"
    oldest-supported-numpy 2022.1.30 depends on numpy; python_version >= "3.11"
    oldest-supported-numpy 0.15 depends on numpy; python_version >= "3.11"
    oldest-supported-numpy 0.14 depends on numpy; python_version >= "3.11"

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

raulcd avatar Oct 26 '22 20:10 raulcd

@raulcd it seems like numpy arm64 wheels are only available for the macosx_11.0 platform, while the failing job is trying to use MACOSX_DEPLOYMENT_TARGET=10.14. Setting a correct target here should help.

kleschenko avatar Oct 26 '22 23:10 kleschenko

it seems like numpy arm64 wheels are only available for the macosx_11.0

That's also the case for older Python versions, so if this is the reason it would be strange we only see it for 3.11?

I don't understand the version resolution conflict from pip. It seems total nonsense:

    oldest-supported-numpy 2022.8.16 depends on numpy==1.23.2; python_version == "3.11" and platform_python_implementation != "PyPy"
    oldest-supported-numpy 2022.5.28 depends on numpy; python_version >= "3.11"
    ...

How is that a conflict?

jorisvandenbossche avatar Oct 27 '22 07:10 jorisvandenbossche

it seems like numpy arm64 wheels are only available for the macosx_11.0

That's also the case for older Python versions, so if this is the reason it would be strange we only see it for 3.11?

Ah, but with older Python versions, oldest-supported-numpy will also pick an older numpy (e.g. 1.21.6 for Python 3.10: https://pypi.org/project/numpy/1.21.6/#files), and yes, older numpy versions have "universal" wheels with a 10_9 deployment target.

So it is here that the deployment target should be changed in case of 3.11:

https://github.com/apache/arrow/blob/c56934b57922a6cbb46eaef097a36ed8d2473467/dev/tasks/tasks.yml#L533-L539

(we could maybe also stop building universal2 wheels, since we also provide both arm64 and x86_64 wheels)

jorisvandenbossche avatar Oct 27 '22 07:10 jorisvandenbossche

For the record, I posted https://discuss.python.org/t/dependency-resolution-conflict-on-universal2-with-pip-22-3-and-python-3-11/20419

pitrou avatar Oct 27 '22 07:10 pitrou

So it is here that the deployment target should be changed in case of 3.11:

But why would that be necessary only for universal2, not arm64? Note that Numpy doesn't even provide universal2 wheels!

pitrou avatar Oct 27 '22 07:10 pitrou

(I'd also be in favor of not bothering with universal wheels, btw)

pitrou avatar Oct 27 '22 07:10 pitrou

But why would that be necessary only for universal2, not arm64? Note that Numpy doesn't even provide universal2 wheels!

Because for arm64 we set a different deployment target in tasks.yml; for that task we already set it to 11. And numpy did provide universal wheels for older numpy versions, so that's the reason it only fails for Python 3.11.
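
To illustrate the explanation above, here is a toy Python model of the deployment-target reasoning. It is only a sketch under stated assumptions: the tag strings are written by hand to match the versions mentioned in this thread, `parse_macos_tag` and `usable_for_target` are hypothetical helpers, and the real tag matching done by pip is more involved than this comparison.

```python
import re


def parse_macos_tag(platform_tag):
    """Split a tag such as 'macosx_11_0_arm64' into ((11, 0), 'arm64')."""
    match = re.fullmatch(r"macosx_(\d+)_(\d+)_(\w+)", platform_tag)
    if match is None:
        raise ValueError(f"not a macOS platform tag: {platform_tag!r}")
    major, minor, arch = match.groups()
    return (int(major), int(minor)), arch


def usable_for_target(platform_tag, target, arch):
    """Toy check (an assumption, not pip's algorithm): the wheel must not
    require a newer macOS than the deployment target, and it must contain
    code for the requested architecture."""
    min_version, wheel_arch = parse_macos_tag(platform_tag)
    arch_ok = wheel_arch == arch or wheel_arch == "universal2"
    return arch_ok and min_version <= target


# Hand-written tags mirroring the cases discussed in the thread.
cases = [
    ("macosx_11_0_arm64", (10, 14)),       # newer numpy wheel, old target
    ("macosx_11_0_arm64", (11, 0)),        # newer numpy wheel, target 11
    ("macosx_10_9_universal2", (10, 14)),  # older numpy universal wheel
]
for tag, target in cases:
    print(tag, target, usable_for_target(tag, target, "arm64"))
```

Under this simplified model, only the first case is rejected, which matches the observation that the failure shows up once the resolved numpy no longer ships a wheel built for a 10_9 target.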

jorisvandenbossche avatar Oct 27 '22 07:10 jorisvandenbossche

Oh, I see.

pitrou avatar Oct 27 '22 07:10 pitrou

@github-actions crossbow submit -g wheel

raulcd avatar Oct 27 '22 08:10 raulcd