GH-48448: [Python] Implement Alphanumeric and Surrogate text in the random schema generator

Open HyukjinKwon opened this issue 2 weeks ago • 9 comments

Rationale for this change

To test various cases of field names:

https://github.com/apache/arrow/blob/6456944f5092dedb3f80d9bc80400e857d6571c7/python/pyarrow/tests/strategies.py#L49

It was introduced in https://github.com/apache/arrow/commit/9da458437162574f3e0d82e4a51dc6c1589b9f94

What changes are included in this PR?

This PR implements alphanumeric and surrogate text in the random schema generator.
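For readers unfamiliar with the two kinds of text, here is a minimal stdlib-only sketch of the distinction. It is not the code in this PR (which lives in the hypothesis-based `strategies.py`). The helper names `alphanumeric_text`, `surrogate_text`, and `is_utf8_encodable` are hypothetical, and it assumes "surrogate text" means strings made of code points in the UTF-16 surrogate range U+D800 to U+DFFF.

```python
import random
import string

# Hypothetical helpers for illustration only; not the PR's implementation.
ALPHANUMERIC = string.ascii_letters + string.digits
SURROGATE_FIRST, SURROGATE_LAST = 0xD800, 0xDFFF  # UTF-16 surrogate range


def alphanumeric_text(rng, max_size=10):
    """Return a non-empty random string of ASCII letters and digits."""
    size = rng.randint(1, max_size)
    return "".join(rng.choice(ALPHANUMERIC) for _ in range(size))


def surrogate_text(rng, max_size=10):
    """Return a non-empty random string made only of surrogate code points."""
    size = rng.randint(1, max_size)
    return "".join(
        chr(rng.randint(SURROGATE_FIRST, SURROGATE_LAST)) for _ in range(size)
    )


def is_utf8_encodable(text):
    """Report whether the string can be encoded as strict UTF-8."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


rng = random.Random(0)  # fixed seed so the run is repeatable
plain_name = alphanumeric_text(rng)
surrogate_name = surrogate_text(rng)

print(repr(plain_name), is_utf8_encodable(plain_name))
print(repr(surrogate_name), is_utf8_encodable(surrogate_name))
```

The alphanumeric name always encodes cleanly, while the surrogate one is rejected by Python's strict UTF-8 codec, which is why such strings make useful edge cases for field names.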

Are these changes tested?

Yes, I tested them via:

PYARROW_TEST_HYPOTHESIS=1 pytest -xvs pyarrow/tests/test_strategies.py::test_fields --hypothesis-show-statistics

Are there any user-facing changes?

No, test-only.

  • GitHub Issue: #48448

HyukjinKwon avatar Dec 11 '25 03:12 HyukjinKwon

:warning: GitHub issue #48448 has been automatically assigned in GitHub to PR creator.

github-actions[bot] avatar Dec 11 '25 03:12 github-actions[bot]

@github-actions crossbow submit -g python

AlenkaF avatar Dec 12 '25 09:12 AlenkaF

(just for a bit of context, I am fixing hypothesis at https://github.com/apache/arrow/pull/48460 as it's broken, and @raulcd guided me to work on this at https://github.com/apache/arrow/pull/48443#pullrequestreview-3566500139 👍)

HyukjinKwon avatar Dec 12 '25 09:12 HyukjinKwon

Revision: de8e93942819a7d33f60b268728345a5e93b6c4d

Submitted crossbow builds: ursacomputing/crossbow @ actions-15cc39ef2c

Task Status
example-python-minimal-build-fedora-conda GitHub Actions
example-python-minimal-build-ubuntu-venv GitHub Actions
test-conda-python-3.10 GitHub Actions
test-conda-python-3.10-hdfs-2.9.2 GitHub Actions
test-conda-python-3.10-hdfs-3.2.1 GitHub Actions
test-conda-python-3.10-pandas-1.3.4-numpy-1.21.2 GitHub Actions
test-conda-python-3.11 GitHub Actions
test-conda-python-3.11-dask-latest GitHub Actions
test-conda-python-3.11-dask-upstream_devel GitHub Actions
test-conda-python-3.11-hypothesis GitHub Actions
test-conda-python-3.11-pandas-latest-numpy-latest GitHub Actions
test-conda-python-3.11-spark-master GitHub Actions
test-conda-python-3.12 GitHub Actions
test-conda-python-3.12-cpython-debug GitHub Actions
test-conda-python-3.12-pandas-latest-numpy-1.26 GitHub Actions
test-conda-python-3.12-pandas-latest-numpy-latest GitHub Actions
test-conda-python-3.13 GitHub Actions
test-conda-python-3.13-pandas-nightly-numpy-nightly GitHub Actions
test-conda-python-3.13-pandas-upstream_devel-numpy-nightly GitHub Actions
test-conda-python-3.14 GitHub Actions
test-conda-python-emscripten GitHub Actions
test-cuda-python-ubuntu-22.04-cuda-11.7.1 GitHub Actions
test-cuda-python-ubuntu-24.04-cuda-13.0.2 GitHub Actions
test-debian-12-python-3-amd64 GitHub Actions
test-debian-12-python-3-i386 GitHub Actions
test-fedora-42-python-3 GitHub Actions
test-ubuntu-22.04-python-3 GitHub Actions
test-ubuntu-22.04-python-313-freethreading GitHub Actions
test-ubuntu-24.04-python-3 GitHub Actions

github-actions[bot] avatar Dec 12 '25 09:12 github-actions[bot]

Oh yes! Forgot about that, thanks for the info 👍

AlenkaF avatar Dec 12 '25 09:12 AlenkaF

@github-actions crossbow submit -g python

HyukjinKwon avatar Dec 12 '25 22:12 HyukjinKwon

Revision: 1a13b4870c159076cba6059f7c87bce56447dbe4

Submitted crossbow builds: ursacomputing/crossbow @ actions-12f77fab36

Task Status
example-python-minimal-build-fedora-conda GitHub Actions
example-python-minimal-build-ubuntu-venv GitHub Actions
test-conda-python-3.10 GitHub Actions
test-conda-python-3.10-hdfs-2.9.2 GitHub Actions
test-conda-python-3.10-hdfs-3.2.1 GitHub Actions
test-conda-python-3.10-pandas-1.3.4-numpy-1.21.2 GitHub Actions
test-conda-python-3.11 GitHub Actions
test-conda-python-3.11-dask-latest GitHub Actions
test-conda-python-3.11-dask-upstream_devel GitHub Actions
test-conda-python-3.11-hypothesis GitHub Actions
test-conda-python-3.11-pandas-latest-numpy-latest GitHub Actions
test-conda-python-3.11-spark-master GitHub Actions
test-conda-python-3.12 GitHub Actions
test-conda-python-3.12-cpython-debug GitHub Actions
test-conda-python-3.12-pandas-latest-numpy-1.26 GitHub Actions
test-conda-python-3.12-pandas-latest-numpy-latest GitHub Actions
test-conda-python-3.13 GitHub Actions
test-conda-python-3.13-pandas-nightly-numpy-nightly GitHub Actions
test-conda-python-3.13-pandas-upstream_devel-numpy-nightly GitHub Actions
test-conda-python-3.14 GitHub Actions
test-conda-python-emscripten GitHub Actions
test-cuda-python-ubuntu-22.04-cuda-11.7.1 GitHub Actions
test-cuda-python-ubuntu-24.04-cuda-13.0.2 GitHub Actions
test-debian-12-python-3-amd64 GitHub Actions
test-debian-12-python-3-i386 GitHub Actions
test-fedora-42-python-3 GitHub Actions
test-ubuntu-22.04-python-3 GitHub Actions
test-ubuntu-22.04-python-313-freethreading GitHub Actions
test-ubuntu-24.04-python-3 GitHub Actions

github-actions[bot] avatar Dec 12 '25 22:12 github-actions[bot]

Rebased to retrigger the tests. https://github.com/apache/arrow/pull/48449#issuecomment-3648425163 is still valid.

test-conda-python-3.11-hypothesis passed

HyukjinKwon avatar Dec 12 '25 22:12 HyukjinKwon

Seems like:

tests/test_extension_type.py .................                           [ 40%]
Fatal Python error: Segmentation fault

Current thread 0x0000000203059040 (most recent call first):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pyarrow/tests/test_fs.py", line 1224 in test_s3_options
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/python.py", line 166 in pytest_pyfunc_call
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_callers.py", line 121 in _multicall
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_hooks.py", line 512 in __call__
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/python.py", line 1720 in runtest
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/runner.py", line 179 in pytest_runtest_call
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_callers.py", line 121 in _multicall
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_hooks.py", line 512 in __call__
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/runner.py", line 245 in <lambda>
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/runner.py", line 353 in from_call
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/runner.py", line 244 in call_and_report
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/runner.py", line 137 in runtestprotocol
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/runner.py", line 118 in pytest_runtest_protocol
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_callers.py", line 121 in _multicall
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_hooks.py", line 512 in __call__
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/main.py", line 396 in pytest_runtestloop
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_callers.py", line 121 in _multicall
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_hooks.py", line 512 in __call__
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/main.py", line 372 in _main
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/main.py", line 318 in wrap_session
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/main.py", line 365 in pytest_cmdline_main
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_callers.py", line 121 in _multicall
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pluggy/_hooks.py", line 512 in __call__
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/config/__init__.py", line 199 in main
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/config/__init__.py", line 223 in console_main
  File "/Users/runner/hostedtoolcache/Python/3.11.9/arm64/bin/pytest", line 7 in <module>
tests/test_fs.py ....sssx.xsss....sssx.xssss

The macOS failure is happening globally. I retriggered the tests, but the issue persists. I'll leave it as is for now; it should not be related to my change in any case.

HyukjinKwon avatar Dec 12 '25 23:12 HyukjinKwon

Yes, the macOS failures are not related. The CUDA extended build also looks strange, but I don't think it is related either.

Thanks for all the work @HyukjinKwon! Will wait for another pair of 👀 @kszucs in case you have time for one quick strategies review.

AlenkaF avatar Dec 16 '25 07:12 AlenkaF