
[Python] Missing test cases in all_array_types

Open deanm0000 opened this issue 1 year ago • 6 comments

Describe the enhancement requested

I think the float and signed-int entries should include negative values. Also, is there a reason to exclude pa.decimal128 and pa.decimal256?

In reference to https://github.com/apache/arrow/blob/c557fe51b3763b9492392f48c5ebcae5a1dd0b42/python/pyarrow/tests/test_compute.py#L72-L91

Component(s)

Python

deanm0000 avatar Sep 25 '24 20:09 deanm0000

take

Isaac7777-cpu avatar Oct 09 '24 02:10 Isaac7777-cpu

I am new to open source project contribution. I will try my best to fix this issue as part of my course.

Isaac7777-cpu avatar Oct 12 '24 23:10 Isaac7777-cpu

I have seen that the test script consistently refers to some functions of compute.py, such as sum and list_slice. However, I cannot find them in compute.py. Am I missing anything?

Isaac7777-cpu avatar Oct 13 '24 10:10 Isaac7777-cpu

They're dynamically loaded from the C++ library. See this function: https://github.com/apache/arrow/blob/8a7224d21fb7ac1938cb039cc6dcfd38db94519e/python/pyarrow/compute.py#L307

deanm0000 avatar Oct 13 '24 14:10 deanm0000

I see! I have been following the first PR contribution tutorial. However, running the tests in my local repository still fails with an import error:

pushd arrow/python
python -m pytest pyarrow
popd
~/.../arrow/python ~/.../arrow_project
ImportError while loading conftest '/Users/.../arrow/python/pyarrow/conftest.py'.
pyarrow/__init__.py:65: in <module>
    import pyarrow.lib as _lib
E   ImportError: dlopen(/.../arrow/python/pyarrow/lib.cpython-310-darwin.so, 0x0002): symbol not found in flat namespace '__ZN3re212re2_internal5ParseINSt3__117basic_string_viewIcNS2_11char_traitsIcEEEEEEbPKcmPT_'
~/Codes/Uni/COMP2120/arrow_project

Isaac7777-cpu avatar Oct 14 '24 02:10 Isaac7777-cpu

Also, I would like to ask how I should incorporate pa.decimal128 and pa.decimal256. Would the following be enough?

all_array_types = [
    ('bool', [True, False, False, True, True]),
    ('uint8', range(5)),
    ('int8', range(-4, 5)),
    ('uint16', range(5)),
    ('int16', range(-4, 5)),
    ('uint32', range(5)),
    ('int32', range(-4, 5)),
    ('uint64', range(5, 10)),
    ('int64', range(-9, 10)),
    ('float', [0, 0.1, 0.2, 0.3, 0.4, -0.1, -0.2, -0.3, -0.4]),
    ('double', [0, 0.1, 0.2, 0.3, 0.4, -0.1, -0.2, -0.3, -0.4]),
    ('string', ['a', 'b', None, 'ddd', 'ee']),
    ('binary', [b'a', b'b', b'c', b'ddd', b'ee']),
    (pa.binary(3), [b'abc', b'bcd', b'cde', b'def', b'efg']),
    (pa.list_(pa.int8()), [[-1, 2], [-3, 4], [5, 6], None, [9, 16]]),
    (pa.large_list(pa.int16()), [[1], [-2, -3, 4], [5, 6], None, [9, 16]]),
    (pa.struct([('a', pa.int8()), ('b', pa.int8())]), [
        {'a': 1, 'b': 2}, None, {'a': -3, 'b': -4}, None, {'a': 5, 'b': 6}]),
    (pa.decimal128(0.01), [-0.1, 0, 0.3]),
    (pa.decimal256(0.01), [-0.1, 0, 0.3]),
]

Isaac7777-cpu avatar Oct 14 '24 03:10 Isaac7777-cpu

This issue has been marked as stale because it has had no activity in the past 365 days. Please remove the stale label or comment below, or this issue will be closed in 14 days. If this improvement is still desired but has no current owner, please add the 'Status: needs champion' label.

thisisnic avatar Nov 18 '25 12:11 thisisnic