pandas-stubs
Implement ExtensionArray _accumulate and _reduce
Describe the bug
The stubs for ExtensionArray (in pandas-stubs/core/arrays/base.pyi) do not provide type signatures for _accumulate and _reduce. To properly add typing information to the Pint-Pandas project, these need to be defined.
To Reproduce
- Minimal Runnable Example:
import numpy as np
import pandas as pd
from typing import reveal_type
from pandas.arrays import IntegerArray
from pandas.api.extensions import ExtensionArray
_data: ExtensionArray = IntegerArray(values=np.array([1, 2, 3], dtype=int), mask=np.array([True, True, True], dtype=bool))
if isinstance(_data, ExtensionArray):
    reveal_type(_data)
    reveal_type(_data._accumulate)
    reveal_type(_data._reduce)
- Using mypy
- Show the error message received from that type checker while checking your example.
(pint-dev) % pre-commit run mypy --files foo.py
mypy.....................................................................Failed
- hook id: mypy
- duration: 1.41s
- exit code: 1
foo.py:9: note: Revealed type is "pandas.core.arrays.base.ExtensionArray"
foo.py:10: error: "ExtensionArray" has no attribute "_accumulate" [attr-defined]
foo.py:10: note: Revealed type is "Any"
foo.py:11: error: "ExtensionArray" has no attribute "_reduce" [attr-defined]
foo.py:11: note: Revealed type is "Any"
Found 2 errors in 1 file (checked 1 source file)
Note that running the script in Python works, because it uses the actual pandas code, not pandas-stubs:
(pint-dev) % python foo.py
Runtime type is 'IntegerArray'
Runtime type is 'method'
Runtime type is 'method'
Please complete the following information:
- OS: Mac OS
- OS Version 14.1.2
- python 3.11.4
- mypy 1.8.0
- version of installed pandas-stubs: 2.1.4.231227
Additional context
While they look very much private, they are documented: https://pandas.pydata.org/docs/reference/api/pandas.api.extensions.ExtensionArray._accumulate.html https://pandas.pydata.org/docs/reference/api/pandas.api.extensions.ExtensionArray._reduce.html and could therefore probably be added to pandas-stubs? @Dr-Irv
Agreed. PR with tests welcome
I'm glad to see it's a simple case, but alas, it's just beyond my level of Python and mypy type algebra.
I cannot see any ExtensionArray-specific test. @Dr-Irv, can you advise on where such tests should be located?
I would add something to test_extension.py, but you can just add a test that asserts the types of _reduce() and _accumulate() to be Callable with appropriate arguments and return types.
I have added the following in core/arrays/base.pyi:
def _reduce(self, name: str, *, skipna: bool = ..., keepdims: bool = ..., **kwargs) -> Scalar: ...
def _accumulate(self, name: str, *, skipna: bool = ..., **kwargs) -> Self: ...
But now I am facing issues with the tests:
- I am struggling to use assert_type on a Callable with multiple arguments (including optional ones and kwargs). I cannot find examples where this is tested. mypy and pyright seem to handle the arguments in slightly different ways.
  - mypy:
    error: Expression is of type "Callable[[str, DefaultNamedArg(bool, 'skipna'), DefaultNamedArg(bool, 'keepdims'), KwArg(Any)], str | bytes | date | datetime | timedelta | datetime64 | timedelta64 | bool | int | float | Timestamp | Timedelta | complex]", not "Callable[[], str | bytes | date | datetime | timedelta | datetime64 | timedelta64 | bool | int | float | Timestamp | Timedelta | complex]" [assert-type]
- pyright also complains about https://github.com/pandas-dev/pandas-stubs/blob/feebd4707e594fd1ba7d30fd54e38885872d5ddd/tests/extension/decimal/array.py#L248, where _reduce is also defined for a subclass of ExtensionArray.
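One possible workaround for the Callable difficulty described above, offered only as a sketch and not something adopted in this thread, is to check assignability against a callback Protocol rather than asserting an exact Callable type. The names ReduceMethod and FakeArray below are hypothetical stand-ins, and the concrete defaults (True/False) are assumptions made so the snippet runs without pandas.

```python
from __future__ import annotations

from typing import Any, Protocol


class ReduceMethod(Protocol):
    """Hypothetical callback protocol mirroring the proposed _reduce signature."""

    def __call__(
        self, name: str, *, skipna: bool = True, keepdims: bool = False, **kwargs: Any
    ) -> object: ...


class FakeArray:
    """Hypothetical stand-in for an ExtensionArray subclass (not pandas code)."""

    def _reduce(
        self, name: str, *, skipna: bool = True, keepdims: bool = False, **kwargs: Any
    ) -> object:
        # Echo the arguments back so the call can be inspected at runtime.
        return (name, skipna, keepdims)


# A type checker validates this assignment by signature compatibility,
# which sidesteps spelling out keyword-only arguments inside Callable[...].
reduce_method: ReduceMethod = FakeArray()._reduce
result = reduce_method("sum", skipna=False)
```

This checks that the method is compatible with the protocol, which is a weaker guarantee than assert_type's exact match, so it may or may not be acceptable for the stubs' test suite.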
Not sure about the good first issue tag 😃
I had another recent case dealing with Callable with odd arguments, and based on what I've learned, I think it will be hard to do the assert_type().
I'm fine if we don't include a test for this, and just add the declarations for the 2 functions.
As for the _reduce() issue with pyright: for extension arrays, the _reduce() operation could return an object of the dtype of the extension array, which could be anything, so use this instead:
def _reduce(self, name: str, *, skipna: bool = ..., keepdims: bool = ..., **kwargs) -> object: ...
You may have to change tests/extension/decimal/array.py to return decimal.Decimal for _reduce() in there.
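A minimal sketch of why the object return type helps, using hypothetical stand-in classes (BaseArray, DecimalLikeArray) rather than the real pandas or test-suite code: a subclass may narrow the base's return type from object to decimal.Decimal without a type checker reporting an incompatible override. The concrete defaults and the "sum" implementation are assumptions for illustration.

```python
from __future__ import annotations

import decimal
from typing import Any


class BaseArray:
    """Hypothetical stand-in for ExtensionArray; not the pandas class."""

    def _reduce(
        self, name: str, *, skipna: bool = True, keepdims: bool = False, **kwargs: Any
    ) -> object:
        raise NotImplementedError


class DecimalLikeArray(BaseArray):
    """Hypothetical stand-in for the decimal array used in the tests."""

    def __init__(self, values: list[decimal.Decimal]) -> None:
        self._values = values

    # Narrowing the return type from object to Decimal is a valid override,
    # because Decimal is a subtype of object.
    def _reduce(
        self, name: str, *, skipna: bool = True, keepdims: bool = False, **kwargs: Any
    ) -> decimal.Decimal:
        if name == "sum":
            return sum(self._values, decimal.Decimal(0))
        raise NotImplementedError(f"reduction {name!r} is not implemented in this sketch")


arr = DecimalLikeArray([decimal.Decimal("1.5"), decimal.Decimal("2.5")])
total = arr._reduce("sum")
```

Had the base method been annotated with a narrower union such as Scalar, a subclass returning a type outside that union would be flagged as an incompatible override, which appears to be what pyright reported here.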
Agree this is not a good first issue any more, but I think you can do it!