pandas-stubs
Mypy complains about inconsistent MRO when using `isinstance()` on a generic `Series` object
To Reproduce
import pandas as pd
x: pd.Series[int] = pd.Series([1, 3, 4], name="my series")
assert isinstance(x, pd.Series) # this line produces the error below
mypy gives the following error:
repro.py:4: error: Subclass of "Series[int]" and "TimestampSeries" cannot exist: would have inconsistent method resolution order [unreachable]
pyright does not complain. Nor does mypy complain if I drop the `[int]` from `pd.Series`.
Please complete the following information:
- OS: Linux
- OS Version: Ubuntu 22.04
- python version: 3.10.4
- version of type checker: mypy 0.971
- version of installed `pandas-stubs`: 1.4.3.220829
I don't think we can fix this. The code you wrote won't execute because pandas doesn't treat `Series` as generic. So trying to fix a type-checking issue for code that won't execute doesn't seem like a good use of time.
We're using the generic form of `Series` to be able to limit certain operations (e.g., adding two series consisting of timestamps). But you can't declare something to be of type `Series[sometype]` due to differences between pandas and the stubs.
I have code like that in my code base and it executes perfectly fine:
Python 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> def f():
...     x: pd.Series[int] = pd.Series([1, 3, 4], name="my series")
...     assert isinstance(x, pd.Series)
...     return x
...
>>> f()
0    1
1    3
2    4
Name: my series, dtype: int64
Variable annotations aren't executed within functions so this definitely works.
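To illustrate the point about annotations inside functions, here is a minimal sketch that does not need pandas. `PlainSeries` is a hypothetical stand-in for any class that cannot be subscripted at runtime; it is not part of pandas or pandas-stubs. Annotations on local variables are not evaluated (PEP 526), so the function runs, whereas evaluating the same subscript as an expression fails.

```python
class PlainSeries:
    """Hypothetical stand-in for a class that is not subscriptable at runtime."""


def annotated_local() -> PlainSeries:
    # Annotations on local variables are never evaluated,
    # so the subscript below is harmless when the function runs.
    x: PlainSeries[int] = PlainSeries()
    return x


def evaluated_subscript() -> str:
    # Evaluating the same subscript as an ordinary expression does fail.
    try:
        PlainSeries[int]
    except TypeError as exc:
        return type(exc).__name__
    return "no error"


result = annotated_local()
outcome = evaluated_subscript()
```

Whether a module-level annotation is evaluated depends on the Python version and on `from __future__ import annotations`, which is why the sketch evaluates the subscript explicitly.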
Here is a reproducer without the type annotation:
import pandas as pd
x = pd.Series({0: 0, 1: 1}, name="my series", dtype=int)
assert isinstance(x, pd.Series)
mypy:
% mypy repro.py
repro.py:4: error: Subclass of "Series[int]" and "TimestampSeries" cannot exist: would have inconsistent method resolution order [unreachable]
Found 1 error in 1 file (checked 1 source file)
Thanks for the latter example. We'll look into it.
@thomkeh I tried this with pandas-stubs 1.4.4.220919 and cannot reproduce the latter example. Not sure what we have changed since then, so can you try with that version and see if it still happens?
Also, double check your mypy version - I was using 0.971 (which you said above)
I checked again and noticed that I can only reproduce it with the `--warn-unreachable` flag:
mypy --warn-unreachable pandas_test.py
I just tried it with mypy 0.981 and pandas-stubs 1.5.0.220926.
I am guessing this is some kind of `mypy` bug. We test with `warn-unreachable=False`, because when using `mypy` on the pandas source, it produces some false positives.
It's pretty odd that it would report an error in your code related to what is inside the stubs.
@thomkeh did you find any resolution for this issue?
I am also having the same problem.
No idea if the issue is within mypy or pandas-stubs, but I agree that my code executes fine, and it type-checks with no issues if I just use a `Series` rather than a `Series[float]` in my case, so long as I have a `type: ignore[type-arg]`.
I also found that something like:
result = input.loc[x] if isinstance(input, Series) else input
results in an `Item "float" of "Union[float, Series[float]]" has no attribute "loc"` error, which it shouldn't, given that the code only runs if `input` is a Series.
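For reference, the narrowing pattern in question can be sketched without pandas. `Box`, its `loc` method, and `pick` are hypothetical names made up for this sketch and only mimic the shape of the code above; the sketch shows the runtime behaviour and makes no claim about what mypy reports for it.

```python
from typing import Generic, TypeVar, Union

T = TypeVar("T")


class Box(Generic[T]):
    """Hypothetical stand-in for a generic container such as Series."""

    def __init__(self, items: dict[int, T]) -> None:
        self.items = items

    def loc(self, key: int) -> T:
        # Look up one element by key.
        return self.items[key]


def pick(value: Union[float, Box[float]], key: int) -> float:
    # isinstance() against the unparameterized class is the runtime check;
    # the first branch only runs when value really is a Box.
    return value.loc(key) if isinstance(value, Box) else value


from_box = pick(Box({0: 1.5}), 0)
from_float = pick(2.0, 0)
```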
@jonyscathe I don't have a solution. My next step was going to be to open an issue in the mypy repository but I wanted to have a smaller reproducing code snippet first.
I found a standalone reproducer:
from __future__ import annotations
import datetime
from typing import Any, Generic, TypeVar, Union, overload
S1 = TypeVar("S1", int, datetime.datetime)
class Series(Generic[S1]):
    @overload
    def __new__(cls, data: datetime.datetime) -> TimestampSeries: ...
    @overload
    def __new__(cls, data: dict[int, S1]) -> Series[S1]: ...
    def __new__(cls, data: Union[datetime.datetime, dict[int, S1]]) -> Any:
        return
class TimestampSeries(Series[datetime.datetime]): ...
x = Series({0: 0, 1: 1})
reveal_type(x)
assert isinstance(x, Series)
With `mypy --warn-unreachable`:
minimal_mro_problem.py:20: note: Revealed type is "minimal_mro_problem.Series[builtins.int]"
minimal_mro_problem.py:21: error: Subclass of "Series[int]" and "TimestampSeries" cannot exist: would have inconsistent method resolution order [unreachable]
I reported it here: https://github.com/python/mypy/issues/13824
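As background on the wording of the error: "inconsistent method resolution order" is also something Python itself reports when a class lists a parent before that parent's own subclass. The sketch below shows this at runtime with two hypothetical classes, `Base` and `Child`, which stand in for `Series` and `TimestampSeries`. It only illustrates the rule the message refers to; it does not explain why mypy considers such a subclass for the `isinstance()` check in the first place.

```python
class Base:
    """Hypothetical stand-in for a parent class such as Series."""


class Child(Base):
    """Hypothetical stand-in for a subclass such as TimestampSeries."""


def can_combine(*bases: type) -> bool:
    # Try to create a class with the given bases, in the given order.
    try:
        type("Combined", bases, {})
    except TypeError:
        # Python could not linearize the bases into a consistent MRO.
        return False
    return True


parent_first = can_combine(Base, Child)  # parent listed before its own subclass
child_first = can_combine(Child, Base)   # subclass listed first
```

Here `parent_first` is `False` and `child_first` is `True`: only the ordering that puts the subclass ahead of its parent can be linearized.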
Bug still exists in mypy 0.990
This seems to work now with mypy 1.4.1 and the latest pandas-stubs :)
Hmm, I can still reproduce:
tmke8@ubuntu:~$ pip install git+https://github.com/pandas-dev/pandas-stubs.git
Collecting git+https://github.com/pandas-dev/pandas-stubs.git
Cloning https://github.com/pandas-dev/pandas-stubs.git to /tmp/pip-req-build-fb3nmjwl
Running command git clone --filter=blob:none --quiet https://github.com/pandas-dev/pandas-stubs.git /tmp/pip-req-build-fb3nmjwl
Resolved https://github.com/pandas-dev/pandas-stubs.git to commit fbec52bbff022384bd30bd69dcda776c22d19729
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: types-pytz>=2022.1.1 in /home/tmk/.cache/pypoetry/virtualenvs/ethicml-dzQunYke-py3.10/lib/python3.10/site-packages (from pandas-stubs==2.0.2.230605) (2022.1.2)
Requirement already satisfied: numpy>=1.25.0 in /home/tmk/.cache/pypoetry/virtualenvs/ethicml-dzQunYke-py3.10/lib/python3.10/site-packages (from pandas-stubs==2.0.2.230605) (1.25.2)
Building wheels for collected packages: pandas-stubs
Building wheel for pandas-stubs (pyproject.toml) ... done
Created wheel for pandas-stubs: filename=pandas_stubs-2.0.2.230605-py3-none-any.whl size=151715 sha256=43e3e6baddaa211ea09637e8eedfb60fe57324b41f0b16cbf0a0108715618276
Stored in directory: /tmp/pip-ephem-wheel-cache-suw3twnj/wheels/88/ba/da/a34e583c952d4fc1cf67b3763fc7c19b34a58ad569ab1aa6e6
Successfully built pandas-stubs
Installing collected packages: pandas-stubs
Successfully installed pandas-stubs-2.0.2.230605
tmke8@ubuntu:~$ cat stub_bug.py
import pandas as pd
x = pd.Series({0: 0, 1: 1}, name="my series", dtype=int)
assert isinstance(x, pd.Series)
tmke8@ubuntu:~$ mypy --version
mypy 1.4.1 (compiled: yes)
tmke8@ubuntu:~$ mypy --warn-unreachable stub_bug.py
stub_bug.py:4: error: Subclass of "Series[int]" and "TimestampSeries" cannot exist: would have inconsistent method resolution order [unreachable]
Found 1 error in 1 file (checked 1 source file)
Or does `pip install git+https://github.com/pandas-dev/pandas-stubs.git` not give me the most recent version?
You are right - I tested it without `--warn-unreachable`.