polars
polars copied to clipboard
fix(python): use explicit type-arg for PythonDataType
Bumped into this issue trying to use pl.from_dicts
. Type checkers in strict modes seem to not like SchemaDefinition
-> PythonDataType
using list
and tuple
without explicit args. I believe passing Any
is essentially the default behavior. This just makes that explicit.
Edit
Switched to using typing.List
and typing.Tuple
for Python 3.7 compatible. This file also already contains a List
class used for a different purpose.
Switched to using
typing.List
andtyping.Tuple
for Python 3.7 compatible. This file also already contains aList
class used for a different purpose.
This shouldn't be necessary, given that from __future__ import annotations
is present?
This shouldn't be necessary, given that
from __future__ import annotations
is present?
I tried at first, but CI failed with mypy erroring:
polars/datatypes.py:82: error: "list" is not subscriptable [misc]
polars/datatypes.py:83: error: "tuple" is not subscriptable [misc]
https://github.com/pola-rs/polars/actions/runs/4021457727/jobs/6910343984
I'm having trouble reproducing your issue. Could you provide a minimum working example where the type checker fails?
I'm having trouble reproducing your issue. Could you provide a minimum working example where the type checker fails?
Ah, I should I mentioned I'm using pyright not mypy.
# pyright: strict
import polars as pl
pl.from_dict({"a": [1, 2], "b": [3, 4]})
$ pyright example.py
example.py:3:1 - error: Type of "from_dict" is partially unknown
Type of "from_dict" is "(data: Mapping[str, Sequence[object] | Mapping[str, Sequence[object]] | Series], schema: Sequence[str] | Mapping[str, DataTypeClass | DataType | Type[int] | Type[float] | Type[bool] | Type[str] | Type[date] | Type[time] | Type[datetime] | Type[timedelta] | Type[list[Unknown]] | Type[tuple[Unknown, ...]] | Type[bytes] | Type[Decimal]] | Sequence[str | Tuple[str, DataTypeClass | DataType | Type[int] | Type[float] | Type[bool] | Type[str] | Type[date] | Type[time] | Type[datetime] | Type[timedelta] | Type[list[Unknown]] | Type[tuple[Unknown, ...]] | Type[bytes] | Type[Decimal] | None]] | None = None, *, schema_overrides: Mapping[str, DataTypeClass | DataType] | None = None) -> DataFrame" (reportUnknownMemberType)
1 error, 0 warnings, 0 informations
More isolated example:
# pyright: strict
import polars.datatypes
polars.datatypes.PythonDataType
$ pyright example.py
example.py:4:1 - error: Type of "PythonDataType" is partially unknown
Type of "PythonDataType" is "Type[Type[int]] | Type[Type[float]] | Type[Type[bool]] | Type[Type[str]] | Type[Type[date]] | Type[Type[time]] | Type[Type[datetime]] | Type[Type[timedelta]] | Type[Type[list[Unknown]]] | Type[Type[tuple[Unknown, ...]]] | Type[Type[bytes]] | Type[Type[Decimal]]" (reportUnknownMemberType)
Sorry to leave this hanging for a while. I had to verify any false positives/negatives this could reproduce, as I knew I had looked at this before.
Since the type is object, I thought it might simply allow all objects now, but that is not the case:
Nested: TypeAlias = Union[
Type[ListType[Any]],
Type[Tuple[Any, ...]],
]
type_map: dict[Nested, str] = {
list: "List!",
tuple: "Tuple!",
dict: "Dict!", # error: Dict entry 2 has incompatible type "Type[int]": "str"; expected "Union[Type[List[Any]], Type[Tuple[Any, ...]]]": "str" [dict-item]
}
So this solves a type: ignore
comment and apparently pleases the pyright
type checker. That's good enough for me, let's merge this!
Thanks @josh !
EDIT: As an aside:
Switched to using
typing.List
andtyping.Tuple
for Python 3.7 compatible. This file also already contains aList
class used for a different purpose.This shouldn't be necessary, given that
from __future__ import annotations
is present?
Type aliases are not affected in the same way as actual type annotations on functions/methods by the future import. I couldn't tell you the details, but I know I've bumped into it before. I believe you can work around this by using strings, but I believe this is more readable in this case.