typing icon indicating copy to clipboard operation
typing copied to clipboard

`Counter.__init__` can't be typed safely

Open beauxq opened this issue 1 year ago • 8 comments

Counter overloads are unsafe

from collections import Counter

# no type errors reported and nothing tricky going on
a = {"a": "hello"}
b = Counter(a)
c = b.get("a", 0)
print(c + 2)  # crash

and I think there isn't a way to express them safely in the current type system. Iterable[_T] needs to be something like Iterable[_T] - Mapping[_T, ~int]

If type checkers want to address this now, it would probably have to be a special case.

Without having to worry about type intersections/differences/negations, maybe we could have a way to indicate: "If this overload matches (after all previous overloads fail), a type error should be reported."

something like:

    @overload
    def __init__(self, mapping: Mapping[_T, Any], /) -> typing.Error: ...

beauxq avatar Aug 23 '24 14:08 beauxq

Perhaps this is because in general, dicts are not type safe?

Does it cause a performance hit to make a StrictCounter subclass that can only have integer values?

In my opinion, the goal of Python typing is not to type everything. It's a tool, to help you avoid bug prone dynamic or none type-safe code.

Therefore by not easily typing this example (without some hack or other), the type system is working as intended. By only letting this code pass typechecking -- by not adding any type hints at all to it -- the type system has highlighted to you the author, a part of your code that is prone to bugs - assigning arbitrary key values to a Counter. Perhaps this would be accomplished more smoothly and more easily understandable, via something like a version of the Coverage tool for testing, but for type hints.

It seems that historically, it was decided to make Counter a dict subclass. So Counter must support adding arbitrary none-counter like values for arbitrary hashable keys.

If you disagree then please could you expand on what the intention is behind adding non-integer values to a key/value item in a Counter? And if there is a good use case, what need is there to make such a dynamic use case type safe (or a use case that's only superficially, loosely type-able - e.g. making eveyrthing Any)?

JamesParrott avatar Aug 23 '24 17:08 JamesParrott

By only letting this code pass typechecking -- by not adding any type hints at all to it

Code does not need type annotations in order to be fully and completely typed. There is no Any or unknown typing in this example.

You can add type annotations and it doesn't change anything:

a: dict[str, str] = {"a": "hello"}
b: Counter[str] = Counter(a)
c: int = b.get("a", 0)
print(c + 2)  # crash

The type system does not allow adding strings to the values of a Counter. You can add a line b["z"] = "hello" and the type checker will report an error.

beauxq avatar Aug 23 '24 18:08 beauxq

Well OK, if there's already special code in type checkers to support Counter.get, a correct description of that needs to have the same return type as dict.get. Else it must account for Counter.__init__ and all previous calls to mutating methods inherited from dict (Counter.__setitem__ et al.).

If there's a desire to implement that by people that want to, then good for them. Have at it,

Otherwise it's reasonable to conclude this crash is due to Python's dynamic typing, and can't be checked for without simply trying it at run time. Python and its type system provides users that want to initialise a Counter[str] from a dict[str,str] with plenty of other tools that could fix this for themselves, e.g. making their own better-typed interface layer.

JamesParrott avatar Aug 23 '24 19:08 JamesParrott

If there's a desire to implement that by people that want to, then good for them. Have at it,

I don't know what you're suggesting that anyone implement here. Counter.get already matches dict.get and it already accounts for Counter.__init__ and all calls to mutating methods.

The problem is that the typeshed typing for Counter.__init__ is wrong. And that's not a problem with typeshed, because correct typing isn't possible. That's why it's a problem with python/typing, and not python/typeshed.

beauxq avatar Aug 23 '24 20:08 beauxq

@beauxq can i try working on this issue?

Maryam3107 avatar Apr 08 '25 12:04 Maryam3107

@beauxq can i try working on this issue?

Anyone can try working on it. I'm not sure why you're asking me. But I am curious to hear what you have in mind.

beauxq avatar Apr 08 '25 14:04 beauxq

Counter overloads are unsafe

from collections import Counter

no type errors reported and nothing tricky going on

a = {"a": "hello"} b = Counter(a) c = b.get("a", 0) print(c + 2) # crash and I think there isn't a way to express them safely in the current type system. Iterable[_T] needs to be something like Iterable[_T] - Mapping[_T, ~int]

If type checkers want to address this now, it would probably have to be a special case.

Without having to worry about type intersections/differences/negations, maybe we could have a way to indicate: "If this overload matches (after all previous overloads fail), a type error should be reported."

something like:

@overload
def __init__(self, mapping: Mapping[_T, Any], /) -> typing.Error: ...

@beauxq what's about adding a new decorator or syntax so that it should report a type error if it matches?

Maryam3107 avatar Apr 08 '25 14:04 Maryam3107

@JamesParrott @tony @beauxq @jstasiak @djsutherland Hi! what would be the project size and preferred topics?

Maryam3107 avatar Apr 08 '25 16:04 Maryam3107