typeshed
typeshed copied to clipboard
Should the `dict.get` overload with default value not be generic?
When using dict.get with a default value, it is expected that the key may not be found. Most of the time it's because we're not dealing with literal types, so we can't ensure that a key is part of the dict. But other times, it's we explicitely use a value that has a type we know won't be in range (often None).
See the example below (it is of course minimal and abstracted from a real-world example, but should still show the use):
code_events = {
0: "ZERO_EVENT",
1: "ONE_EVENT",
# ...
127: "LAST_EVENT",
}
def parse_foo_code(self, code: int | None):
code = code and (code & 0x7f) # Some bitwise operation to skip bit 8, this just shows we can't use -1 as default. And who knows what other keys could be used in the dict
event = code_events.get(code, "SPECIAL_DEFAULT_FOO_EVENT")
# ... do more with the event
def parse_bar_code(self, code: int | None):
code = code and (code & 0x7f) # Some bitwise operation to skip bit 8
event = code_events.get(code, "SPECIAL_DEFAULT_BAR_EVENT")
# ... do more with the event
Currently it has to be written as such (without a cast). Which adds redundancy checks and no type-safety:
def parse_foo_code(self, code: int | None):
code = code and (code & 0x7f) # Some bitwise operation to skip bit 8
if isinstance(code, int): # You could imagine this could be more complex than just int. Maybe it can't even be easily expressed
event = code_events.get(code, "SPECIAL_DEFAULT_FOO_EVENT")
else:
event = "SPECIAL_DEFAULT_FOO_EVENT"
# ... do more with the event
What if this definition in dict:
@overload
def get(self, __key: _KT) -> _VT | None: ...
@overload
def get(self, __key: _KT, __default: _VT | _T) -> _VT | _T: ...
was changed to:
@overload
def get(self, __key: _KT) -> _VT | None: ...
@overload
def get(self, __key: object, __default: _VT | _T) -> _VT | _T: ... # Either object, or Any if it causes variance issue.
What are your thoughts? Any reason this would be a bad idea? From what I can see there are legit sensible use-cases, it stays type-safe, and allows for more concise code. Or just use a cast and call it a day?
Another pyright user here and +1 on the proposal.
I would like it for the first overload (no default argument) as well. I've run into cases where the generic type causes trouble even when I am not passing a default argument.
It also seems more consistent. I don't see why it would be correct to do this in the second overload but not the first.
I don't remember a case where this generic type helped find a bug. But it has been inconvenient. I'll post back if I find one!
I think a good middle ground to try first would be to change the __key in what is not the second and third overload to _KT | None. This should cover many typical cases without any significant type safety cost. That said, using object could possibly work, too. Exploratory PRs would be welcome to judge the impact.
I like the _KT | None idea, and IMO it is much better than using object. If you do dict.get(totally_wrong_type, "fallback"), that really should be an error. That is kinda the point of type checking: if any parameter is of the wrong type, you get an error :)
Exploratory PRs would be welcome to judge the impact.
I don't think we can just rely on mypy_primer. Most of the open-source code that mypy_primer looks at is not buggy, so it won't contain many instances of dict.get(totally_wrong_type, "fallback"), which is the thing we care about. That said, mypy_primer will tell whether object is significantly more convenient than _KT | None.
I ran into this today. My code is basically this:
messages: dict[str, str] = {"foo": "bar"}
def get_message(key: str | None) -> str | None:
return messages.get(key)
print(get_message("foo")) # bar
print(get_message("not found")) # None
print(get_message(None)) # None
Works great at runtime, does not type-check.