typeshed icon indicating copy to clipboard operation
typeshed copied to clipboard

Should the `dict.get` overload with default value not be generic?

Open Avasam opened this issue 3 years ago • 3 comments
trafficstars

When using dict.get with a default value, it is expected that the key may not be found. Most of the time it's because we're not dealing with literal types, so we can't ensure that a key is part of the dict. But other times, it's we explicitely use a value that has a type we know won't be in range (often None).

See the example below (it is of course minimal and abstracted from a real-world example, but should still show the use):

code_events = {
  0: "ZERO_EVENT",
  1: "ONE_EVENT",
  # ...
  127: "LAST_EVENT",
}

def parse_foo_code(self, code: int | None):
  code  = code and (code & 0x7f)  # Some bitwise operation to skip bit 8, this just shows we can't use -1 as default. And who knows what other keys could be used in the dict
  event = code_events.get(code, "SPECIAL_DEFAULT_FOO_EVENT")
  # ... do more with the event
  
def parse_bar_code(self, code: int | None):
  code  = code and (code & 0x7f)  # Some bitwise operation to skip bit 8
  event = code_events.get(code, "SPECIAL_DEFAULT_BAR_EVENT")
  # ... do more with the event

Currently it has to be written as such (without a cast). Which adds redundancy checks and no type-safety:

def parse_foo_code(self, code: int | None):
  code  = code and (code & 0x7f)  # Some bitwise operation to skip bit 8
  if isinstance(code, int):  # You could imagine this could be more complex than just int. Maybe it can't even be easily expressed
    event = code_events.get(code, "SPECIAL_DEFAULT_FOO_EVENT")
  else:
    event = "SPECIAL_DEFAULT_FOO_EVENT"
  # ... do more with the event

What if this definition in dict:

@overload
def get(self, __key: _KT) -> _VT | None: ...
@overload
def get(self, __key: _KT, __default: _VT | _T) -> _VT | _T: ...

was changed to:

@overload
def get(self, __key: _KT) -> _VT | None: ...
@overload
def get(self, __key: object, __default: _VT | _T) -> _VT | _T: ...  # Either object,  or Any if it causes variance issue.

What are your thoughts? Any reason this would be a bad idea? From what I can see there are legit sensible use-cases, it stays type-safe, and allows for more concise code. Or just use a cast and call it a day?

Avasam avatar Nov 10 '22 22:11 Avasam

Another pyright user here and +1 on the proposal.

I would like it for the first overload (no default argument) as well. I've run into cases where the generic type causes trouble even when I am not passing a default argument.

It also seems more consistent. I don't see why it would be correct to do this in the second overload but not the first.

I don't remember a case where this generic type helped find a bug. But it has been inconvenient. I'll post back if I find one!

Jeremiah-England avatar Jul 05 '23 15:07 Jeremiah-England

I think a good middle ground to try first would be to change the __key in what is not the second and third overload to _KT | None. This should cover many typical cases without any significant type safety cost. That said, using object could possibly work, too. Exploratory PRs would be welcome to judge the impact.

srittau avatar Sep 02 '24 20:09 srittau

I like the _KT | None idea, and IMO it is much better than using object. If you do dict.get(totally_wrong_type, "fallback"), that really should be an error. That is kinda the point of type checking: if any parameter is of the wrong type, you get an error :)

Exploratory PRs would be welcome to judge the impact.

I don't think we can just rely on mypy_primer. Most of the open-source code that mypy_primer looks at is not buggy, so it won't contain many instances of dict.get(totally_wrong_type, "fallback"), which is the thing we care about. That said, mypy_primer will tell whether object is significantly more convenient than _KT | None.

Akuli avatar Sep 05 '24 18:09 Akuli

I ran into this today. My code is basically this:

messages: dict[str, str] = {"foo": "bar"}

def get_message(key: str | None) -> str | None:
    return messages.get(key)

print(get_message("foo"))  # bar
print(get_message("not found"))  # None
print(get_message(None))  # None

Works great at runtime, does not type-check.

Akuli avatar Jul 25 '25 09:07 Akuli