mypy icon indicating copy to clipboard operation
mypy copied to clipboard

Add type inference for `dict.keys` membership

Open matthewhughes934 opened this issue 3 years ago • 11 comments

  • Add type inference for dict.keys membership

    Also for containership checks on typing.KeysView. This is to bring the following cases into alignment:

      from typing import KeysView
    
      key: str | None
      d: dict[str, int]
      kv: KeysView[str]
    
      if key in d:
          # type of 'key' is inferred to be 'str'
          ...
    
      if key in d.keys():
          # type of 'key' should be inferred to be 'str'
          ...
    
      if key in kv:
          # type of 'key' should be inferred to be 'str'
          ...
    

    Before this change only the first if would narrow the type of key.

    I've just added a test under test-data/unit/pythoneval.test as test-data/unit/fixtures/dict.pyi doesn't include dict.keys, and adding it requires importing dict_keys from _collections_abc in those stubs, which then requires adding _collections_abc.pyi stubs, which would have to be complete since e.g. testGenericAliasCollectionsABCReveal expects most of the types in those stubs to be defined (this is the same approach as 7678f28f65cf6e878cacd5ba663e6e54347653ae).

    GH: issue #13360

Test Plan

Test added under test-data/unit/check-isinstance.test

matthewhughes934 avatar Aug 09 '22 19:08 matthewhughes934

Diff from mypy_primer, showing the effect of this PR on open source code:

spark (https://github.com/apache/spark)
+ python/pyspark/sql/types.py:401: error: Redundant cast to "int"  [redundant-cast]
+ python/pyspark/sql/types.py:402: error: Redundant cast to "int"  [redundant-cast]

github-actions[bot] avatar Aug 09 '22 21:08 github-actions[bot]

Diff from mypy_primer, showing the effect of this PR on open source code:

spark (https://github.com/apache/spark)
+ python/pyspark/sql/types.py:401: error: Redundant cast to "int"  [redundant-cast]
+ python/pyspark/sql/types.py:402: error: Redundant cast to "int"  [redundant-cast]

github-actions[bot] avatar Aug 09 '22 22:08 github-actions[bot]

With 5fd5a93 I've moved the tests under test-data/unit/pythoneval.test just to verify they would work under a full set of stubs.

I've (very roughly) documented some of the issues I ran into while trying to add stubs for these into the test libraries, but I think I'd need to add some complete (at least sufficient to satisfy the tests in mypy/test/testcheck.py::TypeCheckSuite::check-generic-alias.test::testGenericAliasCollectionsABCReveal) stubs for _collections_abc in order to have everything under the test stubs.

I assume having test stubs for dict.keys would be the preference? Happy to give it a shot but I might need to be pointed in the right direction :slightly_smiling_face:

EDIT: aaand I forgot to commit some changes, see 68b2881

matthewhughes934 avatar Aug 13 '22 16:08 matthewhughes934

Diff from mypy_primer, showing the effect of this PR on open source code:

spark (https://github.com/apache/spark)
+ python/pyspark/sql/types.py:401: error: Redundant cast to "int"  [redundant-cast]
+ python/pyspark/sql/types.py:402: error: Redundant cast to "int"  [redundant-cast]

github-actions[bot] avatar Aug 13 '22 19:08 github-actions[bot]

Diff from mypy_primer, showing the effect of this PR on open source code:

spark (https://github.com/apache/spark)
+ python/pyspark/sql/types.py:401: error: Redundant cast to "int"  [redundant-cast]
+ python/pyspark/sql/types.py:402: error: Redundant cast to "int"  [redundant-cast]

github-actions[bot] avatar Aug 13 '22 22:08 github-actions[bot]

01efd35 contains the work for _collections_abc.dict_keys and typing.KeysView.

For dict.items I think I'll need to look a bit deeper, since collections.ItemsView inherits AbstractSet[tuple[_KT_co, _VT_co]] but I don't think mypy knows how to handle narrowing when tuple unpacking? Running mypy over the following on `master:

from typing import Dict, Optional, Set, Tuple

d: Dict[str, int]
s: Set[Tuple[str, int]]

v: Optional[int]
k: Optional[str]
p: Optional[Tuple[str, int]]

if (k, v) in d.items():
    reveal_type(k)
    reveal_type(v)

if (k, v) in s:
    reveal_type(k)
    reveal_type(v)

if p in s:
    reveal_type(p)

gives:

script.py:11: note: Revealed type is "Union[builtins.str, None]"
script.py:12: note: Revealed type is "Union[builtins.int, None]"
script.py:15: note: Revealed type is "Union[builtins.str, None]"
script.py:16: note: Revealed type is "Union[builtins.int, None]"
script.py:19: note: Revealed type is "Tuple[builtins.str, builtins.int]"

Where I'd expect none of those to be optional.

matthewhughes934 avatar Aug 16 '22 17:08 matthewhughes934

Diff from mypy_primer, showing the effect of this PR on open source code:

spark (https://github.com/apache/spark)
+ python/pyspark/sql/types.py:401: error: Redundant cast to "int"  [redundant-cast]
+ python/pyspark/sql/types.py:402: error: Redundant cast to "int"  [redundant-cast]

github-actions[bot] avatar Aug 16 '22 19:08 github-actions[bot]

@sobolevn any objections to keeping this changeset just for dict.keys? I'm thinking it might be worth having a separate discussion for .items and .values (see my comment above)

matthewhughes934 avatar Aug 23 '22 16:08 matthewhughes934

I'd be OK with supporting only .keys(); narrowing on .items() and .values() doesn't make nearly as much sense to me. I don't see any mention of those in the issue either.

JelleZijlstra avatar Aug 23 '22 16:08 JelleZijlstra

+1 to @JelleZijlstra 🙂

sobolevn avatar Aug 23 '22 16:08 sobolevn

Diff from mypy_primer, showing the effect of this PR on open source code:

spark (https://github.com/apache/spark)
+ python/pyspark/sql/types.py:401: error: Redundant cast to "int"  [redundant-cast]
+ python/pyspark/sql/types.py:402: error: Redundant cast to "int"  [redundant-cast]

github-actions[bot] avatar Aug 23 '22 17:08 github-actions[bot]