pytype icon indicating copy to clipboard operation
pytype copied to clipboard

Instance attribute container type information is wrong

Open goncalopp opened this issue 3 years ago • 2 comments

The type information for instance attributes seem to be wrong/lost/not stored in some circumstances. Here's an illustrative example, run with pytype tmp.py --check-attribute-types --check-container-types --check-variable-types

from typing import List

class Foo:
    def __init__(self):
        l1 : List[str] = []
        self.l2 : List[str] = []
        self.abc : str = "a"
        l1.append(1)
        self.l2.append(1)
        print(self.abc + 1)

Expectation: Errors raised on l1.append(1), self.l2.append(1) and print(self.abc + 1) lines.

Actual result: Errors raised on l1.append(1), and print(self.abc + 1) lines. self.l2.append(1) does not raise an error.

Output:

File "/home/user/tmp.py", line 8, in __init__: New container type for l1 does not match type annotation [container-type-mismatch]
  Annotation: List[str] (type parameters List[_T])
  Contained types:
    _T: str
  New contained types:
    _T: Union[int, str]
  
File "/home/user/tmp.py", line 10, in __init__: unsupported operand type(s) for +: 'str' and 'int' [unsupported-operands]
  Function __add__ on str expects str

For more details, see https://google.github.io/pytype/errors.html.
ninja: build stopped: subcommand failed.
Leaving directory '/home/user/.pytype'

Generated pyi:

class Foo:
    abc: str
    l2: List[int]
    def __init__(self) -> None: ...

Note how l2 has type List[int] instead of List[str]

Version: master (35fb32dc6e4c388bd57f01bfee9625fdca826be0)

Verbose output of pytype-single tmp.py --check-attribute-types --check-container-types --check-variable-types -v 3: https://pastebin.com/5kiiadbT

Interesting bit: INFO:pytype.function Mutating _T to Union[_T, _T2]. This results in Added error to log: container-type-mismatch in the l1 case, but not l2 (as far as I can tell)

goncalopp avatar Oct 15 '20 15:10 goncalopp

I've isolated this a bit more and got lower-level logging. The difference between l1 and l2 append calls is this:

l1:

abstract._log_args Arg 0
abstract._log_args   <list [ParameterizedClass(cls=PyTDClass(list) params={'_T': PyTDClass(str)})]> [var 480]
abstract._log_args     429:
abstract._log_args       <str [PyTDClass(str)]> [var 429]
abstract._log_args Arg 1
abstract._log_args   <int '1'> [var 481]
abstract._log_args     490:
abstract._log_args       <int [PyTDClass(int)]> [var 490]
abstract.match_args args in view: [<list [ParameterizedClass(cls=PyTDClass(list) params={'_T': PyTDClass(str)})]>, <int '1'>]
function.substitute_formal_args Matched arguments against sig(self: List[_T], object: _T2) -> None:
    self = List[Union[_T, _T2]]
function.substitute_formal_args param 0) self: GenericType(ClassType(__builtin__.list), (TypeParameter('_T', (), None, '__builtin__.list'),)) <=> <bind
ing of variable 480 to data 139741516363848>
function.substitute_formal_args param 1) object: TypeParameter('_T2', (), None, '__builtin__.list.append') <=> <binding of variable 481 to data 1397415
17389512>
function.substitute_formal_args Using __builtin__.list.append._T2=<Variable v494: 1 choices> [<int [PyTDClass(int)]>]
function.substitute_formal_args Using typing.Sequence._T=<Variable v493: 1 choices> [<str [PyTDClass(str)]>]
function._get_mutation Mutating _T to Union[_T, _T2]
analyze.trace_call Logging call to <binding of variable 473 to data 139741517375112> with 2 args, return <Variable v495: 1 choices>
errors._add Added error to log: container-type-mismatch
<error output here>
abstract_utils.apply_mutations Applying mutations

l2:

abstract._log_args Arg 0
abstract._log_args   <list '[]'> [var 503]
abstract._log_args Arg 1
abstract._log_args   <int '1'> [var 504]
abstract._log_args     490:
abstract._log_args       <int [PyTDClass(int)]> [var 490]
abstract.match_args args in view: [<list '[]'>, <int '1'>]
function.substitute_formal_args Matched arguments against sig(self: List[_T], object: _T2) -> None:
    self = List[Union[_T, _T2]]
function.substitute_formal_args param 0) self: GenericType(ClassType(__builtin__.list), (TypeParameter('_T', (), None, '__builtin__.list'),)) <=> <binding of variable 503 to data 139741517288264>
function.substitute_formal_args param 1) object: TypeParameter('_T2', (), None, '__builtin__.list.append') <=> <binding of variable 504 to data 139741517389512>
function.substitute_formal_args Using __builtin__.list.append._T2=<Variable v508: 1 choices> [<int [PyTDClass(int)]>]
function.substitute_formal_args Using typing.Sequence._T=<Variable v507: 0 choices> []
function._get_mutation Mutating _T to Union[_T, _T2]
analyze.trace_call Logging call to <binding of variable 500 to data 139741516595336> with 2 args, return <Variable v509: 1 choices>
abstract_utils.apply_mutations Applying mutations
abstract.merge_instance_type_parameter Modifying type param __builtin__.list._T
abstract_utils.apply_mutations Applied 1 mutations

That makes it sound like in the l2 case, pytype lost the List[str] information when evaluating the append call.

Going back to the variable declarations, we see this l1:

vm.log_opcode  >  | index: 0, '__init__', module: tmp2 line: 5
vm.log_opcode  >  | data_stack: ()
vm.log_opcode  >  | block_stack: ()
vm.log_opcode  >  | node: <15>__init__
vm.log_opcode  >  ## 5: 0: BUILD_LIST 0
abstract.merge_instance_type_parameter Modifying type param __builtin__.list._T
abstract._load_instance_type_parameters Initializing type param typing.Sequence._T: <Variable v385: 0 choices>
vm.log_opcode  >  | index: 1, '__init__', module: tmp2 line: 5
vm.log_opcode  >  | data_stack: (<Variable v414: 1 choices>,)
vm.log_opcode  >  | block_stack: ()
vm.log_opcode  >  | node: <16>5
vm.log_opcode  >  ## 5: 1: STORE_FAST l1  # type: List[str]
abstract_utils.eval_expr Evaluating expr: 'List[str]'
errors.checkpoint Checkpointing errorlog at 0 errors
vm.make_frame make_frame: callargs=None, f_globals=[LazyConcreteDict@7ff89ae9ff48], f_locals=[LazyConcreteDict@7f
f89afd1bc8]

l2:

vm.log_opcode  >  | index: 2, '__init__', module: tmp2 line: 6
vm.log_opcode  >  | data_stack: ()
vm.log_opcode  >  | block_stack: ()
vm.log_opcode  >  | node: <20>5
vm.log_opcode  >  ## 6: 2: BUILD_LIST 0
abstract.merge_instance_type_parameter Modifying type param __builtin__.list._T
abstract._load_instance_type_parameters Initializing type param typing.Sequence._T: <Variable v451: 0 choices>
vm.log_opcode  >  | index: 3, '__init__', module: tmp2 line: 6
vm.log_opcode  >  | data_stack: (<Variable v466: 1 choices>,)
vm.log_opcode  >  | block_stack: ()
vm.log_opcode  >  | node: <21>6
vm.log_opcode  >  ## 6: 3: LOAD_FAST self
vm.log_opcode  >  | index: 4, '__init__', module: tmp2 line: 6
vm.log_opcode  >  | data_stack: (<Variable v466: 1 choices>, <Variable v467: 1 choices>)
vm.log_opcode  >  | block_stack: ()
vm.log_opcode  >  | node: <21>6
vm.log_opcode  >  ## 6: 4: STORE_ATTR l2
attribute._set_member Setting l2 to the 1 values in <Variable v466: 1 choices>

Note the presence of Evaluating expr: 'List[str]' in l1 but not in l2.

pytype.pyc.opcodes.Opcode.annotation is None in the second case. This should be set in pytype.blocks, and isn't for l2. I can't yet tell who is supposed to set it

goncalopp avatar Oct 15 '20 16:10 goncalopp

Thanks for the report! Summary for myself to help with debugging: even with --check-attribute-types and --check-container-types both enabled, the List[str] annotation here does not prevent incorrect modifications of the contained type and is not preserved in the pyi:

from typing import List

class Foo:
    def __init__(self):
        self.l2 : List[str] = []
        self.l2.append(1)

rchen152 avatar Oct 15 '20 18:10 rchen152