Pyrefly cannot infer obvious subtyping opportunities
Describe the Bug
When authoring code, I find myself extremely annoyed that pyrefly cannot understand what I mean in cases like the linked sandbox
Sandbox Link
https://pyrefly.org/sandbox/?code=MYGwhgzhAEBCkFMBcAoa7oAdIRS0O0AwgBYCWIAJgIwAU8ECAlKhljngVMeVQEz1ELNBmxRO4bqQqUAzIMbC2Y3CjAAnddAC80ANrSqdJgBoeMgUwC6azQDowmTAgB2lWobm0mTIA
(Only applicable for extension issues) IDE Information
No response
This is an interesting one, since the typechecker's inferred type is trying to guess the user's intent and different checkers make different tradeoffs here:
Pyrefly pins the type of the list based on the first time you put something into it (in this case, it's initialized with some values so it's pinned immediately). When we insert a value with an incompatible type, we can either modify the list's type to account for the new value, or raise an error.
Right now, we do the latter, since the former is tricky to get right - if this was indeed a mistake by the user, the type error will be emitted where the list is used, and they would need to track down the place they inserted the incompatible value.
Mypy doesn't infer unions in these situations - instead, it infers a parent class between the two initial members to get list[Base]. In some situations this is undesirable: for example, if your list contains two unrelated types Mypy will infer list[object] that you can insert anything into.
Pyright and Ty infer list[Unknown] which means you can also put anything into it, and anything you read from it will be Any.
I'm curious as to which model best captures your intent for this situation.
I think the right algorithm might be that given two types in a mutable container (A and B), if they have a non-object common supertype, we use that as the inferred type. It seems to be a better default than A | B. CC @stroxler as we discussed this at PyCon.
I'm curious as to which model best captures your intent for this situation.
I'm most interested in modifying the type over time as things are inserted into the container. I want these examples to work correctly:
https://pyrefly.org/sandbox/?code=GYJw9gtgBALgngBwJYDsDmUkQWEMoCGAzkQKZ4D68CpAULQSCFALxQDaAug0wHQEIaKACYAKAIwBKBiXIwqiUqMYgANFAA2SIjHaoYnaT2Zt249QCZ1AZm4r+g0iIm8pMspWpKV6rTr0o+AA+UMAaYAQGRgDGGsREUABCxKQAXLRQmVAI8fSx8VAAwgAWSBpiyWSS6VnZucasHJVKknZ8AkJiJWViktLxcgo0yky+2rrNhvQqjezd5aKS6vO9bSAOnaLNi-2ynoojaprj7JNGM6Yri+rbrcYbThUpO+6DXodj-mdAA
if this was indeed a mistake by the user, the type error will be emitted where the list is used, and they would need to track down the place they inserted the incompatible value.
I'd argue that inserting types that aren't compatible with the previously-inserted types is acceptable because it doesn't matter if the contents change over time until they are read. In other words, the type error should be where the error is actually a problem, rather than merely a change in the contents of the type. As a concrete example I want the error to be on the for loop rather than the .append() here:
https://pyrefly.org/sandbox/?code=GYexAIF5wbQRgDTgEQENkF0BQoQDpUAHQgUwDsATACgDkQySBKLLGYcAKnACZxQAnPuACWZPmAxA
(phrased a third way, I want the typechecker to prevent a program from crashing at runtime. Inserting mixed contents into a container doesn't ever crash, so an error there is not useful to me)
Yeah, I see where you're coming from here, where perhaps we're being too eager in telling the user "we think this is where the actual error is".
I'm worried that modifying the type as we go & emitting errors only when the type is used can lead to errors that are difficult to debug.
Given something like this:
x = [1, 2, 3]
# lots of code
takes_int_list(x)
If there's a x.append("foo") somewhere in the "lots of code" part, we'd get an error on takes_int_list(x), but the user would be left on their own to try and hunt down the problematic append (just like how if takes_int_list(x) crashes at runtime the user would have to read through a ton of code and figure out where the bad data was inserted).
I wonder if we could build a clever system that automatically finds "where was the type refined to be something that's not compatible" to display that in the error message (I believe flow will show where a value's type was last refined in the IDE, so a fancier version of that).
But I also think some of the usability issues you raised could be addressed by using join instead of meet in certain cases based on a heuristic, and also by handling None better. Those fixes should be easier to do than refining the type of something on every mutation - I'm not quite sure how we would do that.
If there's a x.append("foo") somewhere in the "lots of code" part, we'd get an error on takes_int_list(x), but the user would be left on their own to try and hunt down the problematic append (just like how if takes_int_list(x) crashes at runtime the user would have to read through a ton of code and figure out where the bad data was inserted).
This matches the behavior of the Hack and Flow typecheckers. The user-remedy there is to just annotate the type with what you expect at the declaration site of the variable. Then, the typechecker will enforce the type is what you expect and the error is immediately obvious!
Moving this from an "always have to do this to fix the type" to "only do this when needed to clarify expected usage for the typechecker" helps velocity.
I also think, having written a lot of this sort of code that if there is that much code between declaration and usage-site that's probably smelly code anyway :P