Typeshed path override breaks stdlib types
Describe the Bug
Repro:
- In the root of your project, run clone typeshed
git clone https://github.com/python/typeshed.git - Make a pyrefly.toml with the following contents
typeshed-path="./typeshed" - Make a file with the following contents
x: int = 1 - Observe the following error:
Literal[1]` is not assignable to `int
I also confirmed that typeshed-path="./typeshed/stdlib" does not work - we fall back to the bundled typeshed and do not pick up any added types if I modify the typeshed source.
Sandbox Link
No response
(Only applicable for extension issues) IDE Information
No response
IIRC the system currently is never designed for supporting a custom typeshed and @ndmitchell was pretty skeptical about this feature. Can you elaborate what the intended use case for this is?
I utilized Pyrefly to check for missing stubs in Typeshed against the CPython documentation. https://github.com/python/typeshed/pull/14938#issuecomment-3464070195
@grievejia @yangdanny97 Would be great if you could weight in the type checking side.
I spent a little time looking into this and here is what I think is happening. When we use a custom typeshed path this may lead to 2 different definitions of a class. We consider 2 Classes to be equal only if the follow are the same:
- Same ClassDefIndex
- Same ModuleName
- Same ModulePath
However, when we load the int we may be loading definitions from both the bundled typeshed as well as from the custom typeshed path provided.
When type checking x: int = 1:
1. **Annotation side**: Pyrefly resolves `int` using the **custom typeshed** → `Class(def_index=X, module="builtins", path=FileSystem(...))`
2. **Value side**: Pyrefly evaluates `1` as `Literal[1]` and calls `lit.general_class_type(stdlib)` to convert it to `int`
- This uses `stdlib.int()` which was initialized from the **bundled typeshed** → `Class(def_index=Y, module="builtins", path=BundledTypeshed(...))`
3. **Compatibility check**: Pyrefly checks if `Literal[1]` (which converts to bundled `int`) is assignable to annotation `int` (which is custom `int`)
- The check compares: `Class(path=BundledTypeshed(...))` vs `Class(path=FileSystem(...))` which are not equal.
We are using 2 different instances of the Int class which is resulting in an error. I think this should be easy enough to fix but it seems there is a question of this is something we to support which is a larger discussion.
I think we should try our best to support it if typeshed-path is set. What's the concern about supporting it?
If someone chooses to set a custom path that's invalid and causes a crash that's on them
FYI this looks a lot like an issue that I ran into a few months ago while trying to make it possible to manually put typeshed at the start of the search path. I worked around the problem by adding a PYREFLY_STDLIB_SEARCH_PATH environment variable: https://github.com/facebook/pyrefly/commit/a6d7f77ad4f5e4cf6ff93dee1477643816bcf15f.
Should we make this happen automatically if this config value is passed? Otherwise the typeshed-path config seems sort of redundant
@jvansch1
However, when we load the int we may be loading definitions from both the bundled typeshed as well as from the custom typeshed path provided.
My expectation is that a user-provided typeshed would take precedence over the bundled typeshed and we should not even try to load the bundled one, so there should be no conflict.
I'm not too familiar with the code, though, so maybe you can help me understand where my expectation breaks down?
It's been a while since I dug into this, but if memory serves, the problem was related to the Stdlib object always using the bundled typeshed: https://github.com/facebook/pyrefly/blob/808dd1bf067268b4c82baf35f9a1bffeebee64c2/crates/pyrefly_types/src/stdlib.rs#L30
It looks like the Stdlib code is already generic over a "loader" but we always use the BundledTypeshedStdlib loader here:
https://github.com/facebook/pyrefly/blob/808dd1bf067268b4c82baf35f9a1bffeebee64c2/pyrefly/lib/state/state.rs#L1107-L1120
So, hopefully the fix entails defining a loader for the user-provided typeshed instead and threading the config setting to this point so we can choose the right loader?
One thing that might be related is that we don't rewrite the typeshed path. I'll throw up a diff for doing that real quick
Edit: created #1741