Bundle popular types libraries with the extension
Pyrefly's IDE extension should ship with popular library type stubs.
Which stubs should we include? Ideas:
- Top stub libraries by PyPI installs: https://python-type-checking.com/ (check whether each one is actively maintained)
- Microsoft's: https://github.com/microsoft/python-type-stubs (some are getting moved into typeshed)
Type checking considerations:
- The actual source code is not present: we need a way to detect when a library is used but only its bundled stubs are available, and give a special error so the code doesn't pass type checking and then fail at runtime.
- How does the user know to include these in their project if they want types for the checker/CI?
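To make the first consideration concrete, here is a minimal sketch of how a "stubs present, library missing" case could be told apart from the other cases. The function name `classify_import`, the `bundled_stubs` set, and the three result labels are hypothetical and not part of Pyrefly. It also assumes the check runs against the current interpreter, whereas a real checker would look at the project's configured environment.

```python
import importlib.util


def classify_import(module_name: str, bundled_stubs: set[str]) -> str:
    """Classify an import for a hypothetical IDE diagnostic.

    Returns "ok" if the runtime package can be found, "stubs-only" if only
    bundled stubs would cover it, and "missing" otherwise.
    """
    # Only the top-level package decides whether the distribution is installed.
    top_level = module_name.split(".")[0]

    # find_spec returns None when no importable module with that name exists.
    if importlib.util.find_spec(top_level) is not None:
        return "ok"

    # Types would resolve from bundled stubs, but the import fails at runtime.
    if top_level in bundled_stubs:
        return "stubs-only"

    return "missing"
```

The "stubs-only" result is the case that would need the special error: types resolve, but nothing is installed to run.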
cc @yangdanny97
I have #572 open for typeshed bundled stubs too :)
How will pyrefly deal with version compatibility?
For example, scipy-stubs has a four-component version number, e.g. 1.16.2.3, where the first three components give the targeted scipy version and the last one is managed by scipy-stubs itself. Typeshed uses a similar version strategy, but has a date-based 4th component instead of an incremental one like scipy-stubs. But the overall pattern seems to be the same, where the stub package's version directly contains the version number of the targeted library.
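The version pattern described above can be sketched in a few lines. The helper names `split_stub_version` and `stubs_match_library` are made up for illustration, and exact matching on all three library components is an assumption. A real tool might only require the same minor series.

```python
def split_stub_version(stub_version: str) -> tuple[str, str]:
    """Split a four-component stub version into (library version, stub revision)."""
    parts = stub_version.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four components, got {stub_version!r}")
    return ".".join(parts[:3]), parts[3]


def stubs_match_library(stub_version: str, library_version: str) -> bool:
    """Check whether the stubs target exactly this library version (assumed policy)."""
    target, _revision = split_stub_version(stub_version)
    return library_version.split(".")[:3] == target.split(".")


# "1.16.2.3" targets scipy 1.16.2; the trailing "3" is the stubs' own revision.
print(split_stub_version("1.16.2.3"))
```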
So will pyrefly bundle a bunch of versions of each stub library? I can't think of any other way, but I can imagine that the pyrefly package could become pretty large this way. Thinking out loud: The source tarball of the latest scipy-stubs release is 355.4 kB, and there are 8 separate scipy versions for which there are releases, so that's several megabytes for one stub package already.
Also, how will users know where to report issues with the stubs? When they've installed the stubs themselves, it's probably more obvious to them.
- Microsofts: microsoft/python-type-stubs (some are getting moved into typeshed)
Take a good look at those stubs before you include them. Before I wrote scipy-stubs, that repository also bundled stubs for scipy, but those were outdated by 5 years, and fully generated by an even older version of pyright. In my experience those stubs were causing more issues than they solved.
But I don't know whether that is also the case for the other stubs, so they might be perfectly fine. It's just that I think it might be a good idea to check them before including them, that's all.
@jorenham, have there been any discussions of distributing the stubs with the scipy package so a separate install is not needed? As you point out, versioning becomes a big challenge when type stubs are distributed separately from libraries. Pushing this burden on type checker maintainers and users to deal with this versioning problem is less than ideal. Do you see downsides to packaging type information directly in scipy and other popular packages?
@jorenham, have there been any discussions of distributing the stubs with the scipy package so a separate install is not needed?
Yea a couple of times, in https://github.com/scipy/scipy/issues/21614 for example, and more recently in https://github.com/scipy/scipy-stubs/discussions/884, which summarized the most important reasons for having a separate scipy-stubs package. For the most part, those reasons should also be applicable to other large stubs-only packages like django-stubs.
As you point out, versioning becomes a big challenge when type stubs are distributed separately from libraries. Pushing this burden on type checker maintainers and users to deal with this versioning problem is less than ideal.
I know pyright indeed bundles some stub packages, but last time I checked, scipy-stubs wasn't one of those, right? So I'm not sure what this burden is that you're describing. For what it's worth, I never intended for scipy-stubs to become a burden for type-checkers in any way; quite the opposite actually.
Do you see downsides to packaging type information directly in scipy and other popular packages?
See https://github.com/scipy/scipy-stubs/discussions/884
I know pyright indeed bundles some stub packages, but last time I checked, scipy-stubs wasn't one of those, right?
That's correct. Pyright bundles typeshed (including both the stdlib stubs and third-party stubs in typeshed), but it doesn't bundle any other stubs packages.
For what it's worth, I never intended for scipy-stubs to become a burden for type-checkers in any way; quite the opposite actually.
Sorry, I didn't mean to imply that. I'm a huge fan of the work you've done to provide typing for scipy and other libraries! However, type stubs that are distributed separately from libraries create discoverability and versioning issues for users and tool vendors. As you point out above, the versioning problem is exacerbated by the fact that there is no standard for versioning (including the semantic meaning of each semver component). For these reasons, I encourage library maintainers to consider including type information directly in their package. As you point out in the linked issues, there are sometimes reasons why this isn't feasible or desirable.
Anyway, I didn't mean to start a discussion on this topic in the pyrefly issue tracker. I was just curious. Thanks for the answer.
As you point out above, the versioning problem is exacerbated by the fact that there is no standard for versioning (including the semantic meaning of each semver component).
Yea that's indeed a pretty tricky issue. The best we can do right now is to carefully specify the dependency restrictions in the pyproject.toml of the stub package, and then hope that the user's package manager will install the correct versions.
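As a rough illustration of what such a dependency restriction amounts to, the sketch below derives a version range from a stub version and checks an installed library version against it. The "at least this patch, same minor series" policy and the helper names are assumptions for the example. The constraints a given stub package actually declares may differ, and only plain numeric versions are handled here.

```python
def compatible_range(stub_version: str) -> str:
    """Build a requirement-style range string from a four-component stub version."""
    major, minor, patch, _revision = stub_version.split(".")
    # Assumed policy: at least the targeted patch, below the next minor series.
    return f">={major}.{minor}.{patch},<{major}.{int(minor) + 1}"


def satisfies(library_version: str, stub_version: str) -> bool:
    """Check an installed library version against the assumed range."""
    lower = tuple(int(p) for p in stub_version.split(".")[:3])
    upper = (lower[0], lower[1] + 1)
    installed = tuple(int(p) for p in library_version.split(".")[:3])
    # Tuples compare element by element, which matches numeric version ordering.
    return lower <= installed and installed[:2] < upper


print(compatible_range("1.16.2.3"))
```

A package manager resolves this kind of range automatically when the stubs are installed next to the library. A bundled copy inside an extension has no resolver, so the check would have to be done by the tool itself.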
For these reasons, I encourage library maintainers to consider including type information directly into their package.
Agreed! I believe that in the case of typeshed's stub packages, this is actively being worked on, which I think is pretty awesome.
As you point out in the linked issues, there are sometimes reasons why this isn't feasible or desirable.
Those reasons probably only apply to large libraries and frameworks like Django and SciPy. So I agree that for most libraries it's better to package it all together.
Great discussion all! Here is how I personally think about stubs for IDE and type checking. Our team is still discussing it, so I'm not speaking for the whole Pyrefly team yet.
Priorities
- Code navigation (go to def and autocomplete) should work well for as many users as possible.
- Stubs, when available (and high quality), should be made visible to the user so they can have accurate types and type checking. This means also getting the right stubs for the right package versions.
- Stubs should be shipped with the source code to reduce burden on typeshed maintainers, type checker maintainers and python developers.
Proposal
- IDE extensions should ship and include typeshed along with high-quality stubs (needs definition, but let's say a hand-curated list of popular stub packages). If no configuration is present, then we will use the bundled stubs that we choose to ship. We will try to always ship the latest version and not try to figure out the right version for installed packages.
- If a user has a configuration file present, we should use typeshed, but not extra stubs. Rather, prompt the user to install additional stub packages from the curated list and, as best we can, help point to the right version. This ensures type checking will be accurate, but also risks a degradation in code navigation. It's up to the extension maintainer to do their best to make visible what stubs are, why they are important and how to use them properly.
- Continue to work through the spreadsheet here and attempt to get package maintainers to include types from typeshed or third party stubs in the package itself: http://docs.google.com/spreadsheets/d/1s1W24KGMu0mrbXhHIcFKraAaJY5OOc2CjJvRvGx68d4/edit?gid=1884751978#gid=1884751978.
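The first two proposal items boil down to a small decision rule, sketched below. This is only an illustration of the proposal as written, not Pyrefly's implementation. `CURATED_STUBS`, `StubPlan`, and `plan_stubs` are hypothetical names, and the two entries in the curated mapping are placeholders taken from packages mentioned earlier in the thread.

```python
from dataclasses import dataclass

# Placeholder curated list: library import name -> stub package name.
CURATED_STUBS = {"scipy": "scipy-stubs", "django": "django-stubs"}


@dataclass(frozen=True)
class StubPlan:
    use_typeshed: bool
    bundled_stubs: tuple[str, ...]    # stubs served from the extension itself
    install_prompts: tuple[str, ...]  # stub packages to suggest to the user


def plan_stubs(
    has_config: bool, imported_libraries: set[str], installed_stubs: set[str]
) -> StubPlan:
    """Decide which stubs to use, following the two proposal cases."""
    if not has_config:
        # No configuration: use typeshed plus every bundled curated stub package.
        return StubPlan(True, tuple(sorted(CURATED_STUBS.values())), ())

    # Configuration present: typeshed only, and prompt for curated stubs that
    # cover imported libraries but are not installed in the project.
    prompts = sorted(
        stub
        for library, stub in CURATED_STUBS.items()
        if library in imported_libraries and stub not in installed_stubs
    )
    return StubPlan(True, (), tuple(prompts))
```

Keeping the prompt list separate from the bundled list mirrors the trade-off in the proposal: configured projects get accurate checking from what is actually installed, at the cost of weaker code navigation until the user installs the suggested stubs.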
Next, it would be good to curate the 3P stubs we want included with Pyrefly and make sure this issue is continually raised in the community to help encourage package maintainers to pull in stubs. FWIW there seems to be more interest in pulling these in lately unless the package is unmaintained.