typeshed
typeshed copied to clipboard
Run stubtest under different OS environments
This is an issue I'm having while adding D3DShot #8652 and PyAutoGUI #8654 stubs. Some modules are highly dependent on the OS, and as such, I have to add them to the stubtest_allowlist
, even though I can run it just fine locally.
For this reason, and to help with stub completeness (#8481), I suggest that stubtest should by default be run under Windows, Linux and Darwin (just like mypy and pyright already are).
This also means we should be able to specify the OS in stubtest_allowlist. Whether it's by having a specific file (stubtest_allowlist.linux.txt
), prefixing the line (linux:pyautogui._pyautogui_osx
) or any other proposals.
We already do this in CI for our stdlib stubs, and it works pretty well.
For most of our third-party stubs, it might be a bit overkill; but on the other hand, it probably wouldn't slow down CI that much to have three stubtest runs instead of one on a PR relating to third-party stubs.
I'm curious for @hauntsaninja's thoughts :)
I would honestly limit this to just select stubs that need it. We ran into CI limits in the past and we don't need to waste them on what is inconsequential for most stubs.
We ran into CI limits in the past and we don't need to waste them on what is inconsequential for most stubs.
It's an unrelated change, but we could consider doing something similar to https://github.com/python/mypy/pull/13249. That would reduce our risk of running into CI limits.
I don't think this is worth it for third party stubs. In practice, the only OS dependent things I remember in third party stubs are implementation details that we usually would avoid adding types for.
If the concern is fiddliness with the allowlists or local dev, consider using the empty regex trick, where if you do (entry)?
stubtest will not complain about that allowlist entry being unused.
@hauntsaninja The concern is more that for libraries with OS-specific support and modules, stubtest is pretty much useless in detecting missing or invalid types (because we have to exclude everything that doesn't work with the current CI configuration). This is especially prevalent with I/O (mouse, keyboard, displays*, ...) where the API changes wildly between OSes (so I'd say, it is wrong to assume that the only difference is implementation details).
I can fully understand concerns about CI limits however, especially when this scenario doesn't apply for most stubs. My current project just happens to use a handful of libraries that fall in this niche.
*of course displays would still be an issue, as the CI runs headless, but I think that's a different possible improvement/enhancement, if that's even feasable.
Perhaps we could run OS-specific stubtest only for specific packages where we know there are differences.
I just stumbled upon a new issue relating to this when working on pywin32 stubs: pytype and stubtest fail on CI with stubs that have no Linux distribution. Specifically,
pytype is a ton of pytype.load_pytd.BadDependencyError: Can't find pyi for '
[import]', referenced from '
[stub file]'
stubtest:
Testing pywin32...
Failed to install
ERROR: Could not find a version that satisfies the requirement pywin32==304.* (from versions: none)
ERROR: No matching distribution found for pywin32==304.*
pywin32... fail
Perhaps we could run OS-specific stubtest only for specific packages where we know there are differences.
Sounds reasonable to me. You could maybe even specify which OSes to run stubtests on. If unspecified then default to linux-only (the current behavior).
No idea what's causing the pytype errors :/ it'll be easier to take a look at those if you file the PR, then we can play around with it :)
For stubtest, for now you may need to just put skip = true
in the tool.stubtest
section of the METADATA.toml
file, if it's the case that it can't even be installed on linux.
Perhaps we could run OS-specific stubtest only for specific packages where we know there are differences.
Sounds reasonable to me. You could maybe even specify which OSes to run stubtests on. If unspecified then default to linux-only (the current behavior).
Yeah. It might be complicated, though.
Currently in our CI:
- Third-party stubtest is all run as one job run, on the same machine, in our
tests.yml
workflow (if at all -- it only runs for a certain stub if changes have been made to that stub). - Third-party stubtest is sharded into four runs, on four separate machines, in our
daily.yml
workflow.
We'd need to change that if we want to resolve this. One way to to do that might be to:
- Add an optional
os
field to thetool.stubtest
section of ourMETADATA.toml
files. This would default to"linux"
. - In CI, prior to running stubtest, iterate through each third-party distribution and create two lists: one list of the distributions that are to be run on linux (most of them), and another of the distributions that need to be run on Windows. We could use https://github.com/SebRollen/toml-action to help get the values from the
METADATA.toml
files. We could then use these lists to launch two stubtest jobs: one on linux, another on Windows.
@AlexWaygood I didn't feel confident posting the PR yet, I still had some work and improvements to do first (although I could post as WIP, but I'm really close to ready for review now). The pytype errors might have been something different because after a handful of updates (starting from combining bits of existing stubs out there, instead of from stubgen), it's no longer happening.
Here's the CI runs from my fork :
- started from stubgen: https://github.com/Avasam/typeshed/pull/3
- combining existing stubs: https://github.com/Avasam/typeshed/pull/4
I wouldn't waste your time investigating this until I make a proper PR and see if there's an actual issue left with pytype.
And skip=true
still works to get stubtests to pass. I'm just adding another reason why some stubs may have a reason to be run on windows.
Yeah. It might be complicated, though.
I don't doubt ^^" This doesn't feel like a low-hanging fruit.
There several thoughts from my side. If we allow to run stub packages on different OSes, we would need:
- [ ] Change how
apt_dependencies
work and probably add something for other platforms if needed - [ ] Change single
stubtest_allowlist.txt
to support several files: one for each OS and one for common ones (likestdlib
does) - [ ] Run OS specific jobs conditionally, because it will be a waste to run each package on all possible OSes. Many stubs do not care about the OS at all.
I think we can start incrementally by adding os
field to METADATA.toml
.
CPython already has conditional jobs, we can do it the same way: https://github.com/python/cpython/blob/0895c2a066c64c84cab0821886dfa66efc1bdc2f/.github/workflows/build.yml#L34-L61 For example, this code check whether or not we need to run tests. We can do the same for different OS jobs.
I will send a prototype PR for this today :)
We ran into CI limits in the past and we don't need to waste them on what is inconsequential for most stubs.
Btw, I can ask GitHub to give us more minutes if it is an issue. I think that they will, because this project is quite important.
Fixed in #8923 (thanks so much @sobolevn!).
Let's deal with any follow-up problems in separate issues/PRs, if and when we come across them.