typeshed
typeshed copied to clipboard
Allow non-types dependencies
This was previously discussed in #5618 and came up in #5765. I think the consensus is that we should allow this. This issue is mainly to track this.
Copying relevant discussion from #5618 for context:
Not speaking about stubs packages in general, but I think it's reasonable if typeshed packages depend on dependencies of the corresponding upstream package. I.e. if package
foo
depends oncryptography
, it's fine that ourfoo-types
depends oncryptography
as well, since it doesn't introduce "new" dependencies. Apart from that we should vet dependencies carefully.
I agree with srittau.
I think that it would be sufficient to maintain an allowlist of vetted dependencies. I'd expect that the number of such dependencies will be fairly small, so it shouldn't be much effort to maintain.
I'm running into this issue for tensorflow stubs. Most tensorflow type annotations depend on a couple basic numpy types. So at moment my options look like don't include numpy types and have most annotations be too strict or have Any almost everywhere.
What's needed to work on this task? I can see one issue is pyright check produces errors from dependency not being installed. Is the prerequisite being able to run checks for only changed stubs as mentioned here, https://github.com/python/typeshed/issues/5952
If I understand correctly, the next steps are:
- Add a check to
tests/check_consistent.py
that fails ifstubs/foo
depends on anything else than:- A
types-bar
package wherebar
is so thatstubs/bar
exists in typeshed - Or a dependency of
foo
on pypi that haspy.typed
- We can use pypi's API for this, although checking the presence of
py.typed
with just the API is tricky. Can we just assume thatpy.typed
exists, or do we need to potentially download the sdist to check? How about packages that don't provide sdist?
- We can use pypi's API for this, although checking the presence of
- A
- Change one dependency of one stubs-package to use upstream types
- Make a pull request
- Fix pull request until CI is green
- Before merging, run the third-party stubs check in
daily.yml
passes (this can be ran manually)
A potential problem is if packages foo
and bar
have dependencies that cannot be installed simultaneously. Then daily.yml
will not work at all. I think we should fix this later, in a separate issue and a separate PR, because users are already complaining about not being able to use upstream type hints, and this issue won't come up until we have many non-stubs dependencies in typeshed.
If we have a package foo
with dependency bar
and stubs in stubs/foo
, we probably want types-foo
to depend on bar
and not a specific version like bar==1.2.3
, so that you can upgrade to bar==1.2.4
without breaking your pip install -r requirements.txt -r requirements-dev.txt
(assuming bar==1.2.4
is in requirements.txt
and types-foo==x.y.z
is in requirements-dev.txt
).
Is this all there's to it, or am I missing something? I haven't followed this discussion in detail before, because it seemed to be complicated and also spread up across many places.
We need changes to mypy_test and the others, see e.g. https://github.com/python/typeshed/issues/5952#issuecomment-1046192164
We need some changes in stub_uploader as well, which has validation around this. And I think it's a good opportunity to do https://github.com/python/typeshed/issues/8312
I have the stub_uploader patch locally, but haven't PR-ed it because of some conflicts with https://github.com/typeshed-internal/stub_uploader/pull/57 which I was hoping would get merged
I'm mostly confused about "changes to mypy_test and the others" part.
- Why exactly is https://github.com/python/typeshed/issues/5952#issuecomment-1046192164 relevant now? It says "I personally wouldn't hate getting started on [this issue] with a global venv" which is what I'm suggesting.
- Is all this included in my "Fix pull request until CI is green" part, or is there something subtle that our CI wouldn't catch if forgotten?
Our CI wouldn't catch changes needed for stub_uploader. But yes, everything else is probably covered under "get CI green".
Here's code that checks if a package contains a py.typed: https://github.com/python/typeshed/blob/cc7d2567f2fbece72dfd42d233a5a1c48d56ea16/scripts/stubsabot.py#L141 But we probably don't need it; we should get errors in type checking tests if something doesn't have py.typed. People have previously suggested using an allowlist of packages (I believe for security reasons), which would also take care of that.
I personally don't see how duplicating the dependencies into a separate allowlist in the same repository improves security.
It'd be in stub_uploader. I don't have opinions on it, but mentioned it as a thing that has been previously discussed, since e.g. an allowlist would make validation like checking for py.typed not very important.
As I mentioned on the PR https://github.com/typeshed-internal/stub_uploader/pull/59 this introduces a security hole. Having an allowlist in stub_uploader
(to which intentionally only few maintainers have access) is a possible mitigation, but I want to first hear what other people (in particular @JukkaL and @JelleZijlstra) think about this.
this introduces a security hole
Could you expand on what the security risk is, exactly? Like @Akuli, I'm not sure I've ever seen it fully explained (probably because I'm somewhat new as a maintainer :-)
I agree with the consensus on this issue that this is a feature we really need at this point. The situation with types-cryptography
is becoming something of a farce. We've also got an increasing number of packages that would benefit from having numpy
as a dependency.
Arbitrary code can be executed during a Python package installation, so if someone adds a malicious package as a dependency, it can cause really big damage, as many of types-...
packages have very high install rates (like 100K+ per day).
Arbitrary code can be executed during a Python package installation
That's surely a security risk for any open-source package depending on another open-source package -- is the security hole here that users might not be expecting the installation of a stubs package to install, e.g., numpy
or whatever as a side effect?
The risk is that someone my add e.g. nmupy
as a dependency (whether intentionally or not), so typosquatting, that can't usually harm many people, will now get magnified to some cosmic scales, as you just need one typo to break hundreds thousands users.
And also yes, having this as a side effect of a stub package, will make it even harder to detect.
Got it, thanks!
This might be a silly suggestion, but couldn't those non-types dependencies somehow be specified as optional/extra dependencies that are installed in typeshed CI, but not by default for end-users of resulting stubs?
This might be a silly suggestion, but couldn't those non-types dependencies somehow be specified as optional/extra dependencies that are installed in typeshed CI, but not by default for end-users of resulting stubs?
That's what I'd expect as well.
This might be a silly suggestion, but couldn't those non-types dependencies somehow be specified as optional/extra dependencies that are installed in typeshed CI, but not by default for end-users of resulting stubs?
But those are usually dependencies of the stubs, i.e. the stubs won't work without the dependency. That's usually not a problem, since if package types-foo
depends on packages bar
, package foo
will usually also depend on bar
, but there are use cases where you want to use types-foo
standalone.
This might be a silly suggestion, but couldn't those non-types dependencies somehow be specified as optional/extra dependencies that are installed in typeshed CI, but not by default for end-users of resulting stubs?
But those are usually dependencies of the stubs, i.e. the stubs won't work without the dependency. That's usually not a problem, since if package
types-foo
depends on packagesbar
, packagefoo
will usually also depend onbar
, but there are use cases where you want to usetypes-foo
standalone.
I see, but in case you want to use types-foo
standalone, you could choose to specify it as types-foo[standalone]
(or whatever).
But those are usually dependencies of the stubs, i.e. the stubs won't work without the dependency. That's usually not a problem, since if package
types-foo
depends on packagesbar
, packagefoo
will usually also depend onbar
, but there are use cases where you want to usetypes-foo
standalone.
I'm sorry, I might be missing something, but wouldn't that be fine anyway?
My understanding is that there's a security concern about someone adding a malicious dependency in a types-*
package which would be automatically installed on thousands of users' machine. But if it's set up so that the dependency is installed on typeshed's CI when validating the types, but not as a dependency when uploaded to PyPI, wouldn't that mitigate that concern?
As for users potentially missing a dependency, as you mentioned,
if package types-foo depends on packages bar, package foo will usually also depend on bar
Even in those rare cases where foo
does not also depend on bar
(for example: optional dependencies), we wouldn't have had complete types anyway, and the user can just manually install the missing dependency. So we'd go from never having complete types for packages that need non-types-*
dependencies, to very rarely having to manually install a missing dependency. Which I only see as a positive.
And finally, as for using types-foo
standalone, I'm not entirely certain what you mean by "standalone" here. Like just installing types-foo
without it also installing its type dependencies? Wouldn't this be the case with this suggestion?
Users need to install the dependencies of stubs. This is not about typeshed CI.
If stub for great_numpy_tools
looks like:
def make_my_array_good(x: np.ndarray) -> whatever: ...
then users of this stub will not get correct type checking unless numpy is also installed in the same environment.
The only way to guarantee numpy is installed in the same environment (aka to guarantee users of the stub get correct type checking) is to make it a dependency of types-great-numpy-tools
add e.g. nmupy as a dependency (whether intentionally or not)
We could fail typeshed's CI and/or stub-uploader if types-foo
depends on something other than the dependencies of foo
on pypi. Then the only way to make types-foo
depend on nmupy
would be if foo
depends on nmupy
as well, and I generally expect types-foo
to have less downloads on pypi than foo
so at that point the damage has been already done.
There's also one special case: types-gdb
doesn't have a corresponding package on pypi, and we shouldn't implicitly trust it if someone creates it.
Great, looks like we're seemingly at consensus on Akuli's suggestion. I've implemented all of this locally for stub_uploader, will open a PR once https://github.com/typeshed-internal/stub_uploader/pull/59 is merged (which should be uncontroversial following my latest updates to it).
PR at https://github.com/typeshed-internal/stub_uploader/pull/61
Reopening this for now, as we'll still need to rework a few test scripts on the typeshed side for this to work (but hooray! 🥳)
I can have a stab at getting mypy_test.py
and regr_test.py
to work with non-types dependencies. I don't know where to start when it comes to getting pyright and pytype to work, though.
I don't know where to start when it comes to getting pyright and pytype to work, though.
Sorry if I misunderstand, but isn't it just a matter of installing non-tyshed-stub dependencies from METADATA.toml?
Couldn't a script be made for that, and re-used. Whether it's in a virtual environment for testing. Or run directly as a CI step before the type checker. (Since Pyright is ran from a github action, we can't tack this on an existing script).
Couldn't a script be made for that, and re-used.
Quite possibly. My point was just that I haven't really looked at pytype_test.py
, and don't really know how it works, whereas I know pretty well how mypy_test.py
and regr_test.py
work :)
@AlexWaygood I'd love to see this completed. Is there anything I could help with to free up some of your (and others') time?
I'd love to see it completed as well, but just haven't had the time to work on it lately, with all the other typeshed-related things going on :(
If you fancy taking a look at getting pytype_test.py
and/or pyright to work with non-types dependencies, that would be a huge help! I don't think all the scripts/workflows need to be updated in one PR -- we can make changes on the basis of "this would make this script compatible with non-types dependencies, if we had any" for now. PRs updating mypy_test.py
and/or regr_test.py
would also be very welcome -- I haven't had the time to seriously look at them yet :(