typeshed icon indicating copy to clipboard operation
typeshed copied to clipboard

Allow non-types dependencies

Open srittau opened this issue 3 years ago • 24 comments

This was previously discussed in #5618 and came up in #5765. I think the consensus is that we should allow this. This issue is mainly to track this.

srittau avatar Jul 13 '21 10:07 srittau

Copying relevant discussion from #5618 for context:

srittau commented on Jun 18

Not speaking about stubs packages in general, but I think it's reasonable if typeshed packages depend on dependencies of the corresponding upstream package. I.e. if package foo depends on cryptography, it's fine that our foo-types depends on cryptography as well, since it doesn't introduce "new" dependencies. Apart from that we should vet dependencies carefully.

JukkaL commented on Jun 18

I agree with srittau.

I think that it would be sufficient to maintain an allowlist of vetted dependencies. I'd expect that the number of such dependencies will be fairly small, so it shouldn't be much effort to maintain.

intgr avatar Aug 05 '21 13:08 intgr

I'm running into this issue for tensorflow stubs. Most tensorflow type annotations depend on a couple basic numpy types. So at moment my options look like don't include numpy types and have most annotations be too strict or have Any almost everywhere.

What's needed to work on this task? I can see one issue is pyright check produces errors from dependency not being installed. Is the prerequisite being able to run checks for only changed stubs as mentioned here, https://github.com/python/typeshed/issues/5952

hmc-cs-mdrissi avatar Feb 20 '22 08:02 hmc-cs-mdrissi

If I understand correctly, the next steps are:

  • Add a check to tests/check_consistent.py that fails if stubs/foo depends on anything else than:
    • A types-bar package where bar is so that stubs/bar exists in typeshed
    • Or a dependency of foo on pypi that has py.typed
      • We can use pypi's API for this, although checking the presence of py.typed with just the API is tricky. Can we just assume that py.typed exists, or do we need to potentially download the sdist to check? How about packages that don't provide sdist?
  • Change one dependency of one stubs-package to use upstream types
  • Make a pull request
  • Fix pull request until CI is green
  • Before merging, run the third-party stubs check in daily.yml passes (this can be ran manually)

A potential problem is if packages foo and bar have dependencies that cannot be installed simultaneously. Then daily.yml will not work at all. I think we should fix this later, in a separate issue and a separate PR, because users are already complaining about not being able to use upstream type hints, and this issue won't come up until we have many non-stubs dependencies in typeshed.

If we have a package foo with dependency bar and stubs in stubs/foo, we probably want types-foo to depend on bar and not a specific version like bar==1.2.3, so that you can upgrade to bar==1.2.4 without breaking your pip install -r requirements.txt -r requirements-dev.txt (assuming bar==1.2.4 is in requirements.txt and types-foo==x.y.z is in requirements-dev.txt).

Is this all there's to it, or am I missing something? I haven't followed this discussion in detail before, because it seemed to be complicated and also spread up across many places.

Akuli avatar Sep 07 '22 21:09 Akuli

We need changes to mypy_test and the others, see e.g. https://github.com/python/typeshed/issues/5952#issuecomment-1046192164

We need some changes in stub_uploader as well, which has validation around this. And I think it's a good opportunity to do https://github.com/python/typeshed/issues/8312

I have the stub_uploader patch locally, but haven't PR-ed it because of some conflicts with https://github.com/typeshed-internal/stub_uploader/pull/57 which I was hoping would get merged

hauntsaninja avatar Sep 07 '22 22:09 hauntsaninja

I'm mostly confused about "changes to mypy_test and the others" part.

  • Why exactly is https://github.com/python/typeshed/issues/5952#issuecomment-1046192164 relevant now? It says "I personally wouldn't hate getting started on [this issue] with a global venv" which is what I'm suggesting.
  • Is all this included in my "Fix pull request until CI is green" part, or is there something subtle that our CI wouldn't catch if forgotten?

Akuli avatar Sep 07 '22 22:09 Akuli

Our CI wouldn't catch changes needed for stub_uploader. But yes, everything else is probably covered under "get CI green".

Here's code that checks if a package contains a py.typed: https://github.com/python/typeshed/blob/cc7d2567f2fbece72dfd42d233a5a1c48d56ea16/scripts/stubsabot.py#L141 But we probably don't need it; we should get errors in type checking tests if something doesn't have py.typed. People have previously suggested using an allowlist of packages (I believe for security reasons), which would also take care of that.

hauntsaninja avatar Sep 07 '22 22:09 hauntsaninja

I personally don't see how duplicating the dependencies into a separate allowlist in the same repository improves security.

Akuli avatar Sep 07 '22 22:09 Akuli

It'd be in stub_uploader. I don't have opinions on it, but mentioned it as a thing that has been previously discussed, since e.g. an allowlist would make validation like checking for py.typed not very important.

hauntsaninja avatar Sep 07 '22 22:09 hauntsaninja

As I mentioned on the PR https://github.com/typeshed-internal/stub_uploader/pull/59 this introduces a security hole. Having an allowlist in stub_uploader (to which intentionally only few maintainers have access) is a possible mitigation, but I want to first hear what other people (in particular @JukkaL and @JelleZijlstra) think about this.

ilevkivskyi avatar Sep 19 '22 09:09 ilevkivskyi

this introduces a security hole

Could you expand on what the security risk is, exactly? Like @Akuli, I'm not sure I've ever seen it fully explained (probably because I'm somewhat new as a maintainer :-)

I agree with the consensus on this issue that this is a feature we really need at this point. The situation with types-cryptography is becoming something of a farce. We've also got an increasing number of packages that would benefit from having numpy as a dependency.

AlexWaygood avatar Sep 19 '22 10:09 AlexWaygood

Arbitrary code can be executed during a Python package installation, so if someone adds a malicious package as a dependency, it can cause really big damage, as many of types-... packages have very high install rates (like 100K+ per day).

ilevkivskyi avatar Sep 19 '22 10:09 ilevkivskyi

Arbitrary code can be executed during a Python package installation

That's surely a security risk for any open-source package depending on another open-source package -- is the security hole here that users might not be expecting the installation of a stubs package to install, e.g., numpy or whatever as a side effect?

AlexWaygood avatar Sep 19 '22 10:09 AlexWaygood

The risk is that someone my add e.g. nmupy as a dependency (whether intentionally or not), so typosquatting, that can't usually harm many people, will now get magnified to some cosmic scales, as you just need one typo to break hundreds thousands users.

ilevkivskyi avatar Sep 19 '22 11:09 ilevkivskyi

And also yes, having this as a side effect of a stub package, will make it even harder to detect.

ilevkivskyi avatar Sep 19 '22 11:09 ilevkivskyi

Got it, thanks!

AlexWaygood avatar Sep 19 '22 11:09 AlexWaygood

This might be a silly suggestion, but couldn't those non-types dependencies somehow be specified as optional/extra dependencies that are installed in typeshed CI, but not by default for end-users of resulting stubs?

jakob-keller avatar Sep 19 '22 11:09 jakob-keller

This might be a silly suggestion, but couldn't those non-types dependencies somehow be specified as optional/extra dependencies that are installed in typeshed CI, but not by default for end-users of resulting stubs?

That's what I'd expect as well.

Avasam avatar Sep 19 '22 11:09 Avasam

This might be a silly suggestion, but couldn't those non-types dependencies somehow be specified as optional/extra dependencies that are installed in typeshed CI, but not by default for end-users of resulting stubs?

But those are usually dependencies of the stubs, i.e. the stubs won't work without the dependency. That's usually not a problem, since if package types-foo depends on packages bar, package foo will usually also depend on bar, but there are use cases where you want to use types-foo standalone.

srittau avatar Sep 19 '22 11:09 srittau

This might be a silly suggestion, but couldn't those non-types dependencies somehow be specified as optional/extra dependencies that are installed in typeshed CI, but not by default for end-users of resulting stubs?

But those are usually dependencies of the stubs, i.e. the stubs won't work without the dependency. That's usually not a problem, since if package types-foo depends on packages bar, package foo will usually also depend on bar, but there are use cases where you want to use types-foo standalone.

I see, but in case you want to use types-foo standalone, you could choose to specify it as types-foo[standalone] (or whatever).

jakob-keller avatar Sep 19 '22 11:09 jakob-keller

But those are usually dependencies of the stubs, i.e. the stubs won't work without the dependency. That's usually not a problem, since if package types-foo depends on packages bar, package foo will usually also depend on bar, but there are use cases where you want to use types-foo standalone.

I'm sorry, I might be missing something, but wouldn't that be fine anyway?
My understanding is that there's a security concern about someone adding a malicious dependency in a types-* package which would be automatically installed on thousands of users' machine. But if it's set up so that the dependency is installed on typeshed's CI when validating the types, but not as a dependency when uploaded to PyPI, wouldn't that mitigate that concern?

As for users potentially missing a dependency, as you mentioned,

if package types-foo depends on packages bar, package foo will usually also depend on bar

Even in those rare cases where foo does not also depend on bar (for example: optional dependencies), we wouldn't have had complete types anyway, and the user can just manually install the missing dependency. So we'd go from never having complete types for packages that need non-types-* dependencies, to very rarely having to manually install a missing dependency. Which I only see as a positive.

And finally, as for using types-foo standalone, I'm not entirely certain what you mean by "standalone" here. Like just installing types-foo without it also installing its type dependencies? Wouldn't this be the case with this suggestion?

Avasam avatar Sep 19 '22 17:09 Avasam

Users need to install the dependencies of stubs. This is not about typeshed CI.

If stub for great_numpy_tools looks like:

def make_my_array_good(x: np.ndarray) -> whatever: ...

then users of this stub will not get correct type checking unless numpy is also installed in the same environment.

The only way to guarantee numpy is installed in the same environment (aka to guarantee users of the stub get correct type checking) is to make it a dependency of types-great-numpy-tools

hauntsaninja avatar Sep 19 '22 18:09 hauntsaninja

add e.g. nmupy as a dependency (whether intentionally or not)

We could fail typeshed's CI and/or stub-uploader if types-foo depends on something other than the dependencies of foo on pypi. Then the only way to make types-foo depend on nmupy would be if foo depends on nmupy as well, and I generally expect types-foo to have less downloads on pypi than foo so at that point the damage has been already done.

There's also one special case: types-gdb doesn't have a corresponding package on pypi, and we shouldn't implicitly trust it if someone creates it.

Akuli avatar Sep 19 '22 18:09 Akuli

Great, looks like we're seemingly at consensus on Akuli's suggestion. I've implemented all of this locally for stub_uploader, will open a PR once https://github.com/typeshed-internal/stub_uploader/pull/59 is merged (which should be uncontroversial following my latest updates to it).

hauntsaninja avatar Sep 21 '22 08:09 hauntsaninja

PR at https://github.com/typeshed-internal/stub_uploader/pull/61

hauntsaninja avatar Sep 22 '22 07:09 hauntsaninja

Reopening this for now, as we'll still need to rework a few test scripts on the typeshed side for this to work (but hooray! 🥳)

AlexWaygood avatar Oct 11 '22 11:10 AlexWaygood

I can have a stab at getting mypy_test.py and regr_test.py to work with non-types dependencies. I don't know where to start when it comes to getting pyright and pytype to work, though.

AlexWaygood avatar Oct 21 '22 18:10 AlexWaygood

I don't know where to start when it comes to getting pyright and pytype to work, though.

Sorry if I misunderstand, but isn't it just a matter of installing non-tyshed-stub dependencies from METADATA.toml?

Couldn't a script be made for that, and re-used. Whether it's in a virtual environment for testing. Or run directly as a CI step before the type checker. (Since Pyright is ran from a github action, we can't tack this on an existing script).

Avasam avatar Oct 22 '22 02:10 Avasam

Couldn't a script be made for that, and re-used.

Quite possibly. My point was just that I haven't really looked at pytype_test.py, and don't really know how it works, whereas I know pretty well how mypy_test.py and regr_test.py work :)

AlexWaygood avatar Oct 22 '22 11:10 AlexWaygood

@AlexWaygood I'd love to see this completed. Is there anything I could help with to free up some of your (and others') time?

Avasam avatar Dec 16 '22 03:12 Avasam

I'd love to see it completed as well, but just haven't had the time to work on it lately, with all the other typeshed-related things going on :(

If you fancy taking a look at getting pytype_test.py and/or pyright to work with non-types dependencies, that would be a huge help! I don't think all the scripts/workflows need to be updated in one PR -- we can make changes on the basis of "this would make this script compatible with non-types dependencies, if we had any" for now. PRs updating mypy_test.py and/or regr_test.py would also be very welcome -- I haven't had the time to seriously look at them yet :(

AlexWaygood avatar Dec 16 '22 10:12 AlexWaygood