
Feat: Support detecting dependencies of projects being scanned

Open kkirsche opened this issue 1 year ago • 3 comments

Good morning,

This issue requests support for detecting the dependencies of the project(s) that mypy_primer scans with mypy.

Use Case

The use case is to better understand the impact of a change when evaluating mypy_primer results in typeshed pull requests.

Behavior

The proposed behavior is for mypy_primer to accept an optional argument, either positional or flag-based, that takes one or more package names. Each name identifies a package being evaluated, such as types-requests. Because typeshed packages are published under the pattern types-{package}, the name can be used to determine which package was modified by the change.

With this change implemented and a package provided, mypy_primer would check, for each project it scans, whether that project uses the dependency, and would report to the end user the percentage of scanned projects that use it. If mypy_primer supports a verbose run mode, it would instead list every scanned project along with its individual status.
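The reporting step described above could be sketched roughly as follows. This is a minimal illustration and not mypy_primer code: the function names, the shape of the `projects` mapping, and the rule of stripping the `types-` prefix to get the runtime package are all assumptions made for the example. Requirement strings are reduced to a normalized project name (PEP 503 style) before comparison.

```python
from __future__ import annotations

import re

# Leading project name of a requirement string such as "requests[socks]>=2.28".
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement: str) -> str:
    """Extract the normalized project name from a requirement string."""
    match = _NAME_RE.match(requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    return normalize(match.group(1))


def runtime_package(stub_package: str) -> str:
    """Assumption: 'types-foo' provides stubs for the runtime package 'foo'."""
    name = normalize(stub_package)
    return name[len("types-"):] if name.startswith("types-") else name


def dependency_usage(
    projects: dict[str, list[str]], stub_package: str
) -> tuple[float, dict[str, bool]]:
    """Return (percentage of projects using the package, per-project status)."""
    target = runtime_package(stub_package)
    status = {
        project: target in {requirement_name(req) for req in requirements}
        for project, requirements in projects.items()
    }
    percentage = 100.0 * sum(status.values()) / len(status) if status else 0.0
    return percentage, status
```

The percentage would serve the default output, while the per-project dictionary corresponds to the verbose listing.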

Enhancements

This behavior can be enhanced, at the cost of additional complexity, by evaluating the package using a coverage-focused approach, determining if the changed APIs in a pull request are used within the package rather than simply looking for the dependencies.
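As a rough illustration of the coverage-focused idea, the sketch below uses the standard library `ast` module to collect which attributes of a given package a source file references. The function name `used_names` is hypothetical, and the approach is deliberately simplistic: it only sees `from package import name` and `alias.name` accesses on a directly imported package, so submodule imports, re-exports, and dynamic access are missed.

```python
from __future__ import annotations

import ast


def used_names(source: str, package: str) -> set[str]:
    """Return the names from `package` that `source` imports or accesses."""
    tree = ast.parse(source)
    aliases: set[str] = set()  # local names bound to the package itself
    used: set[str] = set()

    # First pass: record imports of the package and names imported from it.
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == package:
                    aliases.add(alias.asname or alias.name)
        elif isinstance(node, ast.ImportFrom):
            if node.level == 0 and node.module == package:
                used.update(a.name for a in node.names if a.name != "*")

    # Second pass: record attribute accesses such as `requests.get`.
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in aliases
        ):
            used.add(node.attr)
    return used
```

Intersecting the result with the set of names changed in a pull request would give a first approximation of whether a project exercises the modified API.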

Approaches

There seem to be a few different approaches we could take, depending on the longer-term intent of a feature like this. I've listed the four that immediately come to mind.

  1. modulefinder (not recommended)
    • https://docs.python.org/3/library/modulefinder.html
    • modulefinder can analyze individual scripts to locate the modules they import; it inspects the compiled code rather than running the script. This can be used to scan a package's files and evaluate which dependencies each one uses. modulefinder achieves this through its import_hook method.
  2. metadata via pip
    • Re-implement a minimal version of pip's search_packages_info to retrieve the requires field of the project's metadata.
  3. metadata via filesystem
    • Depending on how mypy_primer works with the projects, it may instead make sense to read metadata from the project's configuration files (such as pyproject.toml, setup.py, setup.cfg, etc.).
  4. AST Evaluation
    • If a coverage-based solution is desired, the approach becomes more complicated and may be better suited to a separate tool, or to a plugin for an existing tool such as flake8. A low-level approach would be to scan the source code of a project and evaluate its AST to determine which packages are imported and how they are used.
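For the metadata-based options, one possible shortcut is the standard library's `importlib.metadata`, which exposes the same "requires" information that pip reads. The sketch below is an assumption about how this could look, not a description of how mypy_primer or pip is implemented; the function name `declared_requirements` is hypothetical, and it only works for distributions installed in the environment running the code.

```python
from __future__ import annotations

from importlib import metadata


def declared_requirements(distribution: str) -> list[str]:
    """Return the requirement strings a distribution declares.

    An empty list is returned when the distribution declares no
    requirements or is not installed in the current environment.
    """
    try:
        # metadata.requires() returns None when nothing is declared.
        return list(metadata.requires(distribution) or [])
    except metadata.PackageNotFoundError:
        return []
```

Each returned entry is a requirement string (for example with version specifiers or environment markers), so the project name still has to be extracted before comparing it against the package of interest.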

There may well be other approaches. I'd be interested in any feedback on which approach you feel makes the most sense.

Who Will Do This?

I'm happy to attempt to provide this, though there will be some delays as I am currently assisting my family with something offline. This is why I haven't been able to be as involved in typeshed as I would like following my discussion with @AlexWaygood.

Thank you for your time.

kkirsche, Oct 12 '22 11:10