rules_python icon indicating copy to clipboard operation
rules_python copied to clipboard

Strict rules that check dependencies and type annotations

Open dayfine opened this issue 4 years ago • 17 comments

Will we have strict rules that check for deps and type annotations? i.e. py_strict_library, py_strict_binary, py_strict_test.

dayfine avatar Apr 26 '20 04:04 dayfine

check for deps

Can you elaborate? Not sure what is meant.

type annotations

Then py_strict* would be a macro also instantiating a type annotation rule?

abergmeier avatar Apr 27 '20 08:04 abergmeier

At google we have (actually) pytype_strict_{binary,library} to check for typings using https://github.com/google/pytype, and check for strict dependencies, i.e. depend on build targets for what you import.

As I realized this involves pytype, I suppose this could be a quite some work...

dayfine avatar Apr 27 '20 17:04 dayfine

People are using macros to do this sort of thing right now, and Bazel Aspects alternatively. In future I think there should be a 'paved road' for writing type-annotated Python in Bazel, though can't say yet whether there'd be rules like py_strict_binary in rules_python that execute pytype, mypy, or pyre.

thundergolfer avatar Apr 28 '20 01:04 thundergolfer

(I'm the author of the strict deps checking rule at Google)

"Strict deps" is a term we use to mean if you have "import foo" in your code, then some direct dependency provides "foo". It helps prevent your code from breaking when its dependencies's dependencies have changed.

The strict deps checking and pytype checking are separate implementations. We just bundle them into a single macro because usually people want both (both have their own separate stand alone macros, too).

The core design of how strict deps works would probably mostly work for Bazel built Python code. It has to make some assumptions with how things work, of course. We've also had to work around a variety of short comings in the Python providers that starlark currently has.

wrt naming: having macro names that reflect a particular aspect/validator being enforced also comes with a combinatorical drawback. You end up with py_strict_, pytype_, mypy_, pytype_strict_, mypy_strict_*, etc etc. While it works, and there's a certain appeal to having "what it does" right there in the rule name, it's also a headache for new comers ("Which of the 12 rules do I use?") and naming. I wonder if there's a better way (the particular idea in my head is: what if I want to make sure my code works with pytype and mypy and pyre and strict deps and etc mypy_pytype_pyre_strict_library?).

rickeylev avatar Apr 28 '20 03:04 rickeylev

Ok I see that there is open source consideration here that different type checkers may be used. So strict_deps and type checking are definitely two separate considerations.

I wonder if the type checkers can all be hooked into a single entry point (not sure if it makes sense to use more than one type checker at the same time but) so the work can be hidden behind py_library:

py_library(
   ...
   type_checkers = ['pytype']
   ...
)

dayfine avatar Apr 28 '20 04:04 dayfine

Maybe rather than having a string have a Provider like PyTypeCheckerInfo or so. This way you are able to pass custom checkers, etc.

abergmeier avatar Apr 28 '20 09:04 abergmeier

Passing a string probably isn't a good idea because it would require the callee to have a mapping of names to functions, e.g., py_library would have to have to depend on and load all possible implementations. Instead passing some sort of function/object would make more sense. But, the basic idea of having a type_checkers arg I like. Or maybe validators (to mirror the "_validation" name bazel gave to such actions), since strict deps (and various other static-analysis checks) aren't about type checking. In any case, I encourage thinking about static analysis type validators in general instead of type checkers specifically, e.g. checking that you don't have unused dependencies, basic syntax check, checking for SQL injections; checking incorrect format strings -- the sort of checks that other languages use compiler hooks for.

Anyways...

Here's a bit of a brain dump for some Bazel things that would make implementing strict deps much easier. These are various assumptions it makes in our implementation. Fundamentally, it's all about mapping a target to the module names, package names, and package attributes that the target provides.

  • rule-type specific providers. e.g. PyBinaryInfo, PyTestInfo, PyLibraryInfo. This would allow it to distinguish between rule types without having to hard code against the Target.kind string name. The distinction matters because of some subtle differences in how they get treated.
  • Information/mechanism to transform a file name to module name. This basically boils down to three things:
    1. Source and bytecode suffixes (.py, .pyc, etc), e.g. the importlib.machinary.*_SUFFIXES values
    2. Information about the "import roots" (PYTHONPATH); I think this corresponds to the workspace root?
  • A way to get the module names a rule directly provides, and if those are packages or modules. The package vs module distinction is important for strict deps. This also ends up being necessary for non-trivial rules; e.g. those that rely on runtime logic that affect the importable names like sys.modules trickery etc.
  • Info (probably a provider of some sort) of what's in the runtime's stdlib, specifically the module names, and packages and package attrs in the runtime's stdlib.
  • Info (probably in the same runtime provider as above) of what the runtime's Python version is; this is so that we know how to parse the file (e.g. a 2.7, 3.6, 3.8 etc parser).

In our implementation, most of those points above are hard coded in some way (e.g., we have a file with a bunch of info about the stdlib, we make various assumptions when transforming a file path to a module name, etc).

rickeylev avatar Apr 28 '20 17:04 rickeylev

+1 to py_strict_* functions and would love pytype support too. These two features by themselves make Bazel a really appealing option for Python code, plus the ability to easily generate par files (I suspect the Python community might otherwise ignore Bazel).

blais avatar May 17 '20 18:05 blais

Something like the above, but with labels so it supports custom checkers seems like a good middle ground.

py_strict_library(
    ...
    type_checkers = ["@rules_python//type_checkers:mypy"],
)

Then a couple macros could be provided:

load("@rules_python//type_checkers/defs.bzl", "py_mypy_strict_library")

py_mypy_strict_library(...)

That just add in a type_checkers list and pass everything along.

fahhem avatar Jun 03 '20 18:06 fahhem

@fahhem This looks very much like what Google does. I would much rather be able to provide a global setting for all Python libraries for the type checkers. If you have a large project, inserting the type_checkers in every single target will become quite redundant real fast. Ditto re. strict targets.

And what about linting? The problem with requiring lint to pass is that one needs to be able to disable that until ready to commit. That smells to me also like a global setting.

blais avatar Jun 04 '20 04:06 blais

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

github-actions[bot] avatar Apr 14 '21 22:04 github-actions[bot]

This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"

github-actions[bot] avatar May 14 '21 22:05 github-actions[bot]

Is this issue resolved? I don't seem to find an implementation in the thread.

cnsgsz avatar Jun 24 '23 11:06 cnsgsz

I will re-open. But since a different issue was filed in bazel/bazel we may not be able to implement here.

chrislovecnm avatar Jun 24 '23 14:06 chrislovecnm

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

github-actions[bot] avatar May 03 '24 22:05 github-actions[bot]

Bumping so it doesn't close this issue

matts1 avatar May 06 '24 00:05 matts1

Bump. Any plans to get this done anytime soon?

nachumg avatar Jun 04 '24 08:06 nachumg