rules_python
Strict rules that check dependencies and type annotations
Will we have strict rules that check for deps and type annotations? I.e. py_strict_library, py_strict_binary, py_strict_test.
"check for deps" -- Can you elaborate? Not sure what is meant.

"type annotations" -- Then py_strict* would be a macro also instantiating a type annotation rule?
At Google we have (actually) pytype_strict_{binary,library} to check types using https://github.com/google/pytype, and to check for strict dependencies, i.e. that you depend on build targets for what you import.
As I realized this involves pytype, I suppose this could be quite some work...
People are using macros to do this sort of thing right now, or, alternatively, Bazel aspects. In the future I think there should be a 'paved road' for writing type-annotated Python in Bazel, though I can't say yet whether there'd be rules like py_strict_binary in rules_python that execute pytype, mypy, or pyre.
(I'm the author of the strict deps checking rule at Google)
"Strict deps" is a term we use to mean if you have "import foo" in your code, then some direct dependency provides "foo". It helps prevent your code from breaking when its dependencies's dependencies have changed.
The strict deps checking and pytype checking are separate implementations. We just bundle them into a single macro because usually people want both (both have their own standalone macros, too).
The core design of how strict deps works would probably mostly work for Bazel-built Python code. It has to make some assumptions about how things work, of course. We've also had to work around a variety of shortcomings in the Python providers that Starlark currently has.
wrt naming: having macro names that reflect a particular aspect/validator being enforced also comes with a combinatorial drawback. You end up with py_strict_*, pytype_*, mypy_*, pytype_strict_*, mypy_strict_*, etc. While it works, and there's a certain appeal to having "what it does" right there in the rule name, it's also a headache for newcomers ("Which of the 12 rules do I use?") and for naming. I wonder if there's a better way (the particular idea in my head is: what if I want to make sure my code works with pytype and mypy and pyre and strict deps and so on: mypy_pytype_pyre_strict_library?).
OK, I see that there is an open-source consideration here: different type checkers may be used. So strict deps and type checking are definitely two separate considerations.
I wonder if the type checkers can all be hooked into a single entry point (not sure if it makes sense to use more than one type checker at the same time, but) so the work can be hidden behind py_library:
py_library(
    ...
    type_checkers = ["pytype"],
    ...
)
Maybe rather than having a string, have a provider like PyTypeCheckerInfo or so. This way you are able to pass custom checkers, etc.
Passing a string probably isn't a good idea because it would require the callee to have a mapping of names to functions, e.g., py_library would have to depend on and load all possible implementations. Instead, passing some sort of function/object would make more sense. But I like the basic idea of having a type_checkers arg. Or maybe validators (to mirror the "_validation" name Bazel gave to such actions), since strict deps (and various other static-analysis checks) aren't about type checking. In any case, I encourage thinking about static-analysis validators in general instead of type checkers specifically, e.g. checking that you don't have unused dependencies, basic syntax checks, checking for SQL injections, checking for incorrect format strings -- the sort of checks that other languages use compiler hooks for.
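To make the provider idea concrete, here is a rough Starlark sketch. PyTypeCheckerInfo, the type_checker rule, and the validators attribute are all hypothetical names, not an existing rules_python API:

PyTypeCheckerInfo = provider(
    doc = "Describes a tool that runs a static-analysis check over Python sources.",
    fields = {"tool": "FilesToRunProvider for the checker binary"},
)

def _type_checker_impl(ctx):
    return [PyTypeCheckerInfo(tool = ctx.attr.tool[DefaultInfo].files_to_run)]

type_checker = rule(
    implementation = _type_checker_impl,
    attrs = {"tool": attr.label(executable = True, cfg = "exec")},
)

# A library rule or aspect could then declare
#     validators = attr.label_list(providers = [PyTypeCheckerInfo])
# and register one validation action per listed checker over the target's sources.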
Anyways...
Here's a bit of a brain dump for some Bazel things that would make implementing strict deps much easier. These are various assumptions it makes in our implementation. Fundamentally, it's all about mapping a target to the module names, package names, and package attributes that the target provides.
- Rule-type specific providers, e.g. PyBinaryInfo, PyTestInfo, PyLibraryInfo. This would allow it to distinguish between rule types without having to hard-code against the Target.kind string name. The distinction matters because of some subtle differences in how they get treated.
- Information/mechanism to transform a file name to a module name (a rough sketch follows below). This basically boils down to three things:
  - Source and bytecode suffixes (.py, .pyc, etc.), e.g. the importlib.machinery.*_SUFFIXES values
  - Information about the "import roots" (PYTHONPATH); I think this corresponds to the workspace root?
  - A way to get the module names a rule directly provides, and whether those are packages or modules. The package vs. module distinction is important for strict deps. This also ends up being necessary for non-trivial rules, e.g. those that rely on runtime logic that affects the importable names (sys.modules trickery etc.).
- Info (probably a provider of some sort) about what's in the runtime's stdlib, specifically the module names, packages, and package attrs.
- Info (probably in the same runtime provider as above) about what the runtime's Python version is; this is so we know how to parse the file (e.g. with a 2.7, 3.6, or 3.8 parser).
In our implementation, most of those points above are hard coded in some way (e.g., we have a file with a bunch of info about the stdlib, we make various assumptions when transforming a file path to a module name, etc).
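As a rough sketch of the file-name-to-module-name mapping described in the list above, assuming the import roots are known and ignoring stdlib lookup and version-specific parsing:

import importlib.machinery
import os

_SUFFIXES = (
    importlib.machinery.EXTENSION_SUFFIXES
    + importlib.machinery.SOURCE_SUFFIXES
    + importlib.machinery.BYTECODE_SUFFIXES
)

def module_name(path, import_roots):
    """Map a source file under one of the import roots to a dotted module name."""
    for root in import_roots:
        rel = os.path.relpath(path, root)
        if rel.startswith(".."):
            continue  # not under this import root
        for suffix in _SUFFIXES:
            if rel.endswith(suffix):
                rel = rel[: -len(suffix)]
                break
        parts = rel.split(os.sep)
        if parts[-1] == "__init__":
            parts = parts[:-1]  # foo/__init__.py names the package "foo"
        return ".".join(parts)
    return None

# module_name("src/pkg/mod.py", ["src"]) == "pkg.mod"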
+1 to py_strict_* functions and would love pytype support too. These two features by themselves make Bazel a really appealing option for Python code, plus the ability to easily generate par files (I suspect the Python community might otherwise ignore Bazel).
Something like the above, but with labels so it supports custom checkers, seems like a good middle ground.
py_strict_library(
    ...
    type_checkers = ["@rules_python//type_checkers:mypy"],
)
Then a couple of macros could be provided:

load("@rules_python//type_checkers:defs.bzl", "py_mypy_strict_library")

py_mypy_strict_library(...)
These would just add in a type_checkers list and pass everything along.
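A hedged sketch of what such a wrapper macro might look like (py_strict_library and the type_checkers attribute are still hypothetical here):

def py_mypy_strict_library(name, type_checkers = [], **kwargs):
    py_strict_library(
        name = name,
        type_checkers = type_checkers + ["@rules_python//type_checkers:mypy"],
        **kwargs
    )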
@fahhem This looks very much like what Google does. I would much rather be able to provide a global setting for the type checkers for all Python libraries. If you have a large project, inserting type_checkers into every single target becomes redundant real fast. Ditto re. strict targets.
And what about linting? The problem with requiring lint to pass is that one needs to be able to disable it until ready to commit. That also smells to me like a global setting.
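One possible shape for such a global setting, as a sketch only: it assumes some wrapper macro or validation aspect would read the flag, and the flag name is purely illustrative. It uses bazel-skylib's string_flag; the same mechanism could gate lint so it stays off until you're ready to commit:

load("@bazel_skylib//rules:common_settings.bzl", "string_flag")

string_flag(
    name = "python_type_checker",
    build_setting_default = "none",
    values = ["none", "pytype", "mypy"],
)

# e.g.: bazel test --//:python_type_checker=mypy //...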
This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!
This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"
Is this issue resolved? I don't seem to find an implementation in the thread.
I will re-open. But since a different issue was filed in bazel/bazel, we may not be able to implement this here.
This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!
Bumping so it doesn't close this issue
Bump. Any plans to get this done anytime soon?