ruff icon indicating copy to clipboard operation
ruff copied to clipboard

Meta issue: plugin system

Open sobolevn opened this issue 3 years ago • 59 comments

I think that the main thing why Flake8 is so popular is its plugin system. You can find plugins for every possible type of problems and tools.

Right now docs state:

Beyond rule-set parity, ruff suffers from the following limitations vis-à-vis Flake8:

1. Flake8 has a plugin architecture and supports writing custom lint rules.

I propose designing and implementing plugin API. This way ruff can compete with flake8 in terms of adoption and usability.

Plugin API

I think that there are some flake8 problems that should be fixed and also there are some unique chalenges that should be addressed.

  1. Explicit "opt-in" for plugins. Right now flake8 suffers from a problem when you install some tool and it has a flake8 plugin definition. This plugin is automatically enabled due to how setuptools hooks work. I think that all rules must be explicit. So, eslint's explicit plugins: looks like a better way.
  2. Special "fix" API and tooling: so many typical problems can be solved easily by plugin authors
  3. Plugin order. Since plugins will change the source code, we must be very strict about what order they run in. Do they run in parallel while checking files?
  4. Plugin configuration: current way of configuring everything in [flake8] section can cause conflicts between plugins. Probably, the way eslint does that is better
  5. Packaging: how to buld and package rust extensions? How to build wheels?

Please, share your ideas and concerns.

sobolevn avatar Sep 29 '22 09:09 sobolevn

Strongly agree. Will write up some thoughts on this later!

charliermarsh avatar Sep 29 '22 12:09 charliermarsh

The first big decision is: should plugins be written in Rust? Or in Python? I believe that either could be possible (though I haven't scoped out the work at all), e.g., using pyo3. It may even be easier to support plugins in Python given that loading code dynamically is much easier in a scripted language...

However, I'm partial to requiring plugins be written in Rust. It will lead to a more cohesive codebase, allow us to maintain a focus on performance, and avoid requiring extensive cross-language FFI. I'm open to being convinced otherwise here though.

Here are a few relevant resources on implementing a plugin system for Rust:

  • https://nullderef.com/blog/plugin-tech/
  • https://github.com/rodrimati1992/abi_stable_crates
  • https://zicklag.github.io/rust-tutorials/rust-plugins.html

One of the main challenges seem to be around the lack of ABI stability in Rust. In many of the above write-ups, they discuss how both the plugins and calling library need to use the same versions of Rust in order to be compatible, which feels like a tall order. (From that perspective, one thing that's interesting to me is: could we compile plugins to WASM?)

charliermarsh avatar Sep 29 '22 20:09 charliermarsh

I think that instead of Python based plugins, it is better to provide some kind of query language to make easy plugins very easy. Like in https://github.com/hchasestevens/astpath

In my opinion, any complex stuff should be in Rust. This way it can reuse existing APIs and be fast. But, I don't know how many Python developers actually know Rust 🤔

I think another way of dealing with it is to ask exisiting flake8 plugin authors about their prefered way of writting it. Their feedback would be very valuable!

sobolevn avatar Sep 29 '22 20:09 sobolevn

Interesting, Fixit / LibCST has something kind of like that too. It's not quite a distinct query language, but it's effectively a DSL (in Python) to pattern-match against AST patterns.

charliermarsh avatar Sep 29 '22 23:09 charliermarsh

My 2c. flake8's plugin support is pretty rudimentary (operates on tokens/lines plus a handfull of metadata). Therefore if you supported Python plugins, you likely could craft it in a way that supporting flake8 plugins out-of-the-box(-ish) would be feasible.

Then you could have most flake8 plugins available through ruff, and wouldn't need to support new plugins by porting the code to Rust (only if you want to because yummy yummy perf).

(lightly related to https://github.com/charliermarsh/ruff/issues/414)

thejcannon avatar Oct 25 '22 15:10 thejcannon

Another idea: we could build a plug-in system atop https://github.com/ast-grep/ast-grep. This would allow users to express lint rules in YAML or via a simple DSL.

charliermarsh avatar Oct 27 '22 12:10 charliermarsh

(That tool is itself built atop tree-sitter.)

charliermarsh avatar Oct 27 '22 12:10 charliermarsh

I’m coming to this from a position of having written a flake8 plugin for a very specific need at work, and as part of a larger project. This is not something generic, so it’d never make sense as a built-in feature in ruff. I’d love a way to write plugins to ruff in Python, mostly because it’s convenient as someone familiar with Python and not so much rust, but also because it would be nice to keep a project in pure Python even while interacting with rust.

The specific plugin in my case, oida, was first written as a standalone thing before I discovered how easy it was to add it as a plugin to flake8. It also uses LibCST, for its ability to round trip code, where we do codemodding for its. If it would be possible to expose a similar ast based Python-interface for plugins that would be awesome.

I also have use cases where I’d like to do auto fixing, which it would also be nice to support. The first thing I’d like to do is normalize import statements (relative vs absolute). In order to do that I’d need an interface where I get import statements or ast nodes and the path to the file so I can locate it in relation to other files on the system.

I understand that writing plugin in Python would be a slowdown compared to writing them in rust, but I think that tradeoff would be very much acceptable in many cases.

ljodal avatar Nov 08 '22 21:11 ljodal

Very helpful and all makes sense.

Maybe just as another data point for the thread: when I was at Spring Discovery, we wrote a few Flake8 plugins to enforce highly codebase-specific rules.

For example:

  • "Always late-import TensorFlow" (i.e., import it within a function that depends on it, rather than at module top-level)
  • "If you ever import module X, make sure the file also imported module Y"
  • "Imports to module Z should always use import from structure"

charliermarsh avatar Nov 08 '22 21:11 charliermarsh

So in that light, I think there are different categories of plugins:

  • Plugins that are custom to a codebase
  • Plugins that may apply to many codebases, but don't make sense to include directly in Ruff (e.g., Django-specific stuff could qualify here)

charliermarsh avatar Nov 08 '22 21:11 charliermarsh

I think most of those "custom" plugins / checks could be built atop something like ast-grep, but more complex checks (like rewriting absolute and relative imports) would be limited by that approach.

charliermarsh avatar Nov 08 '22 21:11 charliermarsh

The first thing I’d like to do is normalize import statements (relative vs absolute). In order to do that I’d need an interface where I get import statements or ast nodes and the path to the file so I can locate it in relation to other files on the system.

(Separately: this could arguably make sense to include in Ruff directly.)

charliermarsh avatar Nov 08 '22 21:11 charliermarsh

I think most of those "custom" plugins / checks could be built atop something like ast-grep, but more complex checks (like rewriting absolute and relative imports) would be limited by that approach.

Yeah, I wouldn’t really be able to implement any of Oida using ast-grep, as all the rules depend on the context of the project. I use in-process caching to keep that state ready between files in the current flake8 plugin btw, forgot to mention that above, so the flake8 interface isn’t ideal for that kind of plugin.

The first thing I’d like to do is normalize import statements (relative vs absolute). In order to do that I’d need an interface where I get import statements or ast nodes and the path to the file so I can locate it in relation to other files on the system.

(Separately: this could arguably make sense to include in Ruff directly.)

I guess some rules could be, again not for my specific case. What we’re considering at work is to enforce relative imports within a Django app and use absolute imports for everything else. Our structure will be project.component.app or project.app, so it would be very specific to our use case how that rule should be applied. I’ve already played with implementing it in isort, but I found that code base hard to navigate and would love a clean ast/cst based plugin interface where I could add this logic :)

ljodal avatar Nov 08 '22 22:11 ljodal

You asked for feedback from other flake8 plugin authors, so:

  • https://github.com/peterjc/flake8-black (620k downloads/month on PyPI), not needed if you can run black directly as well as running flake8, for example via the tool pre-commit or otherwise. Currently this reloads each Python file from disk (scope here to refactor to let black use its cache), it would not be possible to use the AST from flake8 directly. Does not make sense to plug into ruff.

  • https://github.com/peterjc/flake8-rst-docstrings/ (238k downloads/month on PyPI), uses the AST to extract docstrings, which are passed as strings to the Python library docutils to be validated as RST. My code is essentially a wrapper, and since docutils is written in Python that would have to be used internally if this plugin were to be ported to ruff.

  • https://github.com/peterjc/flake8-sfs (15k downloads/month on PyPI), uses the AST directly looking for particular kinds of node. Probably could be done in either Python or Rust, although unlikely to be popular enough to deserve including in ruff itself.

peterjc avatar Nov 14 '22 20:11 peterjc

Thank you @peterjc! Really appreciate your engagement here as a plugin author!

(Regarding RST: it looks like there's at least one Rust crate for parsing RST, though it doesn't look super popular.)

charliermarsh avatar Nov 15 '22 02:11 charliermarsh

Is this possible, or supported currently? https://github.com/adamchainz/flake8-tidy-imports

ofek avatar Nov 15 '22 03:11 ofek

@ofek - Not currently supported but it’s a pretty small surface area so should be easy to add some time in the next few days.

charliermarsh avatar Nov 17 '22 15:11 charliermarsh

Thanks! I've been enforcing absolute imports recently (except in tests) https://github.com/pypa/hatch/blob/b0911bb0eaa8d331c24eda940b97bf244ecd5ac3/.flake8#L8-L11

After that I'll switch over, and make new projects generated by Hatch use this.

ofek avatar Nov 17 '22 15:11 ofek

Sweet! The banned relative import rule I can definitely do today.

charliermarsh avatar Nov 17 '22 15:11 charliermarsh

@ofek -- I252 (banned relative imports) just went out in v0.0.125.

You can use it in Hatch by adding this to your pyproject.toml:

[tool.ruff]
select = [
  "B",
  "C",
  "E",
  "F",
  "W",
  # Ruff doesn't have this, but it does have E722.
  # "B001",
  "B003",
  "B006",
  "B007",
  # These don't exist in newer flake8-bugbear versions IIUC.
  # "B301",
  # "B305",
  # "B306",
  # "B902",
  "Q000",
  "Q001",
  "Q002",
  "Q003",
  "I252",
]
ignore = [
  "B027",
  # "E203",
  # "E722",
  # "W503",
]
line-length = 120
# tests can use relative imports
per-file-ignores = {"tests/*" = ["I252"], "tests/**/*" = ["I252"]}

[tool.ruff.flake8-tidy-imports]
ban-relative-imports = "all"

Let me know if it works, or doesn't! :)

charliermarsh avatar Nov 17 '22 17:11 charliermarsh

Thank you!!! https://github.com/pypa/hatch/pull/607

ofek avatar Nov 20 '22 17:11 ofek

@charliermarsh You wrote somewhere that libcst is significantly slower than the current ast implementation in ruff (can't find it right now). Do you know why? Is it because it's a cst or is it because the classes it exposes are Python "compatible"?

I'm asking because I've started looking into pyo3 and from what I see the only way to expose an ast to a Python plugin would be to make the ast classes Python classes in pyo3. If that's what's slow with libcst I guess there's not really any point in investigating that route too much, but if we could make that fast enough I guess it could be one way to make plugins work.

That doesn't resolve auto-fixing, but as I suggested in another thread I think maybe doing auto-fixing on the token level could be made to work. Maybe an interface like this:

def visit_Import(node: ast.Import, tokens: list[str]) -> list[str]:
    # Check ast (or tokens) for violations and return updated token
    return ["import", " ", "foo"]

Or maybe have tokens as an attribute on the ast nodes 🤔

ljodal avatar Nov 28 '22 17:11 ljodal

@ljodal - This was all based on LibCST as a Rust crate, with no Python FFI -- so I think it's just the CST and parser, and not anything to do with the the serialization. (I also hacked in some RustPython vs. LibCST benchmarks into the existing LibCST benches and got similar results. As with all benchmarking, though, I could definitely be doing something wrong!)

charliermarsh avatar Nov 28 '22 18:11 charliermarsh

@ljodal - I don't have great intuition for whether the PyO3 FFI would add much overhead and what the performance impact would be. I think it's worth exploring!

charliermarsh avatar Nov 28 '22 18:11 charliermarsh

Aight, then I'll continue investigating :)

I haven't written any rust before, so it's slow going (thinking of doing advent of code in rust to get a kickstart). My plan was to use the Python ASDL definitions to generate AST classes, but it's been years since last I touched compilers so I'll have to see how I go about the tokenization and conversion to ast

ljodal avatar Nov 29 '22 08:11 ljodal

You might be interested in some of the stuff in RustPython -- they generate the Rust AST definitions from an ASDL file here and here.

charliermarsh avatar Nov 29 '22 21:11 charliermarsh

Nice, thanks for the links, I'll take a look :) I guess if I base it on that and just tweak it to be Python compatible it should be fairly easy to see the overhead as well

ljodal avatar Nov 29 '22 22:11 ljodal

Another idea: we could build a plug-in system atop https://github.com/ast-grep/ast-grep. This would allow users to express lint rules in YAML or via a simple DSL.

something like this would be useful to replicate eslint's https://eslint.org/docs/latest/rules/no-restricted-syntax which is handy for little one off things

sbdchd avatar Jan 04 '23 02:01 sbdchd

As a flake8 plugin author myself who is not presently acquainted with Rust, I think it's important that Ruff's plugin system support plugins implemented in Python. While I understand the benefits of simplicity and performance that would come with requiring plugins to be implemented in Rust, I suspect that the eventual ecosystem of Ruff plugins would ultimately suffer for it. I think this because Ruff is a linter for Python and as such, many, and probably a significant majority of its users, will come to Ruff not knowing Rust. As a consequence, if only Rust plugins are supported, when Ruff users who are not familiar with Rust find a need or idea for a new plugin, in the case of the former they may switch to Flake8, and in the case of the latter they may do the same or just not develop a plugin at all. Chances are, prospective plugin authors care first about having something that works and only second about whether or not it is performant. Thus, it seems prudent to support the tools that Ruff's future plugin authors likely already know. That being Python.

My own personal anecdote is that I'm the author of a Flake8 plugin for another one of my projects. Both these projects are small in scale at the moment, so take this with a grain of salt, but I think it's unlikely that I would author a plugin for Ruff if I had to do it in Rust unless and until Ruff became more popular than Flake8. While I like the idea of learning Rust in order to port my plugin, I just don't think I could justify taking the time to do so. Rather, I would prefer to maintain a generic set of linting heuristics that I could use in both my Flake8 plugin and a potential Ruff plugin. Doing so would both save on maintenance effort and allow users to choose the linting tool of their choice. While my plugin and project don't have many users, I expect that my rational would probably be even more applicable to large projects with maintainers who have even less time to spend supporting what is at present, a fairly niche linting tool like Ruff.

With all this said, if Ruff supported plugins in Python and Rust, I think Ruff plugins could be a gateway to learn Rust (if an unlikely one) since authors of Python plugins like myself may find a need or demand for performance in the future.

rmorshea avatar Jan 11 '23 01:01 rmorshea

I maintain a plugin that I would like to rewrite in rust, to be able to run it with ruff.

I just wanted to ask, what happens to flake8-plugins ported to ruff that need significant upgrading/maintenance for, e.g., new python versions; who has the maintenance burden once it has been ported? :slightly_smiling_face:

Would it be possible to port it to ruff (if deemed generally useful), then split out as a plugin if/once the architecture for that is in place?

sondrelg avatar Jan 19 '23 16:01 sondrelg