⭐ Suggestion

It would be super helpful to have some sort of plugin system in ast-grep. There are many different ways to think about this, but it could look like:

- ast-grep defines a trait that plugins must implement (maybe it takes in a serialized value and produces a match result, a list of ranges, etc.)
- plugins build their own object that implements that trait and compile it to a shared object
- ast-grep users register the plugin, and ast-grep then loads the shared object and uses it for custom rules/matching
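
To make the shape of that idea concrete, here is a minimal Rust sketch of a plugin trait and a host-side registry. Everything in it (`MatcherPlugin`, `Candidate`, `ByteRange`, `Registry`, the toy `PrefixPlugin`) is a hypothetical name for illustration, not part of ast-grep's actual API, and it skips the shared-object loading step entirely.

```rust
use std::collections::HashMap;

/// Byte range of a node in the source text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ByteRange {
    pub start: usize,
    pub end: usize,
}

/// What the host hands to a plugin for one candidate node (assumed shape).
pub struct Candidate<'a> {
    pub source: &'a str,
    pub range: ByteRange,
}

/// Hypothetical plugin trait; ast-grep does not define this today.
pub trait MatcherPlugin {
    /// Name used to refer to the plugin from a rule file.
    fn name(&self) -> &str;
    /// `config` is the plugin's serialized rule value.
    /// Returns the matched ranges; an empty list means "no match".
    fn matches(&self, config: &str, candidate: &Candidate<'_>) -> Vec<ByteRange>;
}

/// Toy plugin: matches when the node text starts with the configured prefix.
pub struct PrefixPlugin;

impl MatcherPlugin for PrefixPlugin {
    fn name(&self) -> &str {
        "startsWith"
    }

    fn matches(&self, config: &str, candidate: &Candidate<'_>) -> Vec<ByteRange> {
        let range = candidate.range;
        // `get` avoids a panic if the range is out of bounds.
        match candidate.source.get(range.start..range.end) {
            Some(text) if text.starts_with(config) => vec![range],
            _ => Vec::new(),
        }
    }
}

/// Host-side table of registered plugins, keyed by name.
#[derive(Default)]
pub struct Registry {
    plugins: HashMap<String, Box<dyn MatcherPlugin>>,
}

impl Registry {
    pub fn register(&mut self, plugin: Box<dyn MatcherPlugin>) {
        self.plugins.insert(plugin.name().to_string(), plugin);
    }

    /// Returns `None` when no plugin with that name is registered.
    pub fn run(&self, name: &str, config: &str, candidate: &Candidate<'_>) -> Option<Vec<ByteRange>> {
        self.plugins.get(name).map(|p| p.matches(config, candidate))
    }
}
```

A host would call `registry.register(Box::new(PrefixPlugin))` once and then `registry.run("startsWith", "use", &candidate)` per candidate node. A trait object like this only works when the plugin is compiled into the same binary; loading from a shared object would need a C-compatible boundary instead.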

💻 Use Cases

For example, we could hook in an LSP symbols provider to get something like the following:

id: find-calls-to-react-components
language: tsx
rule:
  pattern: $COMPONENT($ARGS)
constraints:
  COMPONENT:
    definitionMatches:  # match against a symbol's *definition*
      pattern: function $NAME() { $$$ }
      has:
        # we assume a function that returns JSX is a component
        pattern: return <$X $$$ />
fix: <$COMPONENT ...$ARGS />

This could allow us to get semantic information about a symbol, rather than just syntactic information.

Apr 29 '25 19:04 ribru17

Related to

https://github.com/ast-grep/ast-grep/issues/334

https://github.com/ast-grep/ast-grep/issues/433

May 02 '25 18:05 HerringtonDarkholme

(Just a note: these are similar requests, but still different from this one, which would allow for arbitrary new features depending on the plugin)

May 06 '25 16:05 ribru17

Several implementation questions:

1. How to parse unknown fields? https://github.com/de-vri-es/serde-ignored-fields/
2. How to register a custom field schema?
3. How to load a plugin? wasm? dylib? Or builtin?
4. What data should a plugin receive, and what output should it produce?

May 06 '25 16:05 HerringtonDarkholme

1. Maybe we could do it like constraints, allowing arbitrary field names for children, whose values are always a rule?

id: find-calls-to-react-components
language: tsx
rule:
  pattern: $COMPONENT($ARGS)
constraints:
  COMPONENT:
    pluginMatchers: # names of plugins, mapped to their rule
      definitionMatches:  # match against a symbol's *definition*
        pattern: function $NAME() { $$$ }
        has:
          # we assume a function that returns JSX is a component
          pattern: return <$X $$$ />
fix: <$COMPONENT ...$ARGS />

2. This is basically answered by 1: we can apply the same schema style as we do for constraints.
3. I was thinking dylib, but wasm is also an interesting idea. I'd need to do more research on how to do this and what would be best, to be honest.
4. As input, I think it would be enough if the plugin just received the SgNode(s) from the previously matched rule (in this example, COMPONENT and ARGS). That is enough information to do a lot of stuff, e.g. read a node's document range to query for the symbol's definition.
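
As a rough sketch of how a `pluginMatchers` map could be dispatched on the host side: each entry maps a plugin name to its rule body, and each plugin receives the captures from the previously matched rule. All names here (`Capture`, `Captures`, `PluginFn`, `eval_plugin_matchers`, `looks_like_component`) are assumptions for illustration, and the toy plugin only checks capitalization as a stand-in for a real definition lookup.

```rust
use std::collections::BTreeMap;

/// One metavariable capture handed to a plugin: its text and byte range.
#[derive(Debug, Clone, PartialEq)]
pub struct Capture {
    pub text: String,
    pub start: usize,
    pub end: usize,
}

/// Captures keyed by metavariable name, e.g. "COMPONENT" and "ARGS".
pub type Captures = BTreeMap<String, Capture>;

/// A plugin modeled as a plain function: (rule body from the YAML, captures) -> matched?
pub type PluginFn = fn(rule_body: &str, captures: &Captures) -> bool;

/// Evaluate every entry under the hypothetical `pluginMatchers` key.
/// Fails if a named plugin is not registered; otherwise all plugins must accept.
pub fn eval_plugin_matchers(
    matchers: &BTreeMap<String, String>,
    registry: &BTreeMap<String, PluginFn>,
    captures: &Captures,
) -> Result<bool, String> {
    for (name, rule_body) in matchers {
        let plugin = registry
            .get(name)
            .copied()
            .ok_or_else(|| format!("unknown plugin: {name}"))?;
        if !plugin(rule_body, captures) {
            return Ok(false);
        }
    }
    Ok(true)
}

/// Stand-in for `definitionMatches`: accepts when COMPONENT starts with an
/// uppercase letter. A real plugin would resolve the symbol's definition
/// (e.g. through an LSP server) and match `rule_body` against it.
pub fn looks_like_component(_rule_body: &str, captures: &Captures) -> bool {
    captures
        .get("COMPONENT")
        .and_then(|capture| capture.text.chars().next())
        .map_or(false, |first| first.is_uppercase())
}
```

Reporting an unknown plugin name as an error, rather than silently ignoring it, is one possible design choice: it would surface typos in rule files early.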

May 06 '25 16:05 ribru17

+1 to this!

We are building on top of ast-grep with a bunch of custom logic (using the Rust APIs directly), and this could be a great way for us to contribute back.

Relevant links:

GitLab Code Parser: https://gitlab.com/groups/gitlab-org/-/epics/17516
Ruby Parser example with basic data flow tracing: https://gitlab.com/gitlab-org/rust/gitlab-code-parser/-/merge_requests/1
Knowledge Graph (the larger goal): https://gitlab.com/groups/gitlab-org/-/epics/17514

May 06 '25 17:05 michaelangeloio

I really like the idea of making constraints extensible via plugins. There are cases when analyzing Angular code where I need to look at another file in order to tell something about a symbol in the current file. For example, an identifier is a component if another file contains a class with the same name that has a @Component() decorator on it. When analyzing components, we might want to tell whether a certain variable is used in that component's template (defined in a separate file) or not. Extensible constraints would, I think, make that kind of thing really easy.
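
A plugin for the cross-file check described above could keep an index of component class names built from the other files in the project. The sketch below is a deliberately naive line scan standing in for real parsing; `index_components` and `class_name` are made-up helper names, and it only handles a decorator placed on lines before a plain `class` or `export class` declaration.

```rust
use std::collections::HashSet;

/// Collect names of classes that are preceded by an `@Component` decorator.
/// Naive text scan for illustration; a real plugin would parse the files.
pub fn index_components<'a>(files: impl IntoIterator<Item = &'a str>) -> HashSet<String> {
    let mut components = HashSet::new();
    for content in files {
        let mut decorated = false;
        for line in content.lines() {
            let line = line.trim_start();
            if line.starts_with("@Component") {
                // Remember the decorator until the next class declaration.
                decorated = true;
            } else if let Some(name) = class_name(line) {
                if decorated {
                    components.insert(name.to_string());
                }
                decorated = false;
            }
        }
    }
    components
}

/// Extract `Name` from a line like `class Name {` or `export class Name {`.
fn class_name(line: &str) -> Option<&str> {
    let rest = line.strip_prefix("export ").unwrap_or(line);
    let rest = rest.strip_prefix("class ")?;
    let end = rest
        .find(|c: char| !(c.is_alphanumeric() || c == '_' || c == '$'))
        .unwrap_or(rest.len());
    (end > 0).then(|| &rest[..end])
}
```

A constraint plugin could then answer "is this identifier a component?" with `index.contains(identifier)`, building the index once per run rather than once per match.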

For loading, part of me leans towards wasm, because compiling a dylib and loading it in CI environments has already been tricky (compiling a tree-sitter grammar to use as a custom language requires setting things up weirdly). Extism is a good library for wasm plugin systems. However, wasm would limit native tool access and might limit integrations with things like language-specific tooling.

May 14 '25 16:05 samwightt

> I was thinking dylib, but wasm is also an interesting idea. I'd need to do more research on how to do this/what would be best, to be honest
>
> I think as input, if it just received the SgNode(s) from the previous matched rule (in this example, COMPONENT and ARGS), this would be enough information to do a lot of stuff. E.g. read node's document range to query for the symbol's definition

Questions 3 and 4 are related. Both wasm and dylib need a stable ABI, so the design of the input/output may affect the choice of plugin runtime.
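
To illustrate what a stable boundary might involve on the dylib side, here is a minimal sketch using `#[repr(C)]` structs and an `extern "C"` entry point. The layout and names (`AbiNode`, `AbiResult`, `plugin_match`, `call_plugin`) are assumptions, not an agreed design. In a real dylib the entry point would also need an unmangled symbol name, and the toy capitalization check only stands in for actual plugin logic.

```rust
/// C-compatible view of one node handed to a plugin (hypothetical layout).
#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct AbiNode {
    pub text_ptr: *const u8, // UTF-8 bytes, not NUL-terminated
    pub text_len: usize,
    pub start_byte: u32,
    pub end_byte: u32,
}

/// C-compatible match result (hypothetical layout).
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AbiResult {
    pub matched: bool,
    pub start_byte: u32,
    pub end_byte: u32,
}

/// Signature every plugin would export under this sketch.
pub type PluginMatchFn = unsafe extern "C" fn(node: *const AbiNode) -> AbiResult;

/// Example plugin entry point: matches nodes whose text starts with an
/// ASCII uppercase letter.
pub unsafe extern "C" fn plugin_match(node: *const AbiNode) -> AbiResult {
    let miss = AbiResult { matched: false, start_byte: 0, end_byte: 0 };
    if node.is_null() {
        return miss;
    }
    // SAFETY: the caller promises `node` points to a valid `AbiNode`.
    let node = unsafe { &*node };
    if node.text_ptr.is_null() {
        return miss;
    }
    // SAFETY: the caller promises `text_ptr` is valid for `text_len` bytes.
    let bytes = unsafe { std::slice::from_raw_parts(node.text_ptr, node.text_len) };
    match bytes.first() {
        Some(first) if first.is_ascii_uppercase() => AbiResult {
            matched: true,
            start_byte: node.start_byte,
            end_byte: node.end_byte,
        },
        _ => miss,
    }
}

/// Host-side helper that packs a node's text and offset into the ABI struct.
pub fn call_plugin(f: PluginMatchFn, text: &str, start: u32) -> AbiResult {
    let node = AbiNode {
        text_ptr: text.as_ptr(),
        text_len: text.len(),
        start_byte: start,
        end_byte: start + text.len() as u32,
    };
    // SAFETY: `node` and `text` outlive the call, and pointer/length are consistent.
    unsafe { f(&node) }
}
```

With wasm the same structs could not be passed as raw host pointers; the data would have to be copied into the guest's linear memory and addressed by offset. That is one way the input/output design and the runtime choice depend on each other.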

May 25 '25 01:05 HerringtonDarkholme

[feature] Plugin system
