Parser icon indicating copy to clipboard operation
Parser copied to clipboard

Add `Node` union

Open MichaReiser opened this issue 2 years ago • 5 comments

Add a new Node union that is an enum over all node types. This can be useful when implementing methods that can operate on any node. For example, Ruff's formatter has (roughly) the API format(node: AnyNode) -> String

I haven't figured out the full requirements yet, and I don't know yet if we'll need both the owned and reference versions:

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Node {
	IfStmt(ast::IfStmt),
	ConstantExpr(ast::ConstantExpr),
	...
}

#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash, is_macro)]
enum NodeRef<'a> {
	IfStmt(&'a ast::IfStmt),
	ConstantExpr(&'a ast::ConstantExpr)
	...
}

Note: An alternative naming more in line with Path and PathBuf would be to name the types NodeBuf (owned) and Node (reference)

The enums should have the following basic methods:

  • if_stmt(self) -> Option<ast::IfStmt>
  • as_if_stmt(&self) -> Option<&ast::IfStmt>
  • const is_if_stmt(&self) -> bool

Node could also implement AsRef that returns a NodeRef

I may have time to work on this sometime soon but I wanted to put this up for discussion first to get feedback.

MichaReiser avatar May 17 '23 06:05 MichaReiser

  1. Is it not covered by Node trait? If you need a concrete type of union, it will not be covered. If you need only a common method interface, it will be enough.

  2. Does it need to include all of the nodes, not the top level nodes? I think the top level ast.AST node in Python matches to Rust

pub enum AST {
    Stmt(ast::Stmt),
    Expr(ast::Expr),
    ...
}

Then IfStmt and ConstantExpr will be under each variant. Do you need to flatten all the nodes?

youknowone avatar May 17 '23 09:05 youknowone

Is it not covered by Node trait? In our use case, we need to call the formatting function for the specific node variant, extract all fields and then perform the formatting. We could implement this with the node trait in a suboptimal way if it has a downcast_ref(type) -> Option<T> method that downcasts the node to the given type. However, this would require that Ruff calls downcast_ref for every possible type until it finds the right type. Having an enum would allow us to implement this static-dispatch to the right node.

Does it need to include all of the nodes, not the top level nodes? I think the top level ast.AST node in Python matches to Rust

I haven't thought about it much but we can probably do either and both approaches have pros and cons:

  • top-level: Fewer variants to handle per level.
  • flat:
    • Easier to get to a specific node without nested matches (you can use as_constant_expr directly
    • Should be smaller because we avoid nested enums.

MichaReiser avatar May 17 '23 09:05 MichaReiser

Because adding top-level nodes to the root type AST makes consistent interface, how about adding it first and see if it is enough or not? We can add a new one if it is not enough.

youknowone avatar May 17 '23 13:05 youknowone

What do you mean by top level? Do you mean the 'Stmt' and Expr (ond others) enums or something else?

We can also add this to ruff_python_ast first and upstream it when we've figured out the api and you believe that this would be useful for RustPython too

MichaReiser avatar May 17 '23 17:05 MichaReiser

In ast module of python, nodes directly inheriting AST. (= T.__base__ == ast.AST) This is the full list.

  • Mod
  • Stmt
  • Expr
  • ExprContext
  • Boolop
  • Operator
  • Unaryop
  • Cmpop
  • Comprehension
  • Excepthandler
  • Arguments
  • Arg
  • Keyword
  • Alias
  • Withitem
  • MatchCase
  • Pattern
  • TypeIgnore

youknowone avatar May 18 '23 02:05 youknowone