lark
Feature idea: Rule element labels
Hello everyone,
I'm quite new to lark and I was wondering whether there is any plan to add rule element labels, similar to ANTLR's, to lark.
In ANTLR4 you are able to add this kind of label:
expression : left=expression OPERATOR right=expression;
This way we are able to access the left expression using context.left instead of context.expression(0).
Cheers, Paulo Santos
What if the labeled element does not result in a single child, e.g. left=(expr*)?
This seems interesting and doable.
Hmm, in that case left could probably be an empty list.
Perhaps a first step could be only allowing "single expressions" (without *, +, ?) and then, if possible, extend the behaviour to these.
I think it's a nice idea, but I'm worried that it would clutter the grammar.
Perhaps a cleaner approach, with a similar effect, would be to add the method Tree.children_by_name(name), which will return every Tree child with data==name. We can also make a Tree.child_by_name(name) that expects to find only one child.
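To make the two proposed methods concrete, here is a minimal sketch. The `Tree` class below is a simplified stand-in with only `data` and `children`, not lark's actual `Tree`, and `children_by_name` / `child_by_name` are the names proposed in this comment, not existing lark methods. Raising `ValueError` when the count is not exactly one is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Tree:
    """Simplified stand-in for a parse tree node (not lark's real Tree)."""
    data: str
    children: List[Union["Tree", str]] = field(default_factory=list)

    def children_by_name(self, name: str) -> List["Tree"]:
        # Only Tree children carry `data`; plain tokens (strings here) are skipped.
        return [c for c in self.children if isinstance(c, Tree) and c.data == name]

    def child_by_name(self, name: str) -> "Tree":
        # Expects exactly one match; anything else is treated as an error.
        matches = self.children_by_name(name)
        if len(matches) != 1:
            raise ValueError(f"expected exactly one {name!r} child, found {len(matches)}")
        return matches[0]


tree = Tree("expression", [Tree("left", ["1"]), "+", Tree("right", ["2"])])
left = tree.child_by_name("left")
```

Note that this only helps when the children already have distinct `data` values, which is not the case for `expression OPERATOR expression`.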
Another option would be to add a feature to v_args, something like:
@v_args(keyword=True)
def expression(self, left, right, **other_keyword):
    ...
Where the order wouldn't matter, since everything is given as keywords.
There's still the one vs many nuance, but I'm sure we could figure it out.
One quality-of-life feature I already had in mind before this issue is making the Tree object more magical: make tree[0] equivalent to tree.children[0], and the same for iteration. Then we can also add a __getattr__ method that gets one/all children with that data attribute.
I don't think I'd want the default Tree to become too magical, that would be confusing.
But luckily Lark supports overriding the default with tree_class=...
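A rough sketch of what such an opt-in "magical" tree could look like. The base `Tree` here is a simplified stand-in so the example runs on its own; with lark itself one would presumably subclass lark's `Tree` and pass the subclass via `tree_class=...`. `MagicTree` is a hypothetical name, and returning a single child for one match but a list for several is just one possible answer to the one-vs-many question.

```python
class Tree:
    """Simplified stand-in for a parse tree node (assumption: only data and children)."""

    def __init__(self, data, children):
        self.data = data
        self.children = children


class MagicTree(Tree):
    """Hypothetical subclass adding indexing, iteration and attribute access."""

    def __getitem__(self, index):
        return self.children[index]

    def __iter__(self):
        return iter(self.children)

    def __len__(self):
        return len(self.children)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        # Read from __dict__ to avoid recursing before `children` is set.
        children = self.__dict__.get("children", [])
        matches = [c for c in children if isinstance(c, Tree) and c.data == name]
        if not matches:
            raise AttributeError(name)
        return matches[0] if len(matches) == 1 else matches


tree = MagicTree("expression", [MagicTree("left", ["1"]), "+", MagicTree("right", ["2"])])
operator = tree[1]
left = tree.left
```

A rule named like an existing attribute (for example `data` or `children`) would never reach `__getattr__`, which is one reason to keep this out of the default class.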
Coming back to this, I realize my suggestion wasn't an actual alternative. But perhaps something like this might do?
expression : left OPERATOR right
- left: expression
- right: expression
That would allow something like:
@v_args(keyword=True)
def expression(self, left, right, OPERATOR):
    ...
@erezsh I like the syntax, but how would you implement it? The only way I can currently think of involves non-trivial extra analysis and tracking inside a single BNF rule, e.g. each symbol would have to remember where it comes from and which name group it belongs to.
@MegaIng I don't see why it would be any different than the | syntax, except with /\s+-/ as the operator. And then adding an optional alias property to Symbol. Doesn't sound like a big deal. The v_args(keyword=True) decorator can do the aggregation by data.
I can't say for sure that it's worth it, but it's not infeasible or especially complicated (I think).
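For illustration, the aggregation step could look roughly like this. Everything here is hypothetical: `Child` stands in for a parsed child that already carries its name (the alias for a rule, or the terminal name), and `call_with_keywords` stands in for what a `v_args(keyword=True)` wrapper might do internally. Neither exists in lark.

```python
from collections import defaultdict
from typing import Any, Callable, List, NamedTuple


class Child(NamedTuple):
    """Hypothetical parsed child: its name (alias or terminal name) and its value."""
    name: str
    value: Any


def call_with_keywords(callback: Callable[..., Any], children: List[Child]) -> Any:
    # Group child values by name, preserving their order within each group.
    groups = defaultdict(list)
    for child in children:
        groups[child.name].append(child.value)
    # One vs. many (an assumption): a name seen once is passed as a single
    # value, a name seen several times is passed as a list.
    kwargs = {name: vals[0] if len(vals) == 1 else vals for name, vals in groups.items()}
    return callback(**kwargs)


def expression(left, right, OPERATOR):
    # Order of the parameters does not matter; everything arrives as keywords.
    return (OPERATOR, left, right)


result = call_with_keywords(
    expression,
    [Child("left", 1), Child("OPERATOR", "+"), Child("right", 2)],
)
```

Where the names come from (the tree itself or the parser instance) is left open in this sketch.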
@erezsh So you would for the moment only allow direct rule names? Also, currently v_args does not have access to the list of rules originally used. So we either need to add that information to the Tree (probably inside meta) or make v_args lark instance specific.