lark icon indicating copy to clipboard operation
lark copied to clipboard

Added parse_lark_grammar.py and the syntax parameter/registry

Open MegaIng opened this issue 3 years ago • 7 comments

I think the core design here stands. I would like feedback on that already.

Primarily missing here is tests, examples and docs.

MegaIng avatar Oct 15 '21 11:10 MegaIng

It needs a little tidying up but overall it looks alright.

A few things:

  • We already use '' as a magic constant. I think anything else shouldn't concern us.

  • RE_FLAGS isn't a good enough reason for having a recursive import.

  • I think the default for syntax should be "lark". And it shouldn't override the extension, if one exists. If anyone wants to augment the lark syntax, they can use "grammar.better_lark" or whatever.

    edit: So maybe it's better to rename it to default_syntax

erezsh avatar Oct 15 '21 12:10 erezsh

We already use '' as a magic constant. I think anything else shouldn't concern us.

??

RE_FLAGS isn't a good enough reason for having a recursive import.

Fair enough. For some reason I thought it was more than just that.

Currently the order is (syntax_hint, extension, "lark"). You propose (extension, syntax_hint, "lark") or (extension, default_syntax)?

MegaIng avatar Oct 15 '21 12:10 MegaIng

Oops sorry. I meant <string>

erezsh avatar Oct 15 '21 12:10 erezsh

The last one. Unless you can think of a good reason to choose one of the others.

erezsh avatar Oct 15 '21 12:10 erezsh

My idea was to always fall back to lark in case people have grammars named .grammar or .g or something like that.

MegaIng avatar Oct 15 '21 12:10 MegaIng

That reminds me, we should also because of this start loading files with different extensions.

MegaIng avatar Oct 15 '21 12:10 MegaIng

That's a good point. I know some users had those for Lark 0.x

I guess my thinking is that it's okay to force them to use .lark for 1.x?

If we do it your way, I have a few questions -

  • what happens when someone does %import foo (bar)? Do we look for foo.g? And how do we know how to do it? If it's only for Lark.open, we can say syntax only overwrites that.

P.s. another option is that they can register_syntax('g', lark_parse_grammar) or something of that sort

  • If we have foo.ABNF? Do we use lark and throw a confusing syntax error?

  • If we %import foo and have foo.abnf but syntax="lark", will it try to parse it as lark?

And a general question - what happens if we have both foo.lark and foo.abnf? Makes me wonder if perhaps we shouldn't try to guess file extensions automatically on import in the first place..

erezsh avatar Oct 15 '21 12:10 erezsh