lark
Added parse_lark_grammar.py and the syntax parameter/registry
I think the core design here stands. I would like feedback on that already.
What's primarily missing here are tests, examples, and docs.
It needs a little tidying up but overall it looks alright.
A few things:
- We already use '' as a magic constant. I think anything else shouldn't concern us.
- RE_FLAGS isn't a good enough reason for having a recursive import.
- I think the default for syntax should be "lark". And it shouldn't override the extension, if one exists. If anyone wants to augment the lark syntax, they can use "grammar.better_lark" or whatever.

edit: So maybe it's better to rename it to default_syntax
We already use '' as a magic constant. I think anything else shouldn't concern us.
??
RE_FLAGS isn't a good enough reason for having a recursive import.
Fair enough. For some reason I thought it was more than just that.
Currently the order is (syntax_hint, extension, "lark"). You propose (extension, syntax_hint, "lark") or (extension, default_syntax)?
Oops sorry. I meant <string>
The last one. Unless you can think of a good reason to choose one of the others.
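To make the (extension, default_syntax) ordering concrete, here is a minimal sketch. The helper name resolve_syntax and the KNOWN_SYNTAXES set are hypothetical and not part of this PR; the sketch only shows the lookup order being discussed.

```python
import os

# Hypothetical set of registered syntax names (an assumption, not from the PR).
KNOWN_SYNTAXES = {"lark", "abnf"}


def resolve_syntax(path, default_syntax="lark"):
    """Pick a syntax using the (extension, default_syntax) order.

    The file extension wins when it names a known syntax; otherwise
    default_syntax is used. Extensions are compared exactly, so how a
    name like foo.ABNF should be treated is left open here.
    """
    ext = os.path.splitext(path)[1].lstrip(".")
    if ext in KNOWN_SYNTAXES:
        return ext
    return default_syntax
```

With this ordering, resolve_syntax("foo.abnf") picks "abnf" even when default_syntax is "lark", while an unknown extension such as foo.g falls through to the default.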
My idea was to always fall back to lark in case people have grammars named .grammar or .g or something like that.
That reminds me: because of this, we should also start loading files with different extensions.
That's a good point. I know some users had those for Lark 0.x
I guess my thinking is that it's okay to force them to use .lark for 1.x?
If we do it your way, I have a few questions:
- What happens when someone does %import foo (bar)? Do we look for foo.g? And how do we know how to do it? If it's only for Lark.open, we can say syntax only overwrites that.
  P.S. Another option is that they can register_syntax('g', lark_parse_grammar) or something of that sort.
- If we have foo.ABNF? Do we use lark and throw a confusing syntax error?
- If we %import foo and have foo.abnf but syntax="lark", will it try to parse it as lark?
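The registration idea from the P.S. could look roughly like the sketch below. Everything here is an assumption for illustration: the registry dict, get_parser, and the body of lark_parse_grammar are placeholders, and register_syntax is only the name floated above, not an existing Lark API.

```python
# Hypothetical registry mapping a syntax name (or extension) to a parse function.
_SYNTAX_REGISTRY = {}


def register_syntax(name, parse_func):
    """Associate a syntax name such as 'g' with a grammar-parsing function."""
    _SYNTAX_REGISTRY[name] = parse_func


def get_parser(name):
    """Look up a registered parse function, failing loudly on unknown names."""
    try:
        return _SYNTAX_REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown grammar syntax: {name!r}") from None


def lark_parse_grammar(text):
    # Placeholder standing in for the real lark grammar loader.
    return ("lark", text)


# Users with legacy .g files could opt in explicitly.
register_syntax("lark", lark_parse_grammar)
register_syntax("g", lark_parse_grammar)
```

The appeal of this approach is that nothing is guessed: a .g file is parsed as lark only because someone registered it that way.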
And a general question: what happens if we have both foo.lark and foo.abnf? Makes me wonder if perhaps we shouldn't try to guess file extensions automatically on import in the first place.
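If we did keep guessing extensions on import, one possible way to handle the foo.lark / foo.abnf case is to refuse to choose. This is only a sketch under that assumption; find_import_candidates and resolve_import are hypothetical names, and the extension list is made up for the example.

```python
import os


def find_import_candidates(module, search_dir, extensions=("lark", "abnf")):
    """Return existing files named module.<ext> in search_dir, in extension order."""
    found = []
    for ext in extensions:
        path = os.path.join(search_dir, f"{module}.{ext}")
        if os.path.isfile(path):
            found.append(path)
    return found


def resolve_import(module, search_dir, extensions=("lark", "abnf")):
    """Resolve an import to exactly one file, or raise instead of guessing."""
    candidates = find_import_candidates(module, search_dir, extensions)
    if not candidates:
        raise FileNotFoundError(f"No grammar file found for import {module!r}")
    if len(candidates) > 1:
        # Both foo.lark and foo.abnf exist: make the user disambiguate.
        raise ValueError(f"Ambiguous import {module!r}: {candidates}")
    return candidates[0]
```

Raising on ambiguity at least avoids a confusing syntax error from parsing the wrong file, though it doesn't answer whether guessing is worth it at all.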