grmtools
Add support for comments in lrlex files
I started using lrlex and couldn't find a formal definition of its file format, but there seems to be no way to write comments in .l files. I think that would be a useful feature to support.
Generally, lrlex has focused on following the POSIX lex specification (though, if I recall correctly, it hasn't implemented comments yet). In my reading I didn't see any specific documentation of the comment format in lex. It is a little awkward: it generally mimics C-style comments, but a comment must be preceded by whitespace to avoid ambiguity with regexes, except where it sits inside a block of C code that gets copied verbatim.
Here are some examples, the latter showing some of the cases where initial whitespace is not required: https://cs.gmu.edu/~henryh/330/Lex/comments.html https://pubs.opengroup.org/onlinepubs/9699919799/utilities/lex.html#tag_20_65_17
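For illustration, here is a minimal sketch of the "must be preceded by whitespace" rule described above. The helper name looks_like_comment is hypothetical and is not part of lrlex; it only checks a single line and does not handle comments inside %{ ... %} blocks or comments spanning several lines.

```rust
/// Hypothetical helper (not an lrlex API): returns true if `line` starts with
/// at least one blank (space or tab) followed by the C-style opener `/*`.
fn looks_like_comment(line: &str) -> bool {
    let trimmed = line.trim_start_matches(|c: char| c == ' ' || c == '\t');
    // Require at least one leading blank, so that a regex beginning with
    // `/*` in column 0 is still treated as a rule rather than a comment.
    trimmed.len() < line.len() && trimmed.starts_with("/*")
}
```

With this check, a line such as " /* integers */" would count as a comment, while "/*x 'TOK'" in column 0 would still be read as a rule.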
Agreed, it would be good if lrlex supported comments, and I'd happily take a PR to that effect! If flex supports them, I might also be inclined to support // comments, but I don't feel strongly about it.
So, in #325 I had noticed the following text in the POSIX lex spec:
Any such input (beginning with a <blank> or within "%{" and "%}" delimiter lines) appearing at the beginning of the Rules section before any rules are specified shall be written to lex.yy.c
This is relevant to this issue: for now, it would be easy enough to just ignore any line which starts with a space.
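As a rough sketch of the "just ignore it" option, a pre-pass over the Rules section could drop every line that begins with a blank. The function name drop_blank_prefixed_lines is an assumption for illustration only, not existing lrlex code, and it deliberately ignores %{ ... %} blocks.

```rust
/// Hypothetical pre-pass (not an lrlex API): keeps only the lines of the
/// Rules section that do not begin with a blank (space or tab).
fn drop_blank_prefixed_lines(rules_section: &str) -> Vec<&str> {
    rules_section
        .lines()
        // Lines starting with a blank are what POSIX lex would copy verbatim;
        // here they are simply discarded.
        .filter(|l| !l.starts_with(' ') && !l.starts_with('\t'))
        .collect()
}
```

The remaining lines could then be handed to the existing rule parser unchanged.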
It seems that if we tried to emit these verbatim into generated sources, they would end up in the middle of a vec![ Rule::new(), ..].
In order to actually emit them we'd need to change uses of struct Rule to something like enum RuleOrVerbatim<StorageT> { Verbatim(String), Rule(Rule<StorageT>) }, as currently there isn't anywhere for them in our AST of the lex source format.
I believe Rule is public but #[doc(hidden)] and otherwise documented as unstable, so perhaps changing it to an enum is acceptable. But let me know if there are preferences here between emitting these verbatim or ignoring them entirely.
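To make the enum option more concrete, here is a sketch under stated assumptions. The Rule struct below is a simplified stand-in with made-up fields (tok_id, re_str), not lrlex's actual definition, and rules_only is a hypothetical helper showing how existing code that only cares about rules might skip verbatim text.

```rust
/// Simplified stand-in for lrlex's `Rule`; the fields here are assumptions.
#[derive(Debug)]
struct Rule<StorageT> {
    tok_id: Option<StorageT>,
    re_str: String,
}

/// Either a chunk of text to copy verbatim, or an ordinary lexing rule.
#[derive(Debug)]
enum RuleOrVerbatim<StorageT> {
    Verbatim(String),
    Rule(Rule<StorageT>),
}

/// Hypothetical helper: collects references to the real rules only,
/// preserving their original order and skipping verbatim entries.
fn rules_only<StorageT>(items: &[RuleOrVerbatim<StorageT>]) -> Vec<&Rule<StorageT>> {
    items
        .iter()
        .filter_map(|item| match item {
            RuleOrVerbatim::Rule(r) => Some(r),
            RuleOrVerbatim::Verbatim(_) => None,
        })
        .collect()
}
```

The trade-off is that every consumer of the rule list has to match on the enum (or go through a helper like this), whereas ignoring blank-prefixed lines leaves the AST untouched.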