Define lexemes in terms of other lexemes

Open bendrissou opened this issue 1 year ago • 2 comments

Hi,

It's not clear to me from the documentation whether it's possible to define lexemes in terms of other lexemes.

E.g.

[0-9] "DIGIT"
[a-zA-Z] "LETTER"
{LETTER}({LETTER}|{DIGIT})* "ID"

Is this Lex feature supported by grmtools?

Thank you.

bendrissou avatar Jun 25 '24 14:06 bendrissou

It's not supported. I would not object to a PR which added it!

ltratt avatar Jun 25 '24 14:06 ltratt

Also linking #417 since it is very much related, although POSIX lex uses a slightly different syntax for this.

DIGIT    [0-9]
ID       [a-z][a-z0-9]*
%%
{DIGIT}+ "DIGITS"

Would we really want both? Interestingly, they have slightly different semantics: a POSIX lex definition/substitution can, say, avoid emitting a lexeme for DIGIT and only produce one for DIGITS.

{DIGIT}+ "DIGITS"
[0-9] "DIGIT"

Defining them in terms of another lexeme, as above, would seem to emit both, and would also be somewhat more prone to issues involving overlap. I do see the appeal, however!
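For illustration, the substitution step itself could be sketched roughly as below. This is not grmtools code: `expand` is a hypothetical helper, and it assumes definitions are acyclic and that `{` and `}` never appear escaped or inside a character class. Each expansion is wrapped in parentheses so that a trailing operator such as `+` applies to the whole definition, and braces that do not name a definition (e.g. `{2,3}`) are left alone.

```rust
use std::collections::HashMap;

/// Hypothetical helper: expand `{NAME}` references in `pattern` using `defs`.
/// Assumes `defs` is acyclic; braces that do not name a definition
/// (such as the repetition `{2,3}`) are copied through unchanged.
fn expand(pattern: &str, defs: &HashMap<&str, &str>) -> String {
    let mut out = String::new();
    let mut rest = pattern;
    while let Some(open) = rest.find('{') {
        // Copy everything before the opening brace verbatim.
        out.push_str(&rest[..open]);
        let after = &rest[open + 1..];
        match after.find('}') {
            Some(close) => {
                let name = &after[..close];
                match defs.get(name) {
                    // Known definition: substitute it, parenthesised,
                    // expanding any references it contains in turn.
                    Some(body) => {
                        out.push('(');
                        out.push_str(&expand(body, defs));
                        out.push(')');
                    }
                    // Not a definition name: keep the braces as written.
                    None => {
                        out.push('{');
                        out.push_str(name);
                        out.push('}');
                    }
                }
                rest = &after[close + 1..];
            }
            // Unterminated brace: copy the remainder and stop.
            None => {
                out.push_str(&rest[open..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

fn main() {
    let mut defs = HashMap::new();
    defs.insert("DIGIT", "[0-9]");
    defs.insert("LETTER", "[a-zA-Z]");
    // The rule from the original question, with references expanded.
    println!("{}", expand("{LETTER}({LETTER}|{DIGIT})*", &defs));
}
```

With POSIX-style definitions, only the expanded rule would end up as a lexeme. With the lexeme-referencing form, DIGIT and LETTER would presumably remain lexemes in their own right, which is where the overlap concern comes from.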

ratmice avatar Mar 22 '25 20:03 ratmice