Combining Lexer Match Actions and Token Remapping
From the examples, one can attach an action that runs when a lexical rule matches:
@_(r'\d+')
def NUMBER(self, t):
    t.value = int(t.value)   # Convert to a numeric value
    return t
One can also remap tokens:
ID = r'[a-zA-Z_][a-zA-Z0-9_]*'
ID['if'] = IF
ID['else'] = ELSE
These cannot be combined, since if you define a function to perform an action, the next remap attempt raises an error:
TypeError: 'function' object does not support item assignment
What is the recommended way to use both of these techniques in a lexical token?
I assume the function could examine the matched value (say, the string in ID) with something like if t.value == 'if', but how would it return a different token?
The two techniques can't be combined. In fact, the whole token remapping feature was meant to replace the need for writing a function like this (which was commonplace):
keywords = { 'if', 'else', 'while' }

@_(r'[a-zA-Z_][a-zA-Z0-9_]*')
def ID(self, t):
    if t.value in keywords:
        t.type = t.value.upper()
    return t
As shown in the function, the token type can be changed by assigning a different value to t.type.
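To make the mechanism concrete without depending on the library, here is a minimal sketch using only Python's standard `re` module. It is not SLY's implementation. The names `Token`, `RULES`, `number_action`, `id_action`, and `tokenize` are hypothetical stand-ins chosen for illustration. It shows a value-converting action and a keyword retyping action living side by side, with the retyping done by assigning to `t.type` as described above.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    # Hypothetical stand-in for a lexer token: a type name plus the matched value.
    type: str
    value: object


KEYWORDS = {'if', 'else', 'while'}


def number_action(t):
    # Match action: convert the matched text to a numeric value.
    t.value = int(t.value)
    return t


def id_action(t):
    # Match action that also retypes: keywords get their own token type.
    if t.value in KEYWORDS:
        t.type = t.value.upper()
    return t


# (token name, pattern, action) triples, tried in order.
RULES = [
    ('NUMBER', r'\d+', number_action),
    ('ID', r'[a-zA-Z_][a-zA-Z0-9_]*', id_action),
]

# One master pattern with a named group per rule.
MASTER = re.compile('|'.join(f'(?P<{name}>{pattern})' for name, pattern, _ in RULES))
ACTIONS = {name: action for name, _, action in RULES}


def tokenize(text):
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # skip whitespace between tokens
            continue
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f'Illegal character {text[pos]!r} at index {pos}')
        # m.lastgroup names the rule that matched; run its action on the token.
        yield ACTIONS[m.lastgroup](Token(m.lastgroup, m.group()))
        pos = m.end()


tokens = [(t.type, t.value) for t in tokenize('if x1 else 42')]
print(tokens)
```

In this sketch the identifier rule still matches keywords first. The action then decides what type the token carries, which is the same division of labor as in the ID function above.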
Great, thanks!