lexgen icon indicating copy to clipboard operation
lexgen copied to clipboard

Avoid redunant backtrack state updates

Open osa1 opened this issue 2 years ago • 1 comments

In the Lua lexer I see code like

'>' => {
    self.0.set_accepting_state(Lexer_ACTION_13);          // 2
    match self.0.next() {
        None => {
            self.0.__done = true;
            match self.0.backtrack() {                    // 6
                ...
            }
        }
        Some(char) => match char {
            '>' => {
                self.0.reset_accepting_state();           // 12
                match Lexer_ACTION_31(self) {
                    ...
                }
            }
            '=' => {
                self.0.reset_accepting_state();           // 18
                match Lexer_ACTION_11(self) {
                    ...
                }
            }
            _ => match self.0.backtrack() {               // 23
                ...
            },
        },
    }
}

Here we don't need to set accepting state in line 2. Lines 6 and 23 call the semantic action set in line 2, and could call the function directly instead of going through backtrack. Lines 12 and 18 ignore accepting state entirely and call other semantic action functions.

I think if we move set_accepting_state calls to the branches it may become easier in the code generator to handle these cases. In that case the code above would look like:

'>' => {
    match self.0.next() {
        None => {
            self.0.set_accepting_state(Lexer_ACTION_13);       
            self.0.__done = true;
            match self.0.backtrack() {                
                ...
            }
        }
        Some(char) => match char {
            '>' => {
                self.0.set_accepting_state(Lexer_ACTION_13);       
                self.0.reset_accepting_state();      
                match Lexer_ACTION_31(self) {
                    ...
                }
            }
            '=' => {
                self.0.set_accepting_state(Lexer_ACTION_13);       
                self.0.reset_accepting_state();     
                match Lexer_ACTION_11(self) {
                    ...
                }
            }
            _ => {
                self.0.set_accepting_state(Lexer_ACTION_13);       
                match self.0.backtrack() {        
                    ...
                }
            }
        },
    }
}

Now in generate_state_arm, we can avoid generating set_accepting_state if we backtrack or call another semantic action function. We also need to use peek instead of next.

osa1 avatar Feb 07 '22 13:02 osa1