
Parse error when using arrow function in rules

Open Mutefish0 opened this issue 8 years ago • 3 comments

The following grammar fails with a parse error:

let grammar = {
    lex: {
        rules: [
            ['\\s+', ''],
            ['\\d+', () => 'NUMBER'],
            ['\\+', () => '+'],
            ['$', () => 'EOF'],
        ]
    },
    operators: [
        ['left', '+']
    ],
    bnf: {
        'es': [
            ['e EOF', 'return $1']
        ],
        'e': [
            ['e + e', '$$ = $1 + $3'],
            ['NUMBER', '$$ = Number(yytext)']
        ]
    }
}

whereas the same grammar with regular function expressions works:

let grammar = {
    lex: {
        rules: [
            ['\\s+', ''],
            ['\\d+', function () { return 'NUMBER'}],
            ['\\+', function () { return '+' }],
            ['$',  function () { return 'EOF' }],
        ]
    },
    operators: [
        ['left', '+']
    ],
    bnf: {
        'es': [
            ['e EOF', 'return $1']
        ],
        'e': [
            ['e + e', '$$ = $1 + $3'],
            ['NUMBER', '$$ = Number(yytext)']
        ]
    }
}

Mutefish0, Oct 16 '17 11:10

The lexer generator (jison-lex) specifically looks for the return 'LABEL' pattern in the lexer rule action code blocks, so that it can replace the returned string with a token number when the lexer is combined with a grammar. This MAY be the cause of your trouble, though (IIRC) the parser run-time kernel has code that maps a token string to its token number after the fact, to catch any such lexer token return slip-ups before they reach the grammar parser proper.

That said, I wonder why that fallback apparently doesn't kick in for your grammar, so further diagnosis is needed to answer this one without the hand-waving I'm doing now.
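To make the return 'LABEL' rewriting concrete, here is a minimal sketch. The regex is the single-quote pattern from the regexp-lexer snippet quoted later in this thread; the token table, the shape of tokenNumberReplacement, and the rewriteAction helper are assumptions for illustration only, not the actual jison-lex code.

```javascript
// Hypothetical token table: the real numbers are assigned by the parser generator.
const tokenIds = { NUMBER: 3, '+': 4, EOF: 5 };

// Assumed shape of the replacement callback: swap a known token name
// for its number, leave anything unknown untouched.
function tokenNumberReplacement(match, name) {
    return Object.prototype.hasOwnProperty.call(tokenIds, name)
        ? 'return ' + tokenIds[name]
        : match;
}

// Hypothetical helper: apply the single-quote pattern to an action's source text.
function rewriteAction(action) {
    return action.replace(/return\s*'((?:\\'|[^']+)+)'/g, tokenNumberReplacement);
}

// Body of a classic function action: contains a literal `return '...'`.
const fromFunction = rewriteAction(" return 'NUMBER' ");
// Source text of an arrow function action: there is no `return` to match.
const fromArrow = rewriteAction("() => 'NUMBER'");

console.log(JSON.stringify(fromFunction));
console.log(JSON.stringify(fromArrow));
```

In this sketch the first action gets its token name replaced by a number, while the arrow function text passes through unchanged because it contains no return statement for the pattern to find.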

GerHobbelt, Oct 23 '17 00:10

Further diagnosis turns up this: when a rule action is defined as a function instead of a string, the code generator specifically looks for the function () {...} pattern, and therefore does not (yet) support arrow functions as used in your example above.

Relevant code snippet in regexp-lexer, taken from the GerHobbelt/jison fork (TODO comment added today):

        newRules.push(m);
        if (typeof rule[1] === 'function') {
            // TODO: also cope with Arrow Functions (and inline those as well?) -- see also https://github.com/zaach/jison-lex/issues/23
            rule[1] = String(rule[1]).replace(/^\s*function\s*\(\)\s?\{/, '').replace(/\}\s*$/, '');
        }
        action = rule[1];
        action = action.replace(/return\s*'((?:\\'|[^']+)+)'/g, tokenNumberReplacement);
        action = action.replace(/return\s*"((?:\\"|[^"]+)+)"/g, tokenNumberReplacement);
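The effect of the wrapper-stripping step above can be reproduced in isolation. This is a small sketch that reuses the two replace() regexes from the snippet; the stripWrapper helper name is made up for the demonstration.

```javascript
// Hypothetical helper wrapping the two replace() calls from the snippet above.
function stripWrapper(action) {
    return String(action)
        .replace(/^\s*function\s*\(\)\s?\{/, '')
        .replace(/\}\s*$/, '');
}

const classic = function () { return 'NUMBER'; };
const arrow = () => 'NUMBER';

// The classic function loses its `function () {` prefix and closing `}`.
const classicBody = stripWrapper(classic);
// The arrow function matches neither regex and comes back as-is.
const arrowBody = stripWrapper(arrow);

console.log(JSON.stringify(classicBody));
console.log(JSON.stringify(arrowBody));
```

For the classic function only the body remains, including its return statement. The arrow function source is left untouched, so the following return '...' replacements have nothing to match.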

GerHobbelt, Oct 30 '17 15:10

FYI: this issue is now fixed in jison-gho (https://www.npmjs.com/package/jison-gho) since NPM build 0.6.1-211 i.e. 'build 211'.

(Your examples have been included as /examples/issue-lex-23*.js, and altered versions for jison-lex specifically as /packages/jison-lex/tests/spec/issue-23*.js)
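For readers stuck on a build without the fix, one way to normalize both function styles into an action body is sketched below. This is NOT the jison-gho implementation; the actionSource helper and its regexes are hypothetical, and it only handles zero-argument actions like the ones in this issue.

```javascript
// Hypothetical helper: turn a zero-argument action function into body source text.
function actionSource(action) {
    const src = String(action).trim();

    // Classic form: function () { body }
    let m = /^function\s*\(\)\s*\{([\s\S]*)\}$/.exec(src);
    if (m) return m[1];

    // Arrow function with a block body: () => { body }
    m = /^\(\)\s*=>\s*\{([\s\S]*)\}$/.exec(src);
    if (m) return m[1];

    // Arrow function with an expression body: () => expr
    // The expression is the implicit return value, so make the return explicit.
    m = /^\(\)\s*=>\s*([\s\S]+)$/.exec(src);
    if (m) return 'return ' + m[1] + ';';

    // Anything else (including plain strings) is passed through unchanged.
    return src;
}

console.log(actionSource(() => 'NUMBER'));
console.log(actionSource(() => { return 'EOF'; }));
console.log(actionSource(function () { return '+' }));
```

Making the return explicit matters here because the token replacement step searches for a literal return '...' in the action text.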

GerHobbelt, Dec 13 '17 02:12