lex-parser

RE group production emits a non-captured group

Open ericprud opened this issue 1 year ago • 0 comments

For deep lexer rules like ShEx's PN_CHARS_BASE, the emitted rule has an enormous number of capture groups. Parsing a large input like FHIR.shex then fails with a stack error:

/home/eric/checkouts/shexSpec/shex.js/packages/shex-parser/shex-parser.js:251
      throw errors[0];
      ^

RangeError: Maximum call stack size exceeded
    at String.match (<anonymous>)
    at JisonLexer.next (/home/eric/checkouts/shexSpec/shex.js/node_modules/@ts-jison/lexer/lib/lexer.js:225:37)
    at JisonLexer.lex (/home/eric/checkouts/shexSpec/shex.js/node_modules/@ts-jison/lexer/lib/lexer.js:269:22)
    at JisonLexer.lex (/home/eric/checkouts/shexSpec/shex.js/node_modules/@ts-jison/lexer/lib/lexer.js:274:25)
    at lex (/home/eric/checkouts/shexSpec/shex.js/node_modules/@ts-jison/parser/lib/parser.js:51:28)
    at JisonParser.parse (/home/eric/checkouts/shexSpec/shex.js/node_modules/@ts-jison/parser/lib/parser.js:68:30)
    at ShExJisonParser.runParser [as parse] (/home/eric/checkouts/shexSpec/shex.js/packages/shex-parser/shex-parser.js:231:22)
    at Object.<anonymous> (/home/eric/checkouts/shexSpec/shex.js/parseFhir.js:10:38)
    at Module._compile (node:internal/modules/cjs/loader:1119:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1173:10) {
  parsed: null
}

Eliminating capture groups fixes the problem and makes parsing wayyyy faster.
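
To illustrate the change, here is a minimal sketch that rewrites plain capturing groups `( ... )` in a regex source string as non-capturing groups `(?: ... )`. The helper name `toNonCapturing` is hypothetical and is not part of jison, ts-jison, or lex-parser. It assumes the pattern contains no backreferences and that nothing reads the captured values afterwards. The real fix presumably belongs in the grammar's group production rather than in a post-processing pass like this one.

```javascript
// Hypothetical helper (not a jison / ts-jison / lex-parser API).
// Rewrites plain capturing groups "(...)" as non-capturing "(?:...)".
// Assumption: the pattern has no backreferences (\1, \k<name>) and the
// caller never reads the captured values.
function toNonCapturing(source) {
  let out = '';
  let inClass = false; // true while inside a [...] character class
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === '\\') {
      // Copy an escape pair such as "\(" verbatim.
      out += ch + (source[i + 1] ?? '');
      i++;
    } else if (inClass) {
      // "(" is a literal inside a character class; only "]" ends the class.
      if (ch === ']') inClass = false;
      out += ch;
    } else if (ch === '[') {
      inClass = true;
      out += ch;
    } else if (ch === '(' && source[i + 1] !== '?') {
      // A plain capturing group: make it non-capturing.
      out += '(?:';
    } else {
      // Everything else, including "(?:", "(?=", "(?!", "(?<", is kept as-is.
      out += ch;
    }
  }
  return out;
}

// Example: nested alternation loses its capture groups.
const rewritten = toNonCapturing('([A-Z]|([a-z]|[0-9]))+');
console.log(rewritten); // (?:[A-Z]|(?:[a-z]|[0-9]))+
```

Named groups such as `(?<name>...)` are left untouched by this sketch, so a pattern that uses them would still carry those captures.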

(I generated this grammar using ts-jison, but the same happens with jison.)

ericprud, Mar 18 '23 18:03