Support tokenizer-only code generation
A lexical grammar can already be specified in a separate file. However, at code generation the tokenizer is always embedded into the generated parser file. We need to support generating a module for a standalone tokenizer, so that it can be required from any other parser.
./bin/syntax --lex ~/lexer.l --tokenizer-only --output ~/Tokenizer.js
The generated module can then be required in any parser:
const Tokenizer = require('Tokenizer');
const lexer = new Tokenizer('a b c');
console.log(lexer.getNextToken()); // {type: '...', value: 'a'}
console.log(lexer.getNextToken()); // {type: '...', value: 'b'}
...
The generated tokenizer should support the API from the custom tokenizer section:
// initString is supported to reuse the same tokenizer instance
lexer.initString('x y z');
console.log(lexer.getNextToken()); // {type: '...', value: 'x'}
...
We can use the standard tokenizer template at code generation.
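For discussion, here is a minimal sketch of what such a standalone module might look like. It is not the tool's actual template: the `lexRules` table, the token type names, the `hasMoreTokens` helper, and the `EOF` token shape are all assumptions made for illustration. Only `initString` and `getNextToken` come from the API mentioned above.

```javascript
// Hypothetical sketch of a standalone tokenizer module.
// The rule table and token type names are illustrative assumptions,
// not what the generator would actually emit.
const lexRules = [
  [/^\s+/, null],                  // whitespace: matched, then skipped
  [/^\d+/, 'NUMBER'],
  [/^[a-zA-Z_]\w*/, 'IDENTIFIER'],
];

class Tokenizer {
  constructor(string = '') {
    this.initString(string);
  }

  // Resets the input so the same instance can be reused.
  initString(string) {
    this._string = string;
    this._cursor = 0;
    return this;
  }

  // Assumed helper: true while unread input remains.
  hasMoreTokens() {
    return this._cursor < this._string.length;
  }

  // Returns the next token, or an assumed EOF token at end of input.
  getNextToken() {
    while (this.hasMoreTokens()) {
      const rest = this._string.slice(this._cursor);
      let matched = false;

      for (const [regexp, type] of lexRules) {
        const match = regexp.exec(rest);
        if (match === null) {
          continue;
        }
        this._cursor += match[0].length;
        matched = true;
        if (type === null) {
          break; // skipped token: rescan from the new cursor position
        }
        return {type, value: match[0]};
      }

      if (!matched) {
        throw new SyntaxError(
          `Unexpected character "${rest[0]}" at offset ${this._cursor}`
        );
      }
    }
    return {type: 'EOF', value: ''}; // assumed end-of-input marker
  }
}

module.exports = Tokenizer;
```

Keeping the rule table separate from the class means the class body could stay a fixed template, with only the rules varying per lexical grammar.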