Propagate column numbers from parser to AST
The parser doesn't know them; this needs an overhaul of the lexer and/or the token structure.
Right.
Some research:
`lexer.co` outputs tokens in the form `[TAG, value, lineNumber]`, which `coco.co` converts to jison variables:

```coco
parser import
  yy    : require \./ast
  lexer :
    lex          : -> [tag, @yytext, @yylineno] = @tokens[++@pos] or ['']; tag
    setInput     : -> @pos = -1; @tokens = it
    upcomingInput: -> ''
```
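To make the custom-lexer protocol concrete, here is a hedged plain-JS rendering of that adapter (the names `setInput`/`lex`/`upcomingInput` are the hooks jison calls; the destructuring mirrors the coco code above):

```javascript
// Plain-JS sketch of coco.co's lexer adapter. jison calls setInput(tokens)
// once, then lex() repeatedly; each call advances the cursor and exposes
// the token's text and line via yytext/yylineno. Reading past the end
// yields the empty tag, which jison treats as end of input.
const lexer = {
  setInput(tokens) { this.pos = -1; this.tokens = tokens; },
  lex() {
    const [tag, text, line] = this.tokens[++this.pos] || [''];
    this.yytext = text;
    this.yylineno = line;
    return tag;
  },
  upcomingInput() { return ''; },
};

// Usage: feed pre-lexed [TAG, value, lineNumber] triples.
lexer.setInput([['ID', 'x', 0]]);
console.log(lexer.lex(), lexer.yytext, lexer.yylineno); // → ID x 0
```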
`grammar.co` passes `yylineno` to certain rules via the `L` function:

```coco
Chain:
  o \ID -> Chain L Var $1
```
which compiles to:

```js
case 1:this.$ = yy.Chain(yy.L(yylineno, yy.Var($$[$0])));
```
where `L` is defined in `ast.co`:

```coco
exports.L = (yylineno, node) -> node import line: yylineno + 1
```
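For readers unfamiliar with coco's `import` operator (which merges properties into an object), `L` amounts to this in plain JS:

```javascript
// Plain-JS rendering of ast.co's L helper: tag the node with a 1-based
// source line number and return it, so later stages can report positions.
function L(yylineno, node) {
  node.line = yylineno + 1;
  return node;
}

const node = L(0, { type: 'Var', value: 'x' });
console.log(node.line); // → 1
```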
The default jison lexer tracks column locations in `yylloc`, per the bison spec. See also https://github.com/zaach/jison/issues/59
Thus, for column numbers, the following needs to be done:

- `lexer.co` needs to keep track of column numbers for each token, e.g. `[TAG, value, lineNo, colStart, colEnd]`.
- `coco.co` needs to set `yylloc` properly from the lexer's token format.
- `grammar.co` needs to pass `yylloc` into the AST nodes it creates somehow, perhaps implicitly as part of the grammar DSL.
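The first two steps can be sketched together (assumptions: the extended token shape `[TAG, value, lineNo, colStart, colEnd]` is the one proposed above, and `yylloc` uses the bison-style `first_line`/`first_column`/`last_line`/`last_column` fields jison expects):

```javascript
// Sketch: the adapter's lex() reads the extended token and fills yylloc
// in the shape jison's grammar actions can read, alongside the existing
// yytext/yylineno. Field names follow the bison location convention.
const lexer = {
  setInput(tokens) { this.pos = -1; this.tokens = tokens; },
  lex() {
    const [tag, text, line, colStart, colEnd] =
      this.tokens[++this.pos] || [''];
    this.yytext = text;
    this.yylineno = line;
    this.yylloc = {
      first_line: line, last_line: line,
      first_column: colStart, last_column: colEnd,
    };
    return tag;
  },
  upcomingInput() { return ''; },
};

// Usage: a token spanning columns 4-7 of line 0.
lexer.setInput([['ID', 'foo', 0, 4, 7]]);
lexer.lex();
console.log(lexer.yylloc);
// → { first_line: 0, last_line: 0, first_column: 4, last_column: 7 }
```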
Those steps would propagate column numbers far enough to let carp compile-time errors carry column information.
For source maps, more work would be needed, since the coco AST currently compiles straight to JavaScript code as strings. If the coco AST instead compiled to a JS AST decorated with source line and column information, source map generation could be offloaded to a separate code generator, such as escodegen. Factoring out the JS code generation would also make optimization/cleanup/shaping of the compiled output easier, such as #115.
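As a rough illustration of what "decorated with source line and column information" could mean, here is a hedged sketch of an ESTree-style node carrying a `loc` record (the `literal` helper and the `a.co` filename are illustrative, not coco code; a generator like escodegen walks such nodes and pairs each `loc` with its output position to build the source map):

```javascript
// Sketch: instead of emitting strings, each compile step would produce
// an ESTree-shaped node whose loc records where it came from in the
// coco source (1-based lines, 0-based columns, per the ESTree spec).
function literal(value, loc) {
  return { type: 'Literal', value, loc };
}

const node = literal(42, {
  source: 'a.co',                   // hypothetical source filename
  start: { line: 3, column: 8 },    // where the token begins
  end:   { line: 3, column: 10 },   // where it ends
});
console.log(node.type, node.loc.start.line); // → Literal 3
```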