Can we correctly understand TypeScript with only a lexer?
This example seems like it requires a parser, not just a lexer:
function identity<Type>(arg: Type): Type {
  return arg;
}
let output = identity<string>("myString");
Looking only at the < in the last line, it seems impossible to tell whether it is a comparison operator or the start of a type argument list.
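To make the ambiguity concrete: the same < and > tokens parse as a generic call or as chained comparisons depending on what follows the closing >. This sketch uses illustrative names; `any` is only there to silence the type checker, since the point is purely syntactic:

```typescript
function identity<Type>(arg: Type): Type {
  return arg;
}

// A `(` right after the closing `>` makes this a generic call:
const call = identity<string>("myString");
console.log(call); // "myString"

// With plain values, the same `<` and `>` tokens become two chained
// comparisons instead:
const a: any = 1;
const b: any = 2;
const c: any = 3;
const comparisons = a < b > c; // parsed as (a < b) > c
console.log(comparisons); // false: (1 < 2) is true, and (true > 3) is false
```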
In a type argument list, can we expect there to be exactly one token between < and >? If so, reading two tokens ahead whenever the lexer sees < would be enough.
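A minimal sketch of that two-token lookahead, using a toy tokenizer (not TypeScript's real scanner; it deliberately emits each > as its own token):

```typescript
type Token = string;

// Toy tokenizer: identifiers, double-quoted strings, single punctuation.
function tokenize(src: string): Token[] {
  const re = /[A-Za-z_$][A-Za-z0-9_$]*|"[^"]*"|[<>(),;=]|\S/g;
  return src.match(re) ?? [];
}

// Heuristic: after `<`, if exactly one token sits before `>` and a `(`
// follows, treat the `<` as opening a type argument list.
function isTypeArgumentStart(tokens: Token[], i: number): boolean {
  return tokens[i] === "<" && tokens[i + 2] === ">" && tokens[i + 3] === "(";
}

const simple = tokenize('let output = identity<string>("myString");');
console.log(isTypeArgumentStart(simple, simple.indexOf("<"))); // true

// The heuristic breaks as soon as the type is more than one token:
const nested = tokenize("identity<Array<string>>(xs);");
console.log(isTypeArgumentStart(nested, nested.indexOf("<"))); // false
```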
If TypeScript supports nested type expressions like <CONTAINER<TYPE>>, reading one more token is not enough either. If I remember the design of the TypeScript parser correctly, too much backtracking can also become an issue.
The C++ parser struggled with >> for years; until C++11 changed the grammar, nested templates had to be closed as > > (with a space) so the lexer would not read a right-shift operator.
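This is exactly the >> problem: a maximal-munch lexer fuses the two closing angle brackets of a nested type into a single shift token. A toy illustration (not any real compiler's lexer):

```typescript
// Greedy (maximal-munch) tokenizer: `>>=`, `>>`, and `>=` are matched
// before a lone `>`, just as a shift-aware lexer would do.
function greedyTokenize(src: string): string[] {
  const re = />>=?|>=|[A-Za-z_$][A-Za-z0-9_$]*|[<>(),;]|\S/g;
  return src.match(re) ?? [];
}

const toks = greedyTokenize("let xs: Array<Array<string>> = [];");
// The two closers of the nested type arrive as one `>>` token, which
// the parser must split again to close the type argument lists.
console.log(toks.includes(">>")); // true
```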
The C++ parser reads far ahead in the token stream. It stores the tokens in a nested doubly linked list and analyzes that list after reading.
I made tokeninfo.[ch], which extracts the essence of the tokenizing code I studied in various parsers. Tokeninfo works fine for languages with a rather simple grammar. For more complicated grammars, I have wanted to extract the essence of the tokenizing code from the C++ parser as well.