Can we correctly understand TypeScript with only a lexer?
This example seems like it requires a parser, not just a lexer:
function identity<Type>(arg: Type): Type {
  return arg;
}
let output = identity<string>("myString");
Looking only at the < in the last line, it seems impossible to tell whether it is a comparison operator or the start of a type argument list.
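To make the ambiguity concrete: the same < and > tokens parse as a generic call or as chained comparisons depending on what follows the closing >. This sketch uses illustrative names; `any` is only there to silence the type checker, since the point is purely syntactic:

```typescript
function identity<Type>(arg: Type): Type {
  return arg;
}

// A `(` right after the closing `>` makes this a generic call:
const call = identity<string>("myString");
console.log(call); // "myString"

// With plain values, the same `<` and `>` tokens become two chained
// comparisons instead:
const a: any = 1;
const b: any = 2;
const c: any = 3;
const comparisons = a < b > c; // parsed as (a < b) > c
console.log(comparisons); // false: (1 < 2) is true, and (true > 3) is false
```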
In a type argument list, can we expect there to be exactly one token between < and >? If so, reading two tokens ahead whenever the lexer sees < would be enough.
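A minimal sketch of that two-token lookahead, using a toy tokenizer (not TypeScript's real scanner; it deliberately emits each > as its own token):

```typescript
type Token = string;

// Toy tokenizer: identifiers, double-quoted strings, single punctuation.
function tokenize(src: string): Token[] {
  const re = /[A-Za-z_$][A-Za-z0-9_$]*|"[^"]*"|[<>(),;=]|\S/g;
  return src.match(re) ?? [];
}

// Heuristic: after `<`, if exactly one token sits before `>` and a `(`
// follows, treat the `<` as opening a type argument list.
function isTypeArgumentStart(tokens: Token[], i: number): boolean {
  return tokens[i] === "<" && tokens[i + 2] === ">" && tokens[i + 3] === "(";
}

const simple = tokenize('let output = identity<string>("myString");');
console.log(isTypeArgumentStart(simple, simple.indexOf("<"))); // true

// The heuristic breaks as soon as the type is more than one token:
const nested = tokenize("identity<Array<string>>(xs);");
console.log(isTypeArgumentStart(nested, nested.indexOf("<"))); // false
```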
If TypeScript supports nested type expressions like <CONTAINER<TYPE>>, reading one more token is not enough either. If I remember the design of the TypeScript parser correctly, too much backtracking can also become an issue.
The C++ parser struggled with >> for years; until C++11 changed the grammar, nested templates had to be closed as > > (with a space) so the lexer would not read a right-shift operator.
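This is exactly the >> problem: a maximal-munch lexer fuses the two closing angle brackets of a nested type into a single shift token. A toy illustration (not any real compiler's lexer):

```typescript
// Greedy (maximal-munch) tokenizer: `>>=`, `>>`, and `>=` are matched
// before a lone `>`, just as a shift-aware lexer would do.
function greedyTokenize(src: string): string[] {
  const re = />>=?|>=|[A-Za-z_$][A-Za-z0-9_$]*|[<>(),;]|\S/g;
  return src.match(re) ?? [];
}

const toks = greedyTokenize("let xs: Array<Array<string>> = [];");
// The two closers of the nested type arrive as one `>>` token, which
// the parser must split again to close the type argument lists.
console.log(toks.includes(">>")); // true
```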
The C++ parser reads far ahead in the token stream. It stores the tokens in a nested doubly linked list and analyzes that list after reading.
I made tokeninfo.[ch], which extracts the essence of the tokenizing code I studied in various parsers. Tokeninfo works fine for languages with a rather simple grammar. For more complicated grammars, I have wanted to extract the essence of the tokenizing code from the C++ parser as well.