powerquery-parser

[Enhancement] Remove LexerSnapshot

Open JordanBoltonMN opened this issue 3 years ago • 2 comments

Is your feature request related to a problem? Please describe. LexerSnapshot exists for historical reasons that no longer apply, so there is no real reason to keep it.

Describe the solution you'd like Update the parser to take a lexer state and read tokens directly from there.

Describe alternatives you've considered N/A

Additional context N/A

JordanBoltonMN avatar Aug 21 '20 17:08 JordanBoltonMN

Was LexerSnapshot added to support incremental lexing? Or can we still do that with the lexer state? I'm still hoping we'll eventually be able to support parser-based tokenization and semantic highlighting.

mattmasson avatar Aug 24 '20 16:08 mattmasson

It was originally added to help support incremental lexing, but it is no longer needed.

The lexer creates tokens on a per-line basis, which requires some additional token kinds such as MultilineCommentStart and MultilineCommentEnd. Whenever one line is updated, the lexer conditionally updates the subsequent lines. E.g., if you start a multiline comment on one line, the subsequent line turns into a MultilineCommentContent.
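To make the per-line behavior concrete, here is a minimal sketch of how an edit to one line could ripple into later lines. Every name in it (LineMode, lexLine, updateLine, the token shapes) is an assumption made up for illustration and is not the library's actual API; it only recognizes `/* ... */` comments and lumps everything else into one "Other" token.

```typescript
// Hypothetical types for illustration only; not the library's real definitions.
type LineMode = "Default" | "Comment";
type LineTokenKind =
    | "MultilineCommentStart"
    | "MultilineCommentContent"
    | "MultilineCommentEnd"
    | "MultilineComment"
    | "Other";

interface LineToken {
    kind: LineTokenKind;
    text: string;
}

interface Line {
    text: string;
    startMode: LineMode; // mode inherited from the previous line
    endMode: LineMode; // mode handed to the next line
    tokens: LineToken[];
}

// Lex a single line, given the mode the previous line ended in.
function lexLine(text: string, startMode: LineMode): Line {
    const tokens: LineToken[] = [];
    let mode: LineMode = startMode;
    let pos = 0;

    while (pos < text.length) {
        if (mode === "Comment") {
            const end = text.indexOf("*/", pos);
            if (end === -1) {
                // Still inside the comment for the rest of this line.
                tokens.push({ kind: "MultilineCommentContent", text: text.slice(pos) });
                pos = text.length;
            } else {
                tokens.push({ kind: "MultilineCommentEnd", text: text.slice(pos, end + 2) });
                pos = end + 2;
                mode = "Default";
            }
        } else {
            const start = text.indexOf("/*", pos);
            if (start === -1) {
                tokens.push({ kind: "Other", text: text.slice(pos) });
                pos = text.length;
                continue;
            }
            if (start > pos) {
                tokens.push({ kind: "Other", text: text.slice(pos, start) });
            }
            const end = text.indexOf("*/", start + 2);
            if (end === -1) {
                // Comment opens here and continues onto later lines.
                tokens.push({ kind: "MultilineCommentStart", text: text.slice(start) });
                pos = text.length;
                mode = "Comment";
            } else {
                tokens.push({ kind: "MultilineComment", text: text.slice(start, end + 2) });
                pos = end + 2;
            }
        }
    }

    return { text, startMode, endMode: mode, tokens };
}

// Re-lex the edited line, then keep re-lexing following lines only while
// the mode they would inherit differs from the one they were lexed with.
// Returns how many lines were re-lexed.
function updateLine(lines: Line[], index: number, newText: string): number {
    let mode: LineMode = lines[index].startMode;
    let relexed = 0;

    for (let i = index; i < lines.length; i++) {
        if (i > index && lines[i].startMode === mode) {
            break; // later lines are unaffected
        }
        const text = i === index ? newText : lines[i].text;
        lines[i] = lexLine(text, mode);
        mode = lines[i].endMode;
        relexed++;
    }

    return relexed;
}
```

In this sketch, opening a comment on the first line forces every following line to be re-lexed as comment content, while an edit that leaves the line's end mode unchanged touches only that one line.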

When you want to actually parse something, you first need to try creating a LexerSnapshot. The snapshot attempt iterates over all of the tokens and does the following:

  • Validates and combines multiline tokens into a single token
  • Puts all comments into a collection
  • Provides a helper function to get a [lineNumber, columnNumber] pair from a Token
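Roughly, the three responsibilities above amount to a single pass like the sketch below. The names (trySnapshot, SnapshotToken, positionOf) and the token shapes are hypothetical stand-ins chosen for this example, not the real LexerSnapshot API, and keeping comments out of the main token list is an assumption of the sketch.

```typescript
// Hypothetical shapes for illustration; the library's real types differ.
type Kind =
    | "MultilineCommentStart"
    | "MultilineCommentContent"
    | "MultilineCommentEnd"
    | "LineComment"
    | "Other";

interface LineToken {
    kind: Kind;
    text: string;
    column: number; // column where the token starts on its line
}

interface Position {
    lineNumber: number;
    columnNumber: number;
}

interface SnapshotToken {
    kind: "MultilineComment" | "LineComment" | "Other";
    text: string;
    start: Position;
}

interface Snapshot {
    tokens: SnapshotToken[];
    comments: SnapshotToken[];
}

// One O(n) pass over the per-line tokens. Throws if a multiline comment is
// malformed, which is why building the snapshot is only an "attempt".
function trySnapshot(lines: LineToken[][]): Snapshot {
    const tokens: SnapshotToken[] = [];
    const comments: SnapshotToken[] = [];
    let open: SnapshotToken | undefined;

    for (let lineNumber = 0; lineNumber < lines.length; lineNumber++) {
        for (const token of lines[lineNumber]) {
            const start: Position = { lineNumber, columnNumber: token.column };
            switch (token.kind) {
                case "MultilineCommentStart":
                    open = { kind: "MultilineComment", text: token.text, start };
                    break;
                case "MultilineCommentContent":
                case "MultilineCommentEnd":
                    if (open === undefined) {
                        throw new Error("comment continuation without a start on line " + lineNumber);
                    }
                    // Combine the per-line pieces into a single token.
                    open.text += "\n" + token.text;
                    if (token.kind === "MultilineCommentEnd") {
                        comments.push(open);
                        open = undefined;
                    }
                    break;
                case "LineComment":
                    comments.push({ kind: "LineComment", text: token.text, start });
                    break;
                default:
                    tokens.push({ kind: "Other", text: token.text, start });
            }
        }
    }

    if (open !== undefined) {
        throw new Error("unterminated multiline comment starting on line " + open.start.lineNumber);
    }
    return { tokens, comments };
}

// Helper returning a [lineNumber, columnNumber] pair for a token.
function positionOf(token: SnapshotToken): [number, number] {
    return [token.start.lineNumber, token.start.columnNumber];
}
```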

All of these could be moved into the parser, and LexerSnapshot could be removed. The trade-off is a slight increase in the parser's complexity in exchange for removing the complexity of having LexerSnapshot at all. It also removes an O(n) pass over the tokens.
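As a rough idea of what reading directly from the lexer state could look like, the sketch below merges multiline comment pieces lazily as the parser asks for the next token, so there is no separate up-front pass. TokenReader, LexerState, and ParserToken are hypothetical names invented for this example; the real parser and state types are not shown here.

```typescript
// Hypothetical shapes for illustration; not the library's real types.
type Kind = "MultilineCommentStart" | "MultilineCommentContent" | "MultilineCommentEnd" | "Other";

interface LineToken {
    kind: Kind;
    text: string;
}

interface LexerState {
    lines: LineToken[][];
}

interface ParserToken {
    kind: "MultilineComment" | "Other";
    text: string;
    lineNumber: number; // line on which the token starts
}

// Cursor over the lexer state that the parser could pull tokens from.
class TokenReader {
    private readonly state: LexerState;
    private line = 0;
    private index = 0;

    constructor(state: LexerState) {
        this.state = state;
    }

    // Returns the next token, or undefined once the input is exhausted.
    next(): ParserToken | undefined {
        let pending: ParserToken | undefined;

        while (this.line < this.state.lines.length) {
            const lineTokens = this.state.lines[this.line];
            if (this.index >= lineTokens.length) {
                // Move on to the next line.
                this.line++;
                this.index = 0;
                continue;
            }

            const token = lineTokens[this.index++];
            if (token.kind === "MultilineCommentStart") {
                pending = { kind: "MultilineComment", text: token.text, lineNumber: this.line };
            } else if (token.kind === "Other") {
                if (pending !== undefined) {
                    throw new Error("unexpected token inside a multiline comment");
                }
                return { kind: "Other", text: token.text, lineNumber: this.line };
            } else {
                if (pending === undefined) {
                    throw new Error("comment continuation without a start");
                }
                // Validate and combine the pieces on demand.
                pending.text += "\n" + token.text;
                if (token.kind === "MultilineCommentEnd") {
                    return pending;
                }
            }
        }

        if (pending !== undefined) {
            throw new Error("unterminated multiline comment");
        }
        return undefined;
    }
}
```

The validation and combining work is the same as in a snapshot pass; it is just spread across the calls the parser makes anyway.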

JordanBoltonMN avatar Aug 25 '20 04:08 JordanBoltonMN