
Correctly handle UTF-16 offsets

Open fhs opened this issue 5 years ago • 0 comments

LSP uses UTF-16 offsets:

A position inside a document (see Position definition below) is expressed as a zero-based line and character offset. The offsets are based on a UTF-16 string representation. So a string of the form a𐐀b the character offset of the character a is 0, the character offset of 𐐀 is 1 and the character offset of b is 3 since 𐐀 is represented using two code units in UTF-16.

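Acme uses rune offsets. Currently, we treat UTF-16 offsets as rune offsets (and vice versa) to keep the implementation simple. This is wrong for any line containing characters outside the Basic Multilingual Plane, because each such character is one rune but two UTF-16 code units.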
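A possible fix is to convert offsets per line in both directions. The following is only a minimal sketch, not the project's actual code: the function names RuneOffsetToUTF16 and UTF16OffsetToRune are hypothetical, and it assumes the text of the line is available as a Go string. An offset that falls in the middle of a surrogate pair is mapped to the rune after the pair, which is an arbitrary choice.

```go
package main

import "fmt"

// utf16Len reports how many UTF-16 code units are needed to encode r:
// two for runes outside the Basic Multilingual Plane, one otherwise.
func utf16Len(r rune) int {
	if r >= 0x10000 {
		return 2
	}
	return 1
}

// RuneOffsetToUTF16 (hypothetical helper) converts a rune offset within
// line, as used by acme, to a UTF-16 code unit offset, as used by LSP.
func RuneOffsetToUTF16(line string, runeOff int) int {
	units, runes := 0, 0
	for _, r := range line {
		if runes >= runeOff {
			break
		}
		units += utf16Len(r)
		runes++
	}
	return units
}

// UTF16OffsetToRune (hypothetical helper) converts a UTF-16 code unit
// offset within line to a rune offset. An offset pointing into the middle
// of a surrogate pair yields the offset of the rune after the pair.
func UTF16OffsetToRune(line string, u16Off int) int {
	units, runes := 0, 0
	for _, r := range line {
		if units >= u16Off {
			break
		}
		units += utf16Len(r)
		runes++
	}
	return runes
}

func main() {
	line := "a𐐀b" // the example string from the LSP specification
	fmt.Println(RuneOffsetToUTF16(line, 2)) // b is rune 2
	fmt.Println(UTF16OffsetToRune(line, 3)) // b is at UTF-16 offset 3
}
```

Both functions scan the line from the start, so each conversion is linear in the line length. That is probably acceptable for typical source lines, but it does mean the line's text must be read before a position can be translated.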

fhs, May 31 '19 19:05