
Properly support position encodings

Open mkaput opened this issue 1 year ago • 1 comments

The LS is not aware of position encoding negotiation and does not perform it, so per the LSP specification the language client assumes UTF-16. In practice, the LS computes positions in UTF-8, because that is the native encoding of Rust strings. Positions therefore diverge whenever a line contains non-ASCII characters: every such character occupies a different number of code units in the two encodings, and characters outside the Basic Multilingual Plane, such as most emoji, take four UTF-8 bytes but two UTF-16 code units.
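To make the mismatch concrete, here is a minimal sketch that measures the same text in both encodings. The helper names `utf8_len` and `utf16_len` are hypothetical and are not taken from the LS codebase; they only use standard library calls.

```rust
/// Length of `text` in UTF-8 code units (bytes), which is how Rust's `str` counts.
/// Hypothetical helper, for illustration only.
fn utf8_len(text: &str) -> usize {
    text.len()
}

/// Length of `text` in UTF-16 code units, which is how an LSP client counts
/// columns when no other position encoding has been negotiated.
/// Hypothetical helper, for illustration only.
fn utf16_len(text: &str) -> usize {
    text.encode_utf16().count()
}
```

For ASCII text the two functions agree, so the bug stays hidden. For `"é"` they give 2 and 1, and for `"🦀"` they give 4 and 2, so any column reported after such a character is shifted from the client's point of view.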

Things to implement:

  • Properly negotiate position encoding with the language client. Prefer UTF-8 to avoid re-encoding files, but fall back to UTF-16 as this is the only encoding guaranteed to be universally supported by clients.
  • When converting Cairo positions to LSP ones, take encoding differences into account. This requires access to the file's source text at conversion time, which will be a large refactoring.
  • Enforce UTF-8 encoding in E2E tests (add appropriate asserts in MockClient).
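The first two items could be sketched roughly as follows. This is an illustration under stated assumptions, not the LS implementation: `PositionEncoding`, `negotiate`, and `to_lsp_column` are hypothetical names, the client's `general.positionEncodings` capability is modeled as a plain slice of strings, and a real server would use its LSP types crate instead.

```rust
/// Position encodings this sketch supports (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PositionEncoding {
    Utf8,
    Utf16,
}

/// Picks the encoding to report back in the server capabilities.
/// `client_encodings` stands in for the client's `general.positionEncodings`
/// list; `None` means the client did not send it.
/// Prefers UTF-8 and otherwise falls back to UTF-16.
fn negotiate(client_encodings: Option<&[&str]>) -> PositionEncoding {
    match client_encodings {
        Some(list) if list.contains(&"utf-8") => PositionEncoding::Utf8,
        _ => PositionEncoding::Utf16,
    }
}

/// Converts a UTF-8 byte offset within `line` into an LSP column expressed
/// in the negotiated encoding. Needs the line's source text, which is why
/// the conversion cannot be done from offsets alone.
/// Returns `None` if the offset is out of range or splits a character.
fn to_lsp_column(line: &str, byte_offset: usize, encoding: PositionEncoding) -> Option<u32> {
    // `get` checks both bounds and char boundaries for us.
    let prefix = line.get(..byte_offset)?;
    let column = match encoding {
        PositionEncoding::Utf8 => prefix.len(),
        PositionEncoding::Utf16 => prefix.encode_utf16().count(),
    };
    u32::try_from(column).ok()
}
```

With the line `a🦀b`, the byte offset of `b` is 5. Under `Utf8` the column stays 5, while under `Utf16` it becomes 3, which is the value a client that did not negotiate UTF-8 expects.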

mkaput, Jun 11 '24 14:06