tree-sitter-c

Improve accuracy of escape sequences in string and character literals

Open narpfel opened this issue 9 months ago • 0 comments

There are a few cases where the current escape sequence parser is inaccurate, which causes misleading syntax highlighting.

  • hexadecimal-escape-sequence does not have a maximum length, so '\x00000000a' is the same as '\n'. Godbolt link: https://godbolt.org/z/fq18PbPc4
  • octal-escape-sequence can only contain octal digits [0-7], so "\749" is "<9" and the 9 should not be highlighted as part of the escape sequence. Godbolt link: https://godbolt.org/z/rEcM88had
  • Any character other than the valid escape sequence characters is not allowed after the backslash, so such sequences should not be highlighted as escape sequences. This PR introduces invalid_escape_sequence so that invalid escape sequences can be highlighted.
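
To make the three rules concrete, here is a rough C sketch of how an escape sequence could be measured. `escape_length` is a hypothetical helper written only for this illustration; it is not part of tree-sitter-c or of this PR. It assumes the input points at a backslash and leaves out universal character names (`\u`, `\U`) for brevity.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of tree-sitter-c): `s` must point at a
 * backslash. Returns the number of bytes that belong to the escape
 * sequence, or 0 if the backslash does not start a valid one.
 * Universal character names (\u, \U) are deliberately not handled. */
size_t escape_length(const char *s) {
    assert(s[0] == '\\');

    if (s[1] == 'x') {
        /* hexadecimal-escape-sequence: \x followed by one or more hex
         * digits, with no upper bound on the digit count. */
        size_t n = 2;
        while (isxdigit((unsigned char)s[n])) {
            n++;
        }
        return n > 2 ? n : 0; /* a bare \x with no digits is invalid */
    }

    if (s[1] >= '0' && s[1] <= '7') {
        /* octal-escape-sequence: one to three digits, each in [0-7]. */
        size_t n = 1;
        while (n < 4 && s[n] >= '0' && s[n] <= '7') {
            n++;
        }
        return n;
    }

    /* simple-escape-sequence: backslash plus exactly one of these.
     * The '\0' check matters because strchr also matches the terminator. */
    if (s[1] != '\0' && strchr("'\"?\\abfnrtv", s[1]) != NULL) {
        return 2;
    }

    return 0; /* anything else would be an invalid escape sequence */
}
```

With this sketch, the source text `\x00000000a` is consumed in full (11 bytes), `\749` stops after `\74` (3 bytes) so the `9` stays outside the escape, and `\q` yields 0, which is the case a highlighter would mark as invalid.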

narpfel, Mar 15 '25 14:03