vim-lsp-cxx-highlight

Chinese characters will cause incorrect highlighting

Open chrisniael opened this issue 3 years ago • 3 comments

Describe the bug

If a line of code contains a string with Chinese characters, the highlighting of that line is incorrect.

See line 7 of the screenshot for details

To Reproduce

#include <iostream>

class A {
 public:
  void dump() const {
    std::cout << "value=" << this->n_ << std::endl;
    std::cout << "数值=" << this->n_ << std::endl;
  }

 private:
  int n_;
};

int main() { A a; }

Expected behavior

Line 6 of the screenshot is what I expected.

Screenshots

[image]

Configuration (Fill this out):

  • NVIM v0.4.4
  • coc.nvim

Log File:

Wed 13 Jan 2021 02:55:10 PM CST: lsp_cxx_hl beginning initialization...
Wed 13 Jan 2021 02:55:10 PM CST: vim-lsp not detected
Wed 13 Jan 2021 02:55:10 PM CST: LanguageClient-neovim not detected
Wed 13 Jan 2021 02:55:10 PM CST: coc.nvim successfully registered
Wed 13 Jan 2021 02:55:10 PM CST: nvim-lsp not detected
Wed 13 Jan 2021 02:55:14 PM CST: textprop nvim notify symbols for main.cpp
Wed 13 Jan 2021 02:55:14 PM CST: hl_symbols (textprop nvim) highlighted 16 symbols in file main.cpp
Wed 13 Jan 2021 02:55:14 PM CST: operation hl_symbols (textprop nvim) main.cpp took   0.004213s to complete

chrisniael, Jan 13 '21 07:01

Hi, I have reproduced the bug and unfortunately I'm not really sure I can solve it in vim-lsp-cxx-highlight.

The problem is that the column positions the plugin receives are character offsets, which do not match Vim's column numbers. If you hover over the first Chinese character, you'll see that it starts at column 19, the second character starts at column 21, and the = starts at column 23. As a result, the highlighting of everything after those characters on the same line is shifted.

This looks to be caused by the multi_byte feature (:h mbyte.txt), which allows wide characters to take up 2 columns. It could have something to do with Vim trying to be consistent with how the terminal renders characters. Maybe there's a setting to change that, but after digging through the help page for mbyte I can't figure out how to change it.

Both Vim's and Neovim's highlight APIs use byte-based positions, which makes this not feasible to fix in code, since there's no efficient way of converting a character position to a byte position. The only thing I could think of is scanning the line and figuring it out from that, but it would be very slow and complicated to do in Vimscript.
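To make the line scan mentioned above concrete, here is a minimal sketch of the conversion, written in C++ rather than Vimscript and not taken from the plugin. It assumes the line is valid UTF-8 and that the received offsets count Unicode code points; `utf8_byte_offset` is a hypothetical name. If the server counts UTF-16 code units instead, which is the default in the LSP specification, characters outside the Basic Multilingual Plane would need extra handling.

```cpp
#include <cstddef>
#include <string>

// Returns the byte offset of the code point at `char_index` in a UTF-8 line.
// An index equal to the number of code points maps to line.size(), which is
// what an exclusive end position needs. Anything larger yields npos.
std::size_t utf8_byte_offset(const std::string& line, std::size_t char_index) {
    std::size_t chars_seen = 0;
    for (std::size_t byte = 0; byte < line.size(); ++byte) {
        // Continuation bytes look like 10xxxxxx; any other byte starts a code point.
        const bool starts_code_point =
            (static_cast<unsigned char>(line[byte]) & 0xC0) != 0x80;
        if (starts_code_point) {
            if (chars_seen == char_index) {
                return byte;
            }
            ++chars_seen;
        }
    }
    return chars_seen == char_index ? line.size() : std::string::npos;
}
```

On the line from the report, character offsets 18, 19 and 20 (the two Chinese characters and the =) map to byte offsets 18, 21 and 24, because each of the two characters occupies three bytes in UTF-8.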

I think the best option is to try to figure out the multi-byte settings in nvim. Maybe there are channels where people know more about this, or you can open an issue on vim/nvim to ask. Sorry that I'm not able to help with this.

As a side note, I can also reproduce a similar problem with LanguageClient's error highlighting: [image]

jackguo380, Jan 14 '21 03:01

The only thing I could think of is scanning the line and figuring it out from that, but it would be very slow and complicated to do in vimscript.

Ok, so I did some research, read this https://github.com/neovim/neovim/issues/6161, and then I just tried this:

--- i/autoload/lsp_cxx_hl/textprop_nvim.vim
+++ w/autoload/lsp_cxx_hl/textprop_nvim.vim
@@ -19,8 +19,12 @@ function! s:buf_add_hl(buf, ns_id, hl_group,
     " single line symbol
     if a:s_line == a:e_line
         if a:e_char - a:s_char > 0
+            let line = getbufline(a:buf, a:s_line + 1)[0]
             call nvim_buf_add_highlight(a:buf, a:ns_id, a:hl_group,
-                        \ a:s_line, a:s_char, a:e_char)
+                        \ a:s_line,
+                        \ byteidx(line, a:s_char),
+                        \ byteidx(line, a:e_char))
             return
         else
             return

And it works. Works just fine. @jackguo380, could you please help me: where in the code base should I put this transformation so that the Vim part and the other nvim_buf_add_highlight calls are affected too? My instinct is to put it close to the source, where the offsets are received, but maybe that's not the best place. If I put something along the lines of map(a:symbols, transform_offsets) inside lsp_cxx_hl#hl#notify_symbols, will that be fine?
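As an illustration of the "convert once, close to the source" idea, here is a rough sketch in C++ rather than the plugin's Vimscript. The `SymbolRange` struct and both function names are hypothetical and do not reflect the plugin's actual data layout. It assumes UTF-8 buffer lines and offsets counted in code points, and it builds a per-line table so that several symbols on the same line share a single scan.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical shape of one single-line symbol as received:
// 0-based line number plus start/end offsets counted in characters.
struct SymbolRange {
    std::size_t line;
    std::size_t start;
    std::size_t end;  // exclusive
};

// Entry i is the byte offset of code point i. The final entry is line.size(),
// so an exclusive end position can be looked up as well.
std::vector<std::size_t> code_point_byte_offsets(const std::string& line) {
    std::vector<std::size_t> offsets;
    for (std::size_t byte = 0; byte < line.size(); ++byte) {
        if ((static_cast<unsigned char>(line[byte]) & 0xC0) != 0x80) {
            offsets.push_back(byte);
        }
    }
    offsets.push_back(line.size());
    return offsets;
}

// Rewrites every symbol in place from character offsets to byte offsets.
// Symbols that point outside the buffer are left unchanged.
void to_byte_offsets(std::vector<SymbolRange>& symbols,
                     const std::vector<std::string>& buffer_lines) {
    std::size_t cached_line = static_cast<std::size_t>(-1);
    std::vector<std::size_t> offsets;
    for (SymbolRange& sym : symbols) {
        if (sym.line >= buffer_lines.size()) {
            continue;
        }
        if (sym.line != cached_line) {  // rescan only when the line changes
            offsets = code_point_byte_offsets(buffer_lines[sym.line]);
            cached_line = sym.line;
        }
        if (sym.start >= offsets.size() || sym.end >= offsets.size()) {
            continue;
        }
        sym.start = offsets[sym.start];
        sym.end = offsets[sym.end];
    }
}
```

Converting at this point would leave every later highlight call working with byte offsets, so the individual call sites would not need their own conversion.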

Kamilcuk, Oct 08 '21 22:10

Hey @Kamilcuk

That's great that you found a potential solution. Could you open a PR with the code so I can review it?

Just some suggestions:

  • The filtering could be done in a separate helper function. Maybe you can put it in parse.vim.
  • The implementation looks like it could have a performance impact. Maybe we can put it behind a feature flag.
    • Most users probably don't have Unicode directly in their code, or it's isolated to lines where they don't care about it breaking their highlighting.
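A feature flag of the kind suggested here might look roughly like the following. This is a sketch in C++, not the plugin's Vimscript, and both the `Options` struct and the `convert_char_offsets` name are hypothetical. It assumes UTF-8 lines and offsets counted in code points.

```cpp
#include <cstddef>
#include <string>

// Hypothetical option, off by default so existing users pay no extra cost.
struct Options {
    bool convert_char_offsets = false;
};

// Returns the column to hand to a byte-based highlight API.
std::size_t highlight_column(const Options& opts, const std::string& line,
                             std::size_t char_index) {
    if (!opts.convert_char_offsets) {
        return char_index;  // previous behavior: pass the offset through unchanged
    }
    std::size_t byte = 0;
    for (std::size_t seen = 0; byte < line.size(); ++byte) {
        // A byte that is not a 10xxxxxx continuation byte starts a code point.
        if ((static_cast<unsigned char>(line[byte]) & 0xC0) != 0x80) {
            if (seen == char_index) {
                break;
            }
            ++seen;
        }
    }
    return byte;  // line.size() when char_index is at or past the end
}
```

With the flag off, the function is a pass-through, so users whose code is pure ASCII see no change in behavior or cost.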

Thanks for looking into this.

jackguo380, Oct 18 '21 08:10