Fix go to definition and hover for files containing multibyte characters
Motivation
Closes #1251
When a Ruby file contains multibyte characters (like Japanese, Chinese, emoji, etc), the go to definition and hover features do not work correctly. The definition location or hover documentation will be incorrect. ref:
- https://github.com/Shopify/ruby-lsp/issues/1347
This is because the current implementation assumes single-byte characters when calculating offsets during index building and document referencing. We need to properly handle multibyte characters to ensure these features work reliably for all users.
Implementation
- Modified
RubyLsp::DocumentandRubyLsp::Requests::Requestto calculate locations considering multibyte characters. This change utilizes the API implemented in Prism by https://github.com/ruby/prism/pull/2406. - Added encoding to Entry and updated the logic to handle multibyte characters when storing locations. Additionally, since the location object has been changed from
Prism::Locationto RubyIndexer::Locationby https://github.com/Shopify/ruby-lsp/pull/1917, a new field has been added to RubyIndexer::Location. - By passing the encoding when creating the Index, the index will be built taking multibyte characters into account.
Automated Tests
I added a simple expectation test for the definition.
Manual Tests
I have signed the CLA!
Thank for you the contribution. From a quick read, it seems the PR is doing two main things:
- fixing handling for multibyte characters
- introducing support for encodings other than UTF-8
Since UTF-8 is almost always used, is it possible that the first can be addressed without the second?
@andyw8 Thank you for your review! As you pointed out, the changes can be broken down into multiple tasks. I'll close this Pull Request for now and focus on fixing the handling of multibyte characters first.
I opened a new PullRequest. https://github.com/Shopify/ruby-lsp/pull/2051