net-imap icon indicating copy to clipboard operation
net-imap copied to clipboard

Use String#byteindex instead of String#index

Open shugo opened this issue 1 year ago • 6 comments

Because String#index takes O(N), where N is the second argument, while String#byteindex takes O(1).

shugo avatar May 23 '24 03:05 shugo

We should wait until Ruby 3.1 gets EOL.

shugo avatar May 23 '24 03:05 shugo

@shugo I added it to a new v0.6 milestone. 🙂

nevans avatar May 23 '24 17:05 nevans

@gooroodev WDYT?

admsev avatar May 26 '24 21:05 admsev

Thanks, @admsev! Happy to help!

1. Summary of Changes

The pull request makes the following changes:

  • Replaces instances of String#index with String#byteindex.
  • Replaces instances of $~.end(0) with $~.byteoffset(0)[1].

2. Issues, Bugs, and Typos

Issue 1: Inconsistent Handling of Return Values

The original code uses String#index which returns the character index, while the new code uses String#byteindex which returns the byte index. This change can affect the handling of multibyte characters. However, the code seems to handle byte offsets correctly by using $~.byteoffset(0)[1].

Issue 2: Typo in Comment

The comment "an 'accept' versiun of #text" contains a typo. The word "versiun" should be "version".

Proposed Fix:

# an "accept" version of #text

Issue 3: Consistency in Method Naming

The method accept_spaces uses byteindex while other methods like next_token still use index in some places. For consistency, all instances should be updated to use byteindex.

Proposed Fix: Replace all instances of index with byteindex in the next_token method.

3. General Review of Code Quality and Style

Code Quality

  • Readability: The code is generally readable and well-commented. The use of regular expressions is appropriate for the task.
  • Performance: The switch to byteindex can improve performance in handling multibyte strings, which is crucial for IMAP responses that may contain multibyte characters.
  • Error Handling: The code has appropriate error handling with parse_error calls to manage unexpected tokens.

Style

  • Consistency: The code style is consistent with Ruby conventions. Method names are clear and follow the snake_case convention.
  • Documentation: The code is well-documented with comments explaining the purpose of each method and the meaning of regular expressions.

Proposed Code Improvements

Improvement 1: Consistent Use of byteindex

Current Code:

@str.index(/\S*/n, @pos)

Improved Code:

@str.byteindex(/\S*/n, @pos)

Improvement 2: Fix Typo in Comment

Current Code:

# an "accept" versiun of #text

Improved Code:

# an "accept" version of #text

Improvement 3: Handle Multibyte Characters Consistently

Ensure all instances of index are replaced with byteindex where necessary to handle multibyte characters correctly.

Conclusion

The pull request effectively updates the code to use String#byteindex instead of String#index, which is a positive change for handling multibyte characters. The proposed improvements should be implemented to maintain consistency and fix minor issues. Overall, the code quality and style are good, and the changes are well-justified.

Yours, Gooroo.dev. I'd love to hear your feedback! React or reply.

gooroodev avatar May 26 '24 21:05 gooroodev

@admsev @gooroodev please don't. It's obnoxious and not very helpful.

nevans avatar May 27 '24 14:05 nevans

I remember now that the argument of ResponseParser#parse is usually ASCII-8BIT, so this change doesn't matter unless ResponseParser#parse is directly used with other encodings.

shugo avatar Aug 13 '24 05:08 shugo