emojicode icon indicating copy to clipboard operation
emojicode copied to clipboard

sπŸ”‘ πŸ”β— now returns grapheme index instead of UTF-8 index

Open joeskeen opened this issue 3 years ago β€’ 4 comments

sπŸ”‘ πŸ”β— now returns grapheme index instead of UTF-8 index

Fixes #208

joeskeen avatar Nov 27 '22 05:11 joeskeen

Thanks for immediately opening a PR to fix this! Unfortunately, your implementation runs in O(n) in terms of memory as πŸ”ͺ allocates memory for each substring. Do you think you could maybe create an overload of 🎼 that takes a start index from which to compare? Then no substring would be needed.

thbwd avatar Nov 30 '22 14:11 thbwd

OK I'll look into using that approach. Thanks for the feedback!

joeskeen avatar Dec 02 '22 03:12 joeskeen

@thbwd I've started the approach you suggested, but it looks like I'm running into the same problem that caused the issue to begin with. Once I get over to the C++ side, the sStringBeginsWith code uses a memory comparison between string->characters.get(), beginning->characters.get(). There isn't enough context here to be able to determine where in the string to start comparing since we would have to get the graphemes, measure them, and then start the memory comparison at that calculated index. And since this would still be done in a loop, this code would have to be called multiple times.

I'm wondering if there would be a better approach involving passing not only the source string, but also the source string in grapheme array form, allowing that operation to only be done once.

joeskeen avatar Dec 16 '22 16:12 joeskeen

OK @thbwd I believe this is ready to be reviewed again.

joeskeen avatar Dec 16 '22 17:12 joeskeen