Heart symbol not processed correctly

Open dhoelzl opened this issue 2 years ago • 5 comments

The symbol "\u200D\u2764\uFE0F\u200D" seems to be processed incorrectly. I can string together any number of copies of that symbol and the result always counts as one grapheme, until the chain is interrupted by another character.

splitter.countGraphemes("x\u200D\u2764\uFE0F\u200Dx\u200D\u2764\uFE0F\u200D\u200D\u2764\uFE0F\u200D\u200D\u2764\uFE0F\u200Dx") === 3

(I would expect 7)

dhoelzl avatar Jul 22 '22 12:07 dhoelzl

the example you've given is not a symbol, it is a symbol surrounded by zero-width-joiner code points. the specific combinations you build with it may or may not be valid or defined by specific implementations or Unicode versions, but as an abstract concept, "stringing together" an endless zero-width-joiner sequence does in fact indicate just one grapheme. that's the whole purpose of the zero-width joiner.

anonghuser avatar Dec 11 '23 11:12 anonghuser

I don't know exactly how zero-width joiners work; the only thing I know is that a browser renders this string as

x‍❤️‍x‍❤️‍‍❤️‍‍❤️‍x

where I visually count 7 graphemes.

dhoelzl avatar Dec 11 '23 11:12 dhoelzl

try and select them one by one in the browser. in mine, i can't. i have three parts i can select.

anonghuser avatar Dec 11 '23 14:12 anonghuser

In mobile Safari, i can select 7 distinct items.

ljharb avatar Dec 11 '23 15:12 ljharb

On Chrome 114 I can only select x‍❤️‍x‍❤️‍‍❤️‍‍❤️‍x as 3 segments (x‍❤️‍, x‍❤️‍‍❤️‍‍❤️‍, and x)

coder0107git avatar Jan 30 '24 21:01 coder0107git
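
For anyone who wants to reproduce the comparison outside of text selection, here is a minimal sketch. It lists the code points of the reported sequence and asks the runtime's built-in `Intl.Segmenter` how many grapheme clusters it sees. The helper names `codePoints` and `segmentGraphemes` are made up for this example, and it assumes a runtime that ships `Intl.Segmenter`. The count it prints depends on the Unicode version the runtime implements, so it may not match what grapheme-splitter reports.

```javascript
// Sketch only: helper names are hypothetical, not part of grapheme-splitter.
// Assumes Intl.Segmenter is available (recent Node.js and browsers).
const heart = "\u200D\u2764\uFE0F\u200D";

// List each code point of a string as U+XXXX.
function codePoints(str) {
  return Array.from(str, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// Split a string into grapheme clusters using the runtime's own rules.
function segmentGraphemes(str) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return Array.from(segmenter.segment(str), (s) => s.segment);
}

// Same string as in the original report.
const reported = "x" + heart + "x" + heart + heart + heart + "x";

console.log(codePoints(heart));                 // joiner, heart, variation selector, joiner
console.log(segmentGraphemes(reported).length); // count depends on the runtime
```

Comparing this count against `splitter.countGraphemes(reported)` shows whether the two implementations disagree on the same input.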