grapheme-splitter issues

Heart symbol not processed correctly

5

The symbol "\u200D\u2764\uFE0F\u200D" seems to be processed incorrectly. I can string together any number of copies of that symbol and the result always counts as one grapheme, until the chain is interrupted...

dhoelzl
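To make the report above easier to follow, here is a minimal sketch that decodes the escaped string into its code points. The `codePoints` helper is hypothetical and is not part of grapheme-splitter. It shows that the reported sequence starts and ends with U+200D ZERO WIDTH JOINER, so consecutive copies always put joiners between the hearts, which may be why the copies chain together.

```javascript
// Hypothetical helper (not part of grapheme-splitter):
// list the code points of a string in U+XXXX notation.
function codePoints(str) {
  // Array.from iterates by code point, so surrogate pairs stay together.
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

// The exact string quoted in the issue.
const reportedSymbol = '\u200D\u2764\uFE0F\u200D';

// U+200D ZERO WIDTH JOINER, U+2764 HEAVY BLACK HEART,
// U+FE0F VARIATION SELECTOR-16, U+200D ZERO WIDTH JOINER
console.log(codePoints(reportedSymbol));
```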

splitter.countGraphemes('👩‍🦰👩‍👩‍👦‍👦🏳️‍🌈') = 4

7

## Using emojis like 👩‍🦰👩‍👩‍👦‍👦🏳️‍🌈

```
var splitter = new GraphemeSplitter();
var graphemeCount = splitter.countGraphemes('👩‍🦰👩‍👩‍👦‍👦🏳️‍🌈');
console.log(graphemeCount)
```

## Result: `4`

stephen147
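For comparison, the same three emoji can be counted with the built-in `Intl.Segmenter`, which is a different implementation from grapheme-splitter. This is only a sketch: it assumes a runtime that ships `Intl.Segmenter` with Unicode 11 or newer data (recent Node.js and browsers), and it writes the emoji as escapes so the zero width joiners are visible. On such a runtime each ZWJ sequence should stay in one cluster.

```javascript
// The three emoji from the issue, written as escapes so the joiners are explicit.
const sample =
  '\u{1F469}\u200D\u{1F9B0}' +                               // woman, red hair
  '\u{1F469}\u200D\u{1F469}\u200D\u{1F466}\u200D\u{1F466}' + // family: woman, woman, boy, boy
  '\u{1F3F3}\uFE0F\u200D\u{1F308}';                          // rainbow flag

// Built-in segmenter (not grapheme-splitter); locale is left to the default.
const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });

// Each item yielded by segment() has a `segment` string property.
const clusters = Array.from(segmenter.segment(sample), (s) => s.segment);

console.log(clusters.length);
```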

Is there a way to prevent emojis from turning into ASCII symbols?

2

**Input:** '🙄😂❤😜✌👍'

**Output:**

[ '🙄', '😂', '❤', '😜', '✌', '👍' ] (actual)

[ '🙄', '😂', '♥', '😜', '✌', '👍' ] (what it looks like in code)

For some reason when...

skeddles
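The excerpt above is cut off before it names a cause, so the following is only a guess at one possible factor. A bare U+2764 HEAVY BLACK HEART has text presentation by default, and many environments draw it as a plain symbol unless it is followed by U+FE0F VARIATION SELECTOR-16. The sketch below uses a hypothetical helper, `withEmojiPresentation`, that appends the selector when it is missing. Whether the result actually renders as an emoji still depends on the font and terminal.

```javascript
const VS16 = '\uFE0F'; // VARIATION SELECTOR-16: requests emoji presentation

// Hypothetical helper (not part of grapheme-splitter).
// Only meaningful for characters that have a defined emoji variation
// sequence, such as U+2764; it is not a general-purpose fix.
function withEmojiPresentation(symbol) {
  return symbol.endsWith(VS16) ? symbol : symbol + VS16;
}

const textHeart = '\u2764'; // HEAVY BLACK HEART without a selector
const emojiHeart = withEmojiPresentation(textHeart);

// Code point counts before and after: 1, then 2.
console.log([...textHeart].length, [...emojiHeart].length);
```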

Emojis split up unexpectedly (e.g. https://emojipedia.org/ninja-cat/)

5

Hi there, first of all, thanks a lot for this library and the efforts you put in! I've got a scenario, where some emojis seem to be split up the...

ClemensSchneider

अनुच्छेद => अ नु च्छे द

5

अनुच्छेद should return the 4 strings ["अ", "नु", "च्छे", "द"] and not ["अ","नु","च्","छे","द"]. Basically how the cursor acts in the string. The cursor skips over the 4 characters or graphemes...

kotpal
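The difference between the two results above is the piece ending in U+094D DEVANAGARI SIGN VIRAMA, which the reporter expects to stay attached to the following consonant. As a rough illustration, the hypothetical helper below (`mergeViramaClusters`, not part of grapheme-splitter) post-processes an array of clusters and glues any piece that ends in a virama onto the next one. It is a naive sketch: it assumes the input is already split as in the issue and ignores cases such as an explicit halant.

```javascript
const VIRAMA = '\u094D'; // DEVANAGARI SIGN VIRAMA

// Hypothetical post-processing step (not part of grapheme-splitter):
// join every cluster that ends in a virama with the cluster after it.
function mergeViramaClusters(clusters) {
  const merged = [];
  let pending = '';
  for (const cluster of clusters) {
    pending += cluster;
    if (!pending.endsWith(VIRAMA)) {
      merged.push(pending);
      pending = '';
    }
  }
  // A virama at the very end has nothing to join, so keep it as is.
  if (pending) merged.push(pending);
  return merged;
}

// The five pieces reported in the issue, written as escapes.
const reportedPieces = [
  '\u0905',       // अ
  '\u0928\u0941', // नु
  '\u091A\u094D', // च्
  '\u091B\u0947', // छे
  '\u0926',       // द
];

// The third and fourth pieces are joined, leaving four clusters.
console.log(mergeViramaClusters(reportedPieces));
```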

Next - Unicode 11 support

- [x] upgrade source code to ES2017 and transpile using babel
- [x] implement UAX 29 [Extended Grapheme Clusters Segmentation](http://www.unicode.org/reports/tr29/tr29-33.html#Grapheme_Cluster_Boundaries) on Unicode 11

The change should be a breaking change...

JLHwung

Refactor plan

4

@orling Out of personal interest in Unicode, I would like to refactor this library. Here are some thoughts that came to me:

- [x] Transcribe the whole library...

JLHwung

Support for Khmer language (non spacing mark U+17D2 COENG)

8

Thanks for your lib, it is very helpful. However, I am experiencing issues with the Khmer language and the combining mark [U+17D2](https://codepoints.net/U+017D2?lang=en) (see https://r12a.github.io/scripts/khmer/block#char17D2), which is specific to the Khmer language and...

bbalet
