grapheme-splitter

A JavaScript library that breaks strings into their individual user-perceived characters.

Results 8 grapheme-splitter issues

The symbol "\u200D\u2764\uFE0F\u200D" seems to be processed incorrectly. I can chain any number of copies of that symbol and the chain always counts as one grapheme, until it is interrupted...
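A minimal sketch for reproducing this report against an independent reference, assuming a runtime that ships `Intl.Segmenter` (recent Node.js and browsers). It does not use grapheme-splitter itself, and `unit` and `countClusters` are hypothetical names chosen for illustration. No expected counts are shown, because the result depends on the Unicode version the runtime implements.

```javascript
// The sequence from the report: ZWJ, HEAVY BLACK HEART, VS16, ZWJ.
const unit = '\u200D\u2764\uFE0F\u200D';

// Hypothetical helper: count grapheme clusters with the built-in segmenter.
function countClusters(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  return [...segmenter.segment(text)].length;
}

// Print the cluster count for chains of increasing length.
for (const copies of [1, 2, 5, 50]) {
  console.log(copies, countClusters(unit.repeat(copies)));
}
```

If the printed count stays constant as the chain grows, the runtime treats the chain as a single cluster, as described in the report. If it grows with the chain, the runtime breaks between the copies.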

## Using emojis like 👩‍🦰👩‍👩‍👦‍👦🏳️‍🌈

```
var GraphemeSplitter = require('grapheme-splitter');

var splitter = new GraphemeSplitter();
var graphemeCount = splitter.countGraphemes('👩‍🦰👩‍👩‍👦‍👦🏳️‍🌈');
console.log(graphemeCount);
```

## Result: `4`

**Input:** '🙄😂❤😜✌👍'

**Output:**
[ '🙄', '😂', '❤', '😜', '✌', '👍' ] (actual)
[ '🙄', '😂', '♥', '😜', '✌', '👍' ] (what it looks like in code)

For some reason when...
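The two hearts in this report are different code points (U+2764 and U+2665), so comparing code points rather than glyphs is one way to check whether a string was actually changed or is only being rendered differently. The sketch below uses only built-in string methods. `codePoints` is a hypothetical helper name, and the truncated report does not say which of the two cases applies.

```javascript
// Hypothetical helper: list each code point of a string as U+XXXX.
function codePoints(text) {
  return [...text].map(
    (ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

console.log(codePoints('\u2764')); // [ 'U+2764' ]  HEAVY BLACK HEART
console.log(codePoints('\u2665')); // [ 'U+2665' ]  BLACK HEART SUIT
console.log('\u2764' === '\u2665'); // false
```

Running `codePoints` on each element of the returned array shows directly whether the heart is still U+2764 after splitting.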

Hi there, first of all, thanks a lot for this library and the effort you put in! I've got a scenario where some emojis seem to be split up the...

अनुच्छेद should return the 4 strings ["अ", "नु", "च्छे", "द"] and not ["अ","नु","च्","छे","द"]. Basically how the cursor acts in the string. The cursor skips over the 4 characters or graphemes...
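A hedged sketch for checking how a given runtime segments this word, assuming `Intl.Segmenter` is available. It is an independent reference, not grapheme-splitter's own code, and `clusters` and `word` are hypothetical names. The word is written with escapes so the code points are unambiguous. Whether the conjunct च्छे stays together depends on the Unicode version the runtime implements (newer versions of UAX #29 add a rule for Indic conjuncts), so no expected output is shown.

```javascript
// The word from the report, spelled out code point by code point:
// A, NA, VOWEL SIGN U, CA, VIRAMA, CHA, VOWEL SIGN E, DA.
const word = '\u0905\u0928\u0941\u091A\u094D\u091B\u0947\u0926';

// Hypothetical helper: return the grapheme clusters of a string as an array.
function clusters(text) {
  const segmenter = new Intl.Segmenter('hi', { granularity: 'grapheme' });
  return [...segmenter.segment(text)].map((part) => part.segment);
}

const parts = clusters(word);
console.log(parts.length, parts);
```

A length of 4 matches the cursor behavior requested in the report. A length of 5 means the virama is treated as a plain combining mark and the conjunct is split after it.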

- [x] upgrade source code to ES2017 and transpile using babel
- [x] implement UAX 29 [Extended Grapheme Clusters Segmentation](http://www.unicode.org/reports/tr29/tr29-33.html#Grapheme_Cluster_Boundaries) on Unicode 11

The change should be a breaking change...

@orling Out of personal interest in Unicode, I would like to refactor this library. Here are some thoughts that came to me:

- [x] Transcribe the whole library...

Thanks for your lib, it is very helpful. However, I am experiencing issues with the Khmer language and the combining mark [U+17D2](https://codepoints.net/U+017D2?lang=en) (see: https://r12a.github.io/scripts/khmer/block#char17D2), which is specific to the Khmer language and...