ogonek
A C++11 library for Unicode
Implement the [GSM 03.38](http://www.unicode.org/Public/MAPPINGS/ETSI/GSM0338.TXT) encoding form.
Implement the Unicode collation algorithm as per [UAX #10](http://www.unicode.org/reports/tr10/).
Implement a tool for importing the data from the [Default Unicode Collation Element Table](http://www.unicode.org/Public/UCA/6.2.0/allkeys.txt).
Implement the line breaking algorithm as per [UAX #14](http://www.unicode.org/reports/tr14/).
Implement the sentence boundary algorithm as per [UAX #29](http://www.unicode.org/reports/tr29/#Sentence_Boundaries).
When using encode_one and decode_one, stateful encodings can end up with unfinished code units/code points that won't be in the result, but instead will be stored in the state for...
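To illustrate the situation described above, here is a minimal sketch of a one-unit-at-a-time decoder whose state holds an unfinished sequence. It uses UTF-8 as the example encoding. The names `utf8_state`, `has_unfinished`, and the `decode_one` signature shown here are hypothetical and chosen for illustration only; they are not ogonek's actual API.

```cpp
#include <cstdint>

// Hypothetical state type (an assumption for this sketch, not ogonek's):
// it remembers the partially decoded code point between calls.
struct utf8_state {
    std::uint32_t partial = 0; // bits accumulated so far
    int pending = 0;           // continuation bytes still expected
};

// Feeds a single code unit to the decoder.
// Returns true and writes `out` when a code point is complete;
// returns false when the unit was only stored in the state.
// Validation (overlong forms, surrogates, malformed continuation
// bytes) is deliberately omitted to keep the sketch short.
bool decode_one(unsigned char unit, utf8_state& st, char32_t& out) {
    if (st.pending == 0) {
        if (unit < 0x80) { out = unit; return true; }
        if ((unit & 0xE0) == 0xC0)      { st.partial = unit & 0x1F; st.pending = 1; }
        else if ((unit & 0xF0) == 0xE0) { st.partial = unit & 0x0F; st.pending = 2; }
        else if ((unit & 0xF8) == 0xF0) { st.partial = unit & 0x07; st.pending = 3; }
        else { out = 0xFFFD; return true; } // stray byte: emit U+FFFD
        return false;
    }
    // Append the six payload bits of a continuation byte.
    st.partial = (st.partial << 6) | (unit & 0x3F);
    if (--st.pending == 0) {
        out = st.partial;
        st.partial = 0;
        return true;
    }
    return false;
}

// True when the input ended in the middle of a sequence, i.e. there is
// data in the state that never made it into the result.
bool has_unfinished(const utf8_state& st) {
    return st.pending != 0;
}
```

If the input stops after a lead byte such as `0xC3`, `decode_one` has returned `false` and nothing has reached the output; only a check like `has_unfinished` reveals that the state still holds data, so a caller needs some way to flush or report it.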