chars
Allow effective searching for flags and other zwj-joined symbols
Turns out we can't find, e.g., the transgender flag (new in Unicode 13!). Its codepoints are:
U+1F3F3
U+FE0F
U+200D
U+26A7
U+FE0F
...meaning we can only find the constituent codepoints, but not the sequence as a whole. That's a problem for all kinds of flags, family configurations and other glyphs composed of multiple codepoints.
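To make the problem concrete, here is a minimal sketch that assembles the five codepoints listed above into one string and shows that iterating over it yields five separate `char`s. The names `TRANSGENDER_FLAG`, `string_from_codepoints` and `contains_zwj` are made up for illustration and are not part of chars.

```rust
/// Code points of the transgender flag, in the order listed above.
const TRANSGENDER_FLAG: [u32; 5] = [0x1F3F3, 0xFE0F, 0x200D, 0x26A7, 0xFE0F];

/// Builds a `String` from raw code point values (hypothetical helper).
/// Returns `None` if any value is not a valid Unicode scalar value.
fn string_from_codepoints(codepoints: &[u32]) -> Option<String> {
    codepoints.iter().map(|&cp| char::from_u32(cp)).collect()
}

/// Returns true if the string contains a zero-width joiner (U+200D),
/// which is what glues the parts of a ZWJ sequence together.
fn contains_zwj(s: &str) -> bool {
    s.chars().any(|c| c == '\u{200D}')
}
```

Calling `string_from_codepoints(&TRANSGENDER_FLAG)` gives a string that renders as one glyph where the font supports it, but `.chars().count()` on it is 5. A lookup keyed on a single `char` therefore only ever sees the parts.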
The sequences have names, so we ought to be able to retrieve them.
There are lists of Unicode emoji sequences and emoji ZWJ sequences here:
https://www.unicode.org/Public/emoji/13.0/emoji-sequences.txt and https://www.unicode.org/Public/emoji/13.0/emoji-zwj-sequences.txt. I imagine we have to integrate these into chars_data as a separate data set.
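As a rough idea of what reading those files could look like: data lines have the shape `codepoints ; type ; name # comment`. The sketch below parses one such line. `NamedSequence` and `parse_sequence_line` are hypothetical names, not existing chars_data code, and the range entries (`231A..231B`) that appear in emoji-sequences.txt are skipped here for simplicity.

```rust
/// One named sequence parsed from a data line (hypothetical type).
#[derive(Debug, PartialEq)]
struct NamedSequence {
    codepoints: Vec<char>,
    type_field: String,
    name: String,
}

/// Parses a single line of the emoji sequence data files.
/// Returns `None` for blank lines, comment lines, range entries and
/// anything that does not have three `;`-separated fields.
fn parse_sequence_line(line: &str) -> Option<NamedSequence> {
    // Everything after the first '#' is a comment.
    let data = line.split('#').next()?.trim();
    if data.is_empty() {
        return None;
    }

    let mut fields = data.split(';').map(str::trim);
    let codepoints_field = fields.next()?;
    let type_field = fields.next()?;
    let name = fields.next()?;

    // Assumption of this sketch: ranges like `231A..231B` are not handled.
    if codepoints_field.contains("..") {
        return None;
    }

    // Each code point is a bare hexadecimal number separated by spaces.
    let codepoints = codepoints_field
        .split_whitespace()
        .map(|hex| u32::from_str_radix(hex, 16).ok().and_then(char::from_u32))
        .collect::<Option<Vec<char>>>()?;
    if codepoints.is_empty() {
        return None;
    }

    Some(NamedSequence {
        codepoints,
        type_field: type_field.to_string(),
        name: name.to_string(),
    })
}
```

Feeding it a line such as `1F3F3 FE0F 200D 26A7 FE0F ; RGI_Emoji_ZWJ_Sequence ; transgender flag # E13.0` would yield the five codepoints together with the name `transgender flag`, which is the part we currently can't search for.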
I think the internal character representation will have to grow into an enum (or get another variant with the sequence representation) - with adjusted display functions to go with it.
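A minimal sketch of what such an enum might look like, assuming one variant for a single codepoint and one for a named sequence. `Described`, its variants and `codepoint_list` are hypothetical names chosen for this example; the actual representation in chars may end up quite different.

```rust
use std::fmt;

/// Hypothetical representation: either one code point or a named sequence.
#[derive(Debug, Clone, PartialEq)]
enum Described {
    Single(char),
    Sequence { codepoints: Vec<char>, name: String },
}

impl Described {
    /// Formats the code points as `U+XXXX`, separated by spaces.
    fn codepoint_list(&self) -> String {
        let cps: Vec<char> = match self {
            Described::Single(c) => vec![*c],
            Described::Sequence { codepoints, .. } => codepoints.clone(),
        };
        cps.iter()
            .map(|c| format!("U+{:04X}", *c as u32))
            .collect::<Vec<_>>()
            .join(" ")
    }
}

impl fmt::Display for Described {
    /// Writes the glyph itself: one char, or all chars of the sequence.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Described::Single(c) => write!(f, "{}", c),
            Described::Sequence { codepoints, .. } => {
                for c in codepoints {
                    write!(f, "{}", c)?;
                }
                Ok(())
            }
        }
    }
}
```

With this shape, the display code can match on the variant: a `Single` prints as today, while a `Sequence` prints its name once and then lists every constituent codepoint.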