
Support CJK Unified Ideographs?

Open Myonmu opened this issue 2 years ago • 1 comments

I'm currently trying to add Japanese characters to the supported character set, but I have some concerns about adding Kanji.

In Unicode, the block we are interested in is CJK Unified Ideographs, which actually contains characters from Chinese, Japanese and Korean: link

The problem is that the complete set contains 20992 characters, both common and rare. Would that cause a performance issue?
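For reference, the 20992 figure matches the code point span of the CJK Unified Ideographs block, U+4E00 to U+9FFF inclusive. A quick Python check of that arithmetic (the constant and function names are only illustrative, not anything from ink):

```python
# Inclusive code point bounds of the CJK Unified Ideographs block.
CJK_START = 0x4E00
CJK_END = 0x9FFF


def block_size(start: int, end: int) -> int:
    """Return the number of code points in an inclusive range."""
    return end - start + 1


cjk_size = block_size(CJK_START, CJK_END)
print(cjk_size)
```

This only counts code points in the range; it says nothing about how the parser performs when the range is registered.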

I managed to extract the JIS X 0208 characters (around 6000 Kanji), but using CharacterRange.Define for them is painful, as there are lots of "holes" that need to be removed with exclude:. Enumerating these characters from a file is certainly another option.
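To get a feel for how many "holes" a subset like this has, a small script can collapse a character list into contiguous code point runs and list the gaps between them. This is a rough Python sketch, not part of ink: `contiguous_ranges` and `holes` are hypothetical helper names, and the sample is a handful of code points picked for illustration rather than the actual JIS X 0208 data.

```python
def contiguous_ranges(chars):
    """Collapse characters into sorted, inclusive (start, end) code point runs."""
    points = sorted({ord(c) for c in chars})
    ranges = []
    for cp in points:
        if ranges and cp == ranges[-1][1] + 1:
            # Extends the current run by one code point.
            ranges[-1] = (ranges[-1][0], cp)
        else:
            # Starts a new run after a gap.
            ranges.append((cp, cp))
    return ranges


def holes(ranges):
    """List the code points lying between consecutive runs."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:]):
        gaps.extend(range(prev_end + 1, next_start))
    return gaps


# Illustrative sample only (assumption): six code points with gaps in between.
sample = "".join(chr(cp) for cp in (0x4E00, 0x4E01, 0x4E03, 0x4E07, 0x4E08, 0x4E09))

runs = contiguous_ranges(sample)
missing = holes(runs)
for start, end in runs:
    print(f"U+{start:04X}..U+{end:04X}")
print("holes:", [f"U+{cp:04X}" for cp in missing])
```

Run against the full character list, the number of runs versus the number of holes gives a rough idea of whether one big range with exclusions or a plain enumeration is less work to maintain.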

Finally, since characters can be shared between Chinese, Japanese and Korean in the CJK block, it might be more convenient to simply add the whole block, provided it doesn't heavily impact performance.

Attachment: an ordered set of JIS X 0208 Kanji characters, JIS0208Kanji.txt

Myonmu avatar Nov 20 '23 14:11 Myonmu

Side note: I compiled inklecate with the complete CJK range and plugged it into Inky, and it works without noticeable lag. However, Inky's autocomplete doesn't recognize these characters. Autocomplete already has trouble with Latin Extended, though, so I don't know whether that should be fixed.

Myonmu avatar Nov 20 '23 15:11 Myonmu