makemeahanzi

How do other sites draw so many characters that aren't in these dictionaries?

Open myarcana opened this issue 2 years ago • 2 comments

How does https://www.an2.net/zi/ draw 㠭 and 麤 and other rare and complex characters that aren't in MakeMeAHanzi?

myarcana avatar Jan 07 '23 12:01 myarcana

Every character in Unicode has some visual representation. In fact, they'll have multiple variants carried over from the various character encodings that existed before Unicode. All that site is doing is rendering the glyph with some font and adding its own background image. For glyphs both in and outside Unicode, you might be able to find them on GlyphWiki.
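To check that a given character really is encoded in Unicode (so any font that covers it can render it), a minimal sketch using Python's standard `unicodedata` module follows. The helper name `describe` is made up for illustration, and whether a name is found depends on the Unicode version bundled with your Python.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character.

    `describe` is a hypothetical helper, not part of any library.
    """
    # unicodedata.name() accepts a default that is returned when the
    # bundled Unicode database has no name for the character.
    name = unicodedata.name(ch, "<no name in this Unicode database>")
    return f"U+{ord(ch):04X} {name}"


# The two characters from the question above.
for ch in "㠭麤":
    print(ch, describe(ch))
```

If a character prints with a proper name here, drawing it is only a matter of having a font with that glyph installed; no stroke data is needed.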

What I find most interesting is the origin of the decomposition data. It seems like CHISE compiled a lot of decomposition data, but for some reason the raw dataset is no longer accessible. Gavin Grover's CJK decomposition seems to have been done independently, since it uses very different composition labels. Perhaps it was done algorithmically? I don't know.

It seems like the decomposition data in MakeMeAHanzi is somewhat off. It's better in some ways and worse in others.

For 不

| type | decomposition |
| --- | --- |
| GG | 不:d/t(丆,卜) |
| CJK-IDS | 不 ⿱一③ |
| MMAH | 不 ⿱一③ |

For 严

| type | decomposition |
| --- | --- |
| GG | 严:d/s(亚,厂) |
| CJK-IDS | 严 ⿳一④厂 |
| MMAH | 严 ⿻亚厂 |

For 丂

| type | decomposition |
| --- | --- |
| GG | 丂:d/t(㇐,㇉) |
| CJK-IDS | 丂 ⿱一㇉ |
| MMAH | 丂 ⿱一? |
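To compare the CJK-IDS and MMAH entries above programmatically, it helps to parse the ideographic description sequences into trees. The following is a rough sketch, not code from any of these projects: `parse_ids` is a hypothetical helper, it only knows the twelve original Ideographic Description Characters (U+2FF0 to U+2FFB), and it treats placeholders such as `③` or `?` as opaque leaves without interpreting them.

```python
# Ideographic Description Characters, U+2FF0..U+2FFB.
# U+2FF2 and U+2FF3 take three operands; the rest take two.
TERNARY = {chr(0x2FF2), chr(0x2FF3)}
BINARY = {chr(c) for c in range(0x2FF0, 0x2FFC)} - TERNARY


def parse_ids(ids: str):
    """Parse a prefix-notation IDS string into a nested (operator, children) tree.

    Leaves are returned as single-character strings. Hypothetical helper.
    """

    def parse(i: int):
        if i >= len(ids):
            raise ValueError(f"truncated IDS: {ids!r}")
        ch = ids[i]
        if ch in TERNARY:
            arity = 3
        elif ch in BINARY:
            arity = 2
        else:
            return ch, i + 1  # plain component or placeholder
        children = []
        i += 1
        for _ in range(arity):
            child, i = parse(i)
            children.append(child)
        return (ch, children), i

    tree, end = parse(0)
    if end != len(ids):
        raise ValueError(f"trailing characters in IDS: {ids!r}")
    return tree


# The two datasets agree on 不 but disagree on 严.
print(parse_ids("⿱一③"))
print(parse_ids("⿳一④厂") == parse_ids("⿻亚厂"))
```

Comparing parsed trees rather than raw strings makes it easier to see whether two sources differ in the layout operator, in the components, or in both.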

wiogit avatar May 04 '23 10:05 wiogit

You're right! I meant stroke order when I said "draw": the site knows the stroke order and draws the strokes one by one for many, many characters that I can't find data for anywhere else online.

myarcana avatar May 22 '23 21:05 myarcana