
How to convert {{w_xxxxx}} and {{n_xxxxx}} to Unicode

Open deathrush opened this issue 5 years ago • 3 comments

According to the 外字Unicodeマップ (gaiji Unicode map) page at http://ebstudio.info/manual/EBWin4_man/0_4_5.html, a map file entry looks like hA121 u00E0; there is no 'w' or 'n' in it.

deathrush avatar Nov 17 '18 07:11 deathrush

Those are indices into the character map for the given dictionary. Yomichan-Import has code to parse these entries; you can check it out here: https://github.com/FooSoft/yomichan-import/blob/master/epwing.go#L172

Character tables have to be created for every EPWING dictionary, since certain 外字 have glyphs that would normally be rendered inside the text.
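
To illustrate the lookup described above, here is a minimal Python sketch. It is not the Yomichan-Import code (that is the Go file linked above). It assumes the number inside the marker is a decimal index, and that the per-dictionary tables are plain dicts from index to replacement string. The names replace_gaiji, narrow, and wide are hypothetical.

```python
import re

# Matches markers such as {{w_42359}} (wide) or {{n_41249}} (narrow).
MARKER = re.compile(r"\{\{([nw])_(\d+)\}\}")


def replace_gaiji(text, narrow, wide, missing="\uFFFD"):
    """Replace gaiji markers in text using per-dictionary tables.

    narrow and wide are hypothetical dicts mapping the decimal index
    from the marker to a replacement string. Indices that are not in
    the table are replaced with `missing` (U+FFFD by default).
    """
    def substitute(match):
        table = narrow if match.group(1) == "n" else wide
        return table.get(int(match.group(2)), missing)

    return MARKER.sub(substitute, text)
```

For example, replace_gaiji("{{w_42359}}", {}, {42359: "閽"}) looks up index 42359 in the wide table. Markers with no table entry are not silently dropped, so gaps in a table stay visible in the output.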

FooSoft avatar Nov 20 '18 21:11 FooSoft

"Character tables have to be created for every EPWING dictionary"

Is that what you mean by a character table?

zA577	u95BD		#	閽
zA578	u8772		#	蝲
zA579	u6A1D		#	樝
zA57B	u95AB		#	閫
zA57C	u95D0		#	闐
zA57D	u9F97		#	龗
zA57E	u5B7D		#	孽
zA621	u97DB		#	韛
zA622	u65F0		#	旰
zA623	u74EB		#	瓫

Because if that's the case, installing EBWin4 and browsing to C:\Users\username\AppData\Roaming\EBWin4\GAIJI gives you a lot of tables. There are tables for kojien, wadai, meikyou, daijirin, ...
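
For reference, a map in the format shown above could be read with a short Python sketch like the one below. It rests on assumptions that are not confirmed in this thread: that the h prefix marks narrow (half-width) gaiji and z marks wide (full-width) ones, that the second column is a single uXXXX code point, and that everything after # is a comment. Lines that do not fit this shape are skipped. The name parse_map is hypothetical.

```python
import re

# A single code point written as u followed by 4 to 6 hex digits.
CODE_POINT = re.compile(r"^u([0-9A-Fa-f]{4,6})$")


def parse_map(lines):
    """Parse EBWin4-style map lines into (narrow, wide) dicts.

    Keys are the gaiji codes as integers (0xA577 for zA577); values
    are the mapped Unicode characters. Assumes h = narrow, z = wide.
    """
    narrow, wide = {}, {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comment
        fields = line.split()
        if len(fields) < 2 or fields[0][0] not in "hz":
            continue
        match = CODE_POINT.match(fields[1])
        if match is None:
            continue  # anything other than a single uXXXX is skipped
        try:
            code = int(fields[0][1:], 16)
            char = chr(int(match.group(1), 16))
        except ValueError:
            continue  # malformed hex or out-of-range code point
        (narrow if fields[0][0] == "h" else wide)[code] = char
    return narrow, wide
```

Whether these integer keys line up with the numbers in the {{w_xxxxx}} and {{n_xxxxx}} markers (for example zA577 as 42359) is also an assumption here, and would need to be checked against a real dictionary.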

epistularum avatar Jan 14 '20 02:01 epistularum

@FooSoft I noticed that OCR of the 外字 for several major dictionaries has already been done in yomichan-import. Would you mind suggesting or sharing how the OCR can be done in batch? I would like to contribute to the repo, but I am stuck on the OCR part.

[Image: issue-ocr]

Thanks in advance!

playHing avatar Feb 22 '20 11:02 playHing