yomichan-import

Various 広辞苑 bugs

Open Thermospore opened this issue 3 years ago • 2 comments

  1. over a thousand entries with �� or as the headword
  2. some headwords need this boxed "A" thing removed (screenshot)
  3. same bug as number 3 in issue #27 (screenshot)
  4. there are a lot of broken looking entries with a ○ at the start of the headword

Thermospore • Mar 12 '21 08:03

I'm new to the EPWING format. Guessing number 1 is caused by those charming image fonts 🙂 I'm willing to help map them out. Looks like 広辞苑 has a shit ton though. Maybe bulk OCR, then manually confirm one by one? (screenshot)

Thermospore • Mar 12 '21 09:03

Ah yes, that would be the image fonts. The problem is they don't necessarily have to correspond to things you would find in fonts (most are normal characters, but there are random exceptions for symbols). The process of mapping them out often includes finding reasonable substitutions for glyphs that don't exist. Help mapping the missing ones would be much appreciated!

FooSoft • Mar 12 '21 21:03
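
For anyone picking this up, the mapping process discussed in this thread could be sketched roughly as below. This is only an illustration, not how yomichan-import actually does it: the `{{w_NNNNN}}` placeholder format, the codes, and the names `wideGaiji` and `replaceGaiji` are all assumptions made up for the example. The idea is a lookup table from image-font code to a substitute string (a real character, a stand-in symbol, or nothing), plus a tally of unmapped codes so they can be reviewed one by one.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// gaijiPattern matches image-font placeholders such as {{w_41531}}.
// The placeholder format is an assumption made for this sketch.
var gaijiPattern = regexp.MustCompile(`\{\{w_(\d+)\}\}`)

// wideGaiji maps an image-font code to its substitute string.
// All codes and substitutions here are hypothetical examples.
var wideGaiji = map[int]string{
	41531: "鷗", // glyph that exists as a normal character
	42017: "→", // symbol replaced by a reasonable stand-in
	42020: "",  // decorative mark with no useful equivalent, dropped
}

// replaceGaiji substitutes every known placeholder in text. Unknown codes
// become U+FFFD and are counted in missing so they can be mapped later.
func replaceGaiji(text string, table map[int]string, missing map[int]int) string {
	return gaijiPattern.ReplaceAllStringFunc(text, func(m string) string {
		parts := gaijiPattern.FindStringSubmatch(m)
		code, err := strconv.Atoi(parts[1])
		if err != nil {
			return m // leave anything unparsable untouched
		}
		if sub, ok := table[code]; ok {
			return sub
		}
		missing[code]++
		return "\uFFFD"
	})
}

func main() {
	missing := map[int]int{}
	fmt.Println(replaceGaiji("{{w_41531}}外 {{w_99999}}", wideGaiji, missing))

	// List unmapped codes in a stable order for manual review.
	codes := make([]int, 0, len(missing))
	for code := range missing {
		codes = append(codes, code)
	}
	sort.Ints(codes)
	for _, code := range codes {
		fmt.Printf("unmapped code %d seen %d time(s)\n", code, missing[code])
	}
}
```

Counting the unmapped codes gives a worklist ordered by code, which fits the "bulk OCR, then manually confirm" idea: OCR output would only propose candidate entries for the table, and a person still confirms each one.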