jochre
Latin text not rendered in OCRed text
https://ocr.yiddishbookcenter.org/contents?doc=nybc202767#page24
Latin text within the Yiddish text is not rendered, and there is no placeholder indicating that some text is missing and that the reader should consult the original scan (before preparing an e-book, quoting, etc.).
Yes, I made the mistake in the early analyses of configuring a "junk setting", which ignores text if the confidence score is too low. This means certain passages (typically in other alphabets) are simply skipped. In the newer analyses this should no longer be the case. However, I'd rather wait for the new version of Jochre to fix this, as that version should be able to handle multiple alphabets.
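To illustrate the difference between skipping low-confidence text and marking it, here is a minimal sketch in Python. It is not Jochre's code: the `Word` class, the `render_line` function, the `skip_junk` flag, the 0.5 threshold, and the placeholder wording are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with its confidence score (hypothetical structure)."""
    text: str
    confidence: float


# Hypothetical marker telling the reader to check the original scan.
PLACEHOLDER = "[unreadable: see original scan]"


def render_line(words, threshold=0.5, skip_junk=False):
    """Join recognized words into a line of text.

    With skip_junk=True, words below the threshold are silently dropped,
    which is the behavior described above for the early analyses.
    With skip_junk=False, each run of low-confidence words is replaced
    by a single placeholder, so the gap stays visible to the reader.
    """
    out = []
    for word in words:
        if word.confidence >= threshold:
            out.append(word.text)
        elif not skip_junk:
            # Collapse consecutive low-confidence words into one placeholder.
            if not out or out[-1] != PLACEHOLDER:
                out.append(PLACEHOLDER)
    return " ".join(out)
```

Calling `render_line(words, skip_junk=True)` reproduces the silent gap reported in this issue, while the default call keeps a visible marker wherever text was dropped.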
I stumbled over a misreading: when searching for מאַנש I get a result that is actually the word "Wien" in Latin letters!
Please do make Latin letters searchable and show them as Latin letters in the text. And don't give me false results when I am looking for Mansch...