jochre

problems with columns

Open mirjam-amsterdam opened this issue 5 years ago • 3 comments

The OCR didn't manage to recognise the line breaks in a text with two columns. See screenshot (problem with text in columns).

mirjam-amsterdam, Aug 31 '19 19:08

Thanks, we're working on the training corpus and software for Jochre 3, and including Zalman Reyzen's lexicon in the training corpus for segmentation. If you find any other badly segmented works, report them here, and we'll include them in the training corpus.

urieli, Sep 02 '19 12:09

Found one, coincidentally also a Reisen, but this time Avrom: https://tinyurl.com/reisen-eybike-sheynhayt (the middle hit). It is a book with poems in two columns. It goes wrong both in the search results and in the OCRed view of the text. An example is: https://www.yiddishbookcenter.org/collections/yiddish-books/spb-nybc200201?book-page=76&book-mode=1up eybike sheynhayt   tsu a bild

mirjam-amsterdam, May 08 '20 09:05

Found another, also from Reisen's leksikon (leksikonfund00rejz).

two columns in one

mirjam-amsterdam, Jun 19 '23 13:06