Uwe Hartwig
Hello, is there any news regarding this issue? It even seems that if a file gets copied via plain `shutil.copy`, modifications to the copy are mirrored on the source file, which...
Sorry, I didn't notice your latest remarks at first! Anyway, it's only a proposal. If you can't go with it, that's okay; just leave it out.
Additionally, I do use the information from physical containers. We often have custom-labeled containers like `Leerseite` or `Colorchecker` ( :slightly_smiling_face: ) in this area.
If an image has been skipped due to a logical/physical mismatch, no FULLTEXT exists, and nothing is linked in the physical container either.
It analyzes the METS and filters images by defined labels like the logical ones from DFG-structset like `cover_front` and `cover_back` and custom physical annotations like `Colorchecker`, `Leerseite`, `Illustration` and...
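To make the filtering idea a bit more concrete, here is a minimal sketch of how such a label filter could look. It is not the actual implementation: the function name `pages_to_skip`, the two label sets, and the tiny sample METS are assumptions for illustration only, and it assumes the usual layout with a `LOGICAL` and a `PHYSICAL` `structMap` connected via `smLink`.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}
XLINK = "{http://www.w3.org/1999/xlink}"

# Assumed label sets, taken from the examples named in the comment above.
SKIP_LOGICAL_TYPES = {"cover_front", "cover_back"}
SKIP_PHYSICAL_LABELS = {"Colorchecker", "Leerseite", "Illustration"}


def pages_to_skip(mets_xml):
    """Return the IDs of physical divs that should be left out."""
    root = ET.fromstring(mets_xml)
    skipped = set()

    # 1. Physical containers carrying one of the custom labels.
    for div in root.iterfind(".//mets:structMap[@TYPE='PHYSICAL']//mets:div", NS):
        if div.get("LABEL") in SKIP_PHYSICAL_LABELS:
            skipped.add(div.get("ID"))

    # 2. Logical containers whose TYPE is on the skip list ...
    logical_ids = {
        div.get("ID")
        for div in root.iterfind(".//mets:structMap[@TYPE='LOGICAL']//mets:div", NS)
        if div.get("TYPE") in SKIP_LOGICAL_TYPES
    }
    # ... resolved to physical containers via the structLink section.
    for link in root.iterfind(".//mets:structLink/mets:smLink", NS):
        if link.get(XLINK + "from") in logical_ids:
            skipped.add(link.get(XLINK + "to"))

    return skipped


# Hypothetical, heavily reduced METS just to exercise the function.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                       xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:structMap TYPE="LOGICAL">
    <mets:div ID="LOG_0000" TYPE="monograph">
      <mets:div ID="LOG_0001" TYPE="cover_front"/>
      <mets:div ID="LOG_0002" TYPE="chapter"/>
    </mets:div>
  </mets:structMap>
  <mets:structMap TYPE="PHYSICAL">
    <mets:div ID="PHYS_0000" TYPE="physSequence">
      <mets:div ID="PHYS_0001" TYPE="page" ORDER="1"/>
      <mets:div ID="PHYS_0002" TYPE="page" ORDER="2" LABEL="Colorchecker"/>
      <mets:div ID="PHYS_0003" TYPE="page" ORDER="3"/>
    </mets:div>
  </mets:structMap>
  <mets:structLink>
    <mets:smLink xlink:from="LOG_0001" xlink:to="PHYS_0001"/>
    <mets:smLink xlink:from="LOG_0002" xlink:to="PHYS_0003"/>
  </mets:structLink>
</mets:mets>"""

print(sorted(pages_to_skip(SAMPLE)))
```

In this sample the front cover page and the `Colorchecker` page are reported, while the chapter page is kept. Real METS files may label containers differently, so the two sets would need adjusting.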
The main question seems to be how to balance convenient, short feedback when ocr-d is used as a framework against external usage as a library. Other topics: * How to remove...
IIRC, in the GT discussions it was said that these chars are normalized, and that punctuation is tied to the preceding char without extra space. Maybe @tboenig can shed more light...
From a knowledge archaeology point of view it is good enough to have at least some OCR results, rather than losing a whole document with several thousand pages. One could even...
@tboenig No, we ~~are~~ try to differentiate between relatively simple newspaper layouts and really tricky ones for announcements and table-like data (timetables, stock news). We are using this at...
When working with Transkribus-SWT to generate GT, my colleagues and I repeatedly ran into trouble because we forgot to synchronize text line and word contents. The major...