jochre
OCRed text cut too narrow - how to correct or report
In this snippet a nun went missing (the word reads "or" instead of "nor"). Most probably the text width of the scanned book was set too narrow. Should I just add the nun, or should I report such cases on GitHub?
https://archive.org/stream/nybc203972#page/n176/mode/1up
No, do not correct such cases, because the text will no longer correspond to the image within the word boundaries. It's better to report this on GitHub. Hopefully such issues will be vastly reduced in Jochre 3, where better segmentation is our first objective.