ochre About OCR_aligned and Lost or missing text

Open USTCHJY opened this issue 6 years ago • 3 comments

Hi, I'm working on OCR post-correction tasks and Ochre really helps me a lot. I still have some questions, though, and look forward to your reply. When using Ochre for OCR post-correction, we only have the OCR_input. So how can I get OCR_aligned from OCR_input without the gs (gold standard)? And if that is not possible, how do I deal with lost or missing text without aligned text? Thanks!

Jul 07 '18 02:07 USTCHJY

The task ochre performs is a supervised machine learning task. So, without a gold standard, you can't create aligned data or train a (supervised) model.
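To illustrate why alignment needs both texts, here is a minimal sketch of character-level alignment. It uses Python's standard `difflib`, which is an assumption made for illustration and not necessarily how ochre aligns text; the `align` helper and the `@` gap symbol are hypothetical names as well.

```python
from difflib import SequenceMatcher


def align(ocr, gs, gap="@"):
    """Pad two strings with a gap symbol so both end up the same length.

    Hypothetical helper for illustration only; not ochre's own code.
    Assumes the gap symbol does not occur in either input.
    """
    ocr_out, gs_out = [], []
    matcher = SequenceMatcher(None, ocr, gs, autojunk=False)
    for _tag, i1, i2, j1, j2 in matcher.get_opcodes():
        ocr_part, gs_part = ocr[i1:i2], gs[j1:j2]
        # Pad the shorter side so characters missing from one text
        # show up as gap symbols in that text's aligned version.
        width = max(len(ocr_part), len(gs_part))
        ocr_out.append(ocr_part.ljust(width, gap))
        gs_out.append(gs_part.ljust(width, gap))
    return "".join(ocr_out), "".join(gs_out)


ocr_aligned, gs_aligned = align("Tlie qick fox", "The quick fox")
print(ocr_aligned)
print(gs_aligned)
```

The gap positions are derived from the differences between the two strings, so there is nothing to align against when only the OCR text is available.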

Jul 09 '18 08:07 jvdzwaan

Sorry, maybe I didn't express myself clearly. What I mean is: after supervised training (for the training data, we must have a gold standard), how can I use the trained ochre model for actual OCR post-correction tasks? For actual tasks we usually don't have a gold standard, and we want to get corrected text that is similar to the gold standard. In that case, how can I get OCR_aligned from the raw OCR_input of the actual tasks? Thanks!

Jul 09 '18 08:07 USTCHJY

The README specifies how to use a trained model to do post correction: https://github.com/KBNLresearch/ochre#ocr-post-correction

If you want to calculate performance for this text, you'd still need to have ground truth/gold standard.
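As a rough sketch of why the ground truth is still needed for evaluation, the snippet below computes a character error rate (CER) from the Levenshtein edit distance. This is a generic metric chosen for illustration, not necessarily the one ochre reports, and `edit_distance` and `cer` are hypothetical helper names.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(text, gold):
    """Character error rate of `text` measured against the gold standard."""
    if not gold:
        raise ValueError("gold standard text must not be empty")
    return edit_distance(text, gold) / len(gold)


gold = "The quick fox"
print(cer("Tlie qick fox", gold))  # raw OCR output
print(cer("The quick fox", gold))  # corrected output
```

Both calls take the gold standard as their second argument, so without it neither the raw OCR text nor the corrected text can be scored.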

Jul 17 '18 10:07 jvdzwaan
