ochre Working without aligned file

Open omrishsu opened this issue 7 years ago • 2 comments

Hi, I'm conducting research on OCR corpora, and I would like to use this project to evaluate how differences in the training corpus affect the quality of the post-processing. However, I have OCR files and GS files without the aligned JSON file that is needed. Is there a way to generate it (maybe with a Smith-Waterman algorithm?) or to work without it?

Thanks, Omri

Jan 14 '18 15:01 omrishsu

Thank you for your interest in ochre! Whether you need the aligned files depends on what you want to do (how you want to calculate performance). For calculating character error rate and word error rate, you don't need them. For doing word level error analysis, you need them, but if you use the workflows provided by ochre, they are generated automatically.
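To illustrate why no alignment file is needed for those two metrics, here is a minimal sketch (not ochre's own implementation) that computes character error rate and word error rate directly from a GS text and an OCR text using Levenshtein distance. The function names `edit_distance`, `cer`, and `wer` are hypothetical and chosen only for this example, and normalizing by the length of the GS text is an assumption about the convention used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def cer(gs_text, ocr_text):
    """Character error rate: character edits divided by GS length (assumed convention)."""
    if not gs_text:
        raise ValueError("gold standard text must not be empty")
    return edit_distance(gs_text, ocr_text) / len(gs_text)


def wer(gs_text, ocr_text):
    """Word error rate: word-level edits divided by the number of GS words."""
    gs_words = gs_text.split()
    if not gs_words:
        raise ValueError("gold standard text must contain at least one word")
    return edit_distance(gs_words, ocr_text.split()) / len(gs_words)


if __name__ == "__main__":
    gs = "hello world"
    ocr = "hel1o wor1d"
    print("CER:", cer(gs, ocr))
    print("WER:", wer(gs, ocr))
```

In this toy input, the OCR text differs from the GS text by two character substitutions over 11 characters, and both of its two words are wrong. Word-level error analysis, by contrast, needs to know which OCR word corresponds to which GS word, and that is where an alignment comes in.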

I am in the process of putting the workflows online and providing documentation. So, I hope you can wait a little longer.

Is your dataset publicly available? If so, I'd like to include it in my list :)

Jan 16 '18 21:01 jvdzwaan

Hi, sorry for disappearing (I was working on another research project). I've updated my question in a separate post: https://github.com/KBNLresearch/ochre/issues/4 Thanks!

Feb 24 '18 08:02 omrishsu
