Improve detection accuracy

Open dynobo opened this issue 6 years ago • 1 comment

Most of the detection quality depends on Tesseract OCR and the input image.

But there are also options/ideas within the scope of this project:

  • Automatically preprocess screenshot before feeding into OCR
  • Tune Tesseract OCR parameters
  • Apply heuristics to correct common OCR mistakes
  • Improve "Magics" in detecting & transforming the OCR results
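
To make the heuristics idea more concrete, here is a minimal sketch of one possible post-processing step, assuming the OCR output is available as a plain string. The names `DIGIT_LOOKALIKES`, `fix_numeric_token` and `correct_ocr_text` are hypothetical and not part of normcap, and the lookalike table is only an illustrative guess at common confusions, not a measured list.

```python
import re

# Hypothetical table: letters that OCR may confuse with digits.
DIGIT_LOOKALIKES = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)


def fix_numeric_token(token: str) -> str:
    """Replace lookalike letters, but only in tokens that are mostly digits."""
    digits = sum(ch.isdigit() for ch in token)
    letters = sum(ch.isalpha() for ch in token)
    if letters > 0 and digits > letters:
        return token.translate(DIGIT_LOOKALIKES)
    return token


def correct_ocr_text(text: str) -> str:
    """Apply the token heuristic, then tidy up repeated spaces."""
    # Split on whitespace but keep the separators, so line breaks survive.
    parts = re.split(r"(\s+)", text)
    fixed = "".join(fix_numeric_token(part) for part in parts)
    # Collapse runs of spaces/tabs into one space; newlines are left alone.
    return re.sub(r"[ \t]{2,}", " ", fixed).strip()


if __name__ == "__main__":
    print(correct_ocr_text("Total: 1O5.5O  EUR"))
```

Restricting the substitution to tokens where digits outnumber letters is a deliberate trade-off: it leaves ordinary words untouched, at the cost of missing tokens that are mostly misread.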

I'm happy about any ideas or pull requests on this topic!

dynobo, Dec 21 '19 10:12

The awesome OCRmyPDF uses these optimizations; maybe they can serve as inspiration.

sojusnik, Sep 12 '21 22:09