Tesseract output improvement

Open. halfguru opened this issue 5 years ago • 1 comment

Hi,

First of all, thank you for your work. I was looking for OCR projects because it's very difficult to find English subtitles for Chinese YouTube shows.

I'm wondering whether you've tried improving the Tesseract output with different image processing techniques, as illustrated here. The use_fullframe argument could be changed to accept specific rectangular coordinates for the subtitle region. Also, the Tesseract wiki indicates that dark text on a light background is preferable, so an option to invert the colors could be helpful. Binarisation could further isolate the subtitles. Finally, I believe adding the --psm 6 option to the Tesseract config, which tells it to assume a single uniform block of text, would be beneficial.
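To make the suggestions concrete, here is a minimal sketch of the preprocessing, not videocr's actual code. It assumes each frame is a NumPy uint8 array, as OpenCV would return, and that the subtitles are light text on a darker picture. The helper names (`to_gray`, `otsu_threshold`, `preprocess`, `ocr_subtitle`), the `box` parameter, and the `chi_sim` language choice are hypothetical and only for illustration. Otsu's method is just one possible way to binarise.

```python
import numpy as np


def to_gray(frame):
    """Collapse a colour frame to 8-bit grayscale with a plain channel mean."""
    if frame.ndim == 2:
        return frame.astype(np.uint8)
    return frame.mean(axis=2).astype(np.uint8)


def otsu_threshold(gray):
    """Return the Otsu threshold t; pixels with value > t count as foreground."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    cum_count = np.cumsum(hist)
    cum_sum = np.cumsum(hist * np.arange(256))
    w0 = cum_count                  # pixel count at or below each level
    w1 = cum_count[-1] - cum_count  # pixel count above each level
    valid = (w0 > 0) & (w1 > 0)
    mu0 = cum_sum / np.maximum(w0, 1)
    mu1 = (cum_sum[-1] - cum_sum) / np.maximum(w1, 1)
    # Between-class variance (up to a constant factor); maximise over levels.
    between = np.where(valid, w0 * w1 * (mu0 - mu1) ** 2, 0.0)
    return int(np.argmax(between))


def preprocess(frame, box, invert=True):
    """Crop to box = (x, y, width, height), binarise, optionally invert.

    With invert=True, light subtitles on a dark picture come out as
    black text on a white background.
    """
    x, y, w, h = box
    gray = to_gray(frame[y:y + h, x:x + w])
    mask = gray > otsu_threshold(gray)
    binary = np.where(mask, 255, 0).astype(np.uint8)
    return 255 - binary if invert else binary


def ocr_subtitle(image, lang="chi_sim"):
    """Run Tesseract on the cleaned image, treating it as one block of text."""
    import pytesseract  # imported lazily; needs Tesseract and its language data

    return pytesseract.image_to_string(image, lang=lang, config="--psm 6")
```

Usage would be something like `ocr_subtitle(preprocess(frame, (0, 600, 1280, 120)))`, where the box coordinates are made up and would need to match where the subtitles sit in the video.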

halfguru, Oct 01 '19 20:10

@halfguru These are really good insights. In the year since you posted this, have you found any better solutions? I have the same use case as you (reading Chinese soft captions).

mongy910, Sep 22 '20 07:09