OCR does not work in background

Open Moini opened this issue 8 years ago • 4 comments

When I select a document and choose to redo OCR (actually, OCR seems to run on the document for the first time at that point, see https://github.com/openpaperwork/paperwork/issues/722), and then switch to another document, OCR of the first document stops. It only resumes when I reselect that document (at least, that is how it looks to me).

This is with paperwork 1.2.2 (flatpak) on Linux Mint 18.3.

Moini, Dec 15 '17 20:12

Nah, don't worry, it keeps working in the background. In fact, Paperwork has no way at all to suspend or stop the OCR (there is no API in Pyocr for that). You can check this by running top in another terminal (or gnome-system-monitor, for instance). You should see a tesseract process using 100% of one of your CPUs.

jflesch, Dec 15 '17 23:12

I get a different thing in top:

Click on document, select redo ocr, confirm. tesseract starts to run. Click on second document. tesseract still runs for a couple of seconds. Then stops. Re-select first document, tesseract starts running again. Animation with pages changing size and paper plane again. Only then tesseract finishes.

Moini, Dec 15 '17 23:12

Does that happen right after you tried importing documents, or did you restart Paperwork in the meantime?

jflesch, Dec 16 '17 22:12

I had to restart it because it wouldn't show any documents (edit: except for the documentation PDF), probably due to that 'too many open files' issue.

Moini, Dec 16 '17 23:12