docspell
Feature Request: "Try to get the orientation right"
As far as I know, one trick is to run OCR four times, rotating the document by 90° each time. The run with the best OCR result (most hits in a dictionary lookup?) gives the "right" orientation. Paperless-ng seems to use this trick. Is it possible to integrate this into docspell?
Thanks
lopiuh
Thanks for reporting. This is quite an expensive trick, but it is simple and of course it works. For reference, there is #554 with similar requests. Changing the orientation should be possible both automatically and manually at some point. It is possible to integrate this into docspell; I want to put more thought into it first.
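For illustration only, the trick discussed above could be sketched roughly like this in Python. This is not how docspell or Paperless-ng implement it; `best_orientation`, `dictionary_hits`, and the `rotate` and `ocr` callables are hypothetical names, and the scoring (counting dictionary hits) is just the heuristic guessed at in the request.

```python
from typing import Any, Callable, Sequence, Set, Tuple


def dictionary_hits(text: str, dictionary: Set[str]) -> int:
    """Count whitespace-separated tokens that appear in the dictionary."""
    # Strip common punctuation and lowercase before the lookup.
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return sum(1 for w in words if w in dictionary)


def best_orientation(
    page: Any,
    rotate: Callable[[Any, int], Any],  # hypothetical: returns the rotated page
    ocr: Callable[[Any], str],          # hypothetical: returns recognized text
    dictionary: Set[str],
    angles: Sequence[int] = (0, 90, 180, 270),
) -> Tuple[int, str]:
    """Run OCR once per angle and keep the angle with the most dictionary hits."""
    best_angle, best_text, best_score = angles[0], "", -1
    for angle in angles:
        text = ocr(rotate(page, angle))
        score = dictionary_hits(text, dictionary)
        # Strictly greater: on a tie the earlier angle wins.
        if score > best_score:
            best_angle, best_text, best_score = angle, text, score
    return best_angle, best_text
```

In practice `rotate` and `ocr` could be backed by something like Pillow's `Image.rotate` and pytesseract's `image_to_string`, but that pairing is an assumption. The four OCR passes per page are what makes the approach expensive.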
Thanks. I manually changed the orientation of a PDF with "pdf arranger" on Linux and, interestingly, the OCR result was not better. Maybe the tool does not really rotate the image but only changes a stored "orientation" value. Scanning the same document in the correct orientation gives correct OCR, by the way. Maybe there is a flag to tell the OCR tools to use the orientation information saved in the PDF (if my reasoning is right).
UPDATE 21-07-07: I tested again with another PDF: OCR was done correctly after rotating it with pdf arranger (Linux). I don't know what the problem was with the first sample, which paperless managed to OCR but docspell did not...
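On the "stored orientation" guess: PDF page dictionaries can carry a `/Rotate` entry (a multiple of 90 degrees) that tells viewers how to display the page without touching the image data. Below is a rough, standard-library-only sketch for checking whether a file contains such entries. The helper name `stored_rotations` is hypothetical, and the raw byte scan is an assumption that only holds when the page dictionaries are stored uncompressed; a real PDF library would be more reliable.

```python
import re
from typing import List

# Matches "/Rotate <integer>" in the raw bytes of a PDF file.
_ROTATE = re.compile(rb"/Rotate\s+(-?\d+)")


def stored_rotations(pdf_bytes: bytes) -> List[int]:
    """Return every /Rotate value found, normalized to the range 0..359.

    Assumption: page dictionaries are not inside compressed object
    streams, otherwise this scan will not see them.
    """
    return [int(m.group(1)) % 360 for m in _ROTATE.finditer(pdf_bytes)]
```

For example, `stored_rotations(open("scan.pdf", "rb").read())` (with `scan.pdf` as a placeholder file name) returning a non-empty list would suggest that the rotation is stored as metadata rather than applied to the image.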