Raman Gupta comments

Results: 219 comments of Raman Gupta

Simulated duplex scanning with page re-ordering

It doesn't support that currently, but it wouldn't be too difficult to add.

Simulated duplex scanning with page re-ordering

Hmm, the downside of this is that the script becomes "interactive" to some extent, because it has to scan the first set of pages, and then the second. It would...
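For reference, the re-ordering step itself could look roughly like the sketch below. It is only an illustration, not code from the script: the function name `interleave_duplex` and the `backs_reversed` flag are hypothetical, and it assumes the second pass delivers the back sides in reverse order because the stack was flipped over before re-feeding.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def interleave_duplex(
    fronts: Sequence[T],
    backs: Sequence[T],
    backs_reversed: bool = True,
) -> List[T]:
    """Merge two single-sided scan passes into one duplex page order.

    fronts: pages from the first pass (odd pages: 1, 3, 5, ...).
    backs: pages from the second pass (even pages).
    backs_reversed: assumption that the stack was flipped, so the
        second pass arrives last page first (..., 6, 4, 2).
    """
    if len(fronts) != len(backs):
        raise ValueError("both passes must contain the same number of pages")

    # Put the back sides into ascending page order before merging.
    ordered_backs = list(reversed(backs)) if backs_reversed else list(backs)

    pages: List[T] = []
    for front, back in zip(fronts, ordered_backs):
        pages.append(front)
        pages.append(back)
    return pages


if __name__ == "__main__":
    first_pass = ["p1", "p3", "p5"]
    second_pass = ["p6", "p4", "p2"]
    print(interleave_duplex(first_pass, second_pass))
```

The interactive part mentioned above sits between the two passes: something has to wait for the user to flip and re-feed the stack before `second_pass` exists.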

Improve OCR layer compatibility with MacOS Preview via hocr renderer

I tried to replicate this problem on Linux with `pdfarranger`, but was unable to -- the edited PDF remained searchable. You said that switching from `pdfunite` to `gs` did *not*...

Improve OCR layer compatibility with MacOS Preview via hocr renderer

Maybe the Fujitsu software outputs a different character encoding? Can you upload a small sample scanned and OCRed with the Fujitsu software?

Improve OCR layer compatibility with MacOS Preview via hocr renderer

Any updates on this, @watou?

Improve OCR layer compatibility with MacOS Preview via hocr renderer

No further information was provided by the user, so closing. Feel free to re-open if you have additional info to provide.

Improve OCR layer compatibility with MacOS Preview via hocr renderer

@watou Thanks for the update. The information doesn't help me, unfortunately. What I would need from you ideally is an upload of a PDF scanned and OCRed via this project,...

Improve OCR layer compatibility with MacOS Preview via hocr renderer

I've replicated the issue with your (non-manager) scan, as well as another I did locally. I'll continue to do a bit of research to see if there is something simple...

Improve OCR layer compatibility with MacOS Preview via hocr renderer

I also replicated the issue by *not* using OCR in sane-scan-pdf, and instead adding the OCR layer post-scanning via [OCRmyPDF](https://ocrmypdf.readthedocs.io/en/latest/). Same issue when adding OCR via that tool using the...

Improve OCR layer compatibility with MacOS Preview via hocr renderer

Probably the easiest option is to just use ocrmypdf as an alternative to directly using tesseract. We could use this if, for example, `--ocrmypdf` were passed as an argument instead...
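A rough sketch of what such a switch might look like is below. It is an illustration under assumptions, not the project's implementation: the helper names `parse_args` and `build_ocr_command` are hypothetical, the `--ocrmypdf` flag is the one proposed above, and the sketch only builds the command line rather than running anything. It assumes `ocrmypdf` is given a PDF and `tesseract` is given a page image.

```python
import argparse
from typing import List, Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the hypothetical --ocrmypdf switch."""
    parser = argparse.ArgumentParser(description="OCR backend selection sketch")
    parser.add_argument(
        "--ocrmypdf",
        action="store_true",
        help="use ocrmypdf instead of calling tesseract directly",
    )
    return parser.parse_args(argv)


def build_ocr_command(
    use_ocrmypdf: bool,
    input_path: str,
    output_base: str,
    language: str = "eng",
) -> List[str]:
    """Return the argument list for the selected OCR backend.

    output_base is the output name without extension; both backends
    end up writing output_base + ".pdf".
    """
    if use_ocrmypdf:
        # ocrmypdf takes explicit input and output file names.
        return ["ocrmypdf", "-l", language, input_path, output_base + ".pdf"]
    # tesseract takes an output base name and appends ".pdf" itself
    # when the "pdf" config is requested.
    return ["tesseract", input_path, output_base, "-l", language, "pdf"]


if __name__ == "__main__":
    args = parse_args()
    source = "scan.pdf" if args.ocrmypdf else "page.tif"
    print(build_ocr_command(args.ocrmypdf, source, "out"))
```

Keeping the choice in one function means the rest of the scan pipeline would not need to know which backend produced the OCR layer.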