Sandro Mani

Results 119 comments of Sandro Mani

Thank you for working on this! I haven't yet looked in depth at flatpak, so currently I'm not really able to participate, but if there are specific issues, I'm happy...

I fear this is a general issue with PoDoFo and complex scripts, or rather, more work is needed to have PoDoFo handle these correctly.

Actually, isn't it just a matter of picking the right font? I tried with a test image you sent me a while ago, installed the Lohit Devanagari font, selected that...

Ah I see. Do you have any idea how tesseract handles this?

Yeah I read the same thread - as I read it, PoDoFo isn't capable of handling it for you, but it should be possible to handle it with custom code...

But looking at the tesseract source, in particular [pdfrenderer.cpp](https://github.com/tesseract-ocr/tesseract/blob/master/api/pdfrenderer.cpp), I see no traces of pango or harfbuzz. It would be sufficient to figure out the low-level blocks that tesseract adds...

Okay I'll take a look when I find a moment.

@Shreeshrii I've added a QPrinter backend for PDF export, please give it a try.

Here you go:
- 32 bit: https://smani.fedorapeople.org/tmp/gImageReader_3.2.3_qt5_i686.exe
- 64 bit: https://smani.fedorapeople.org/tmp/gImageReader_3.2.3_qt5_x86_64.exe

1. Correct, hOCR is always page-based (due to the nature of the hOCR format). While clearly a subset of a document can also be seen as an hOCR page,...