Results: 538 comments by Amit Dovev

@tbadran > But please note that words are not reversed while viewing the PDF because it contains the original image with text layer. > I mean when you copy text...

@roozgar You can try training Tesseract using the regular engine. Use the wiki and see #169. I really don't know how good the result will be for Arabic. Like...

Tom, Look at the original jpg. Lines 2 and 4 in Google Chrome look quite similar to lines 2 and 3 in the original jpg. First word in line 3...

Again, in Google Chromium. If I mark the first two lines in the PDF + first word in line 3, copy the (invisible) text, paste it to a text file,...

@jbreiden I didn't understand you. In one comment you talk about Hebrew, and in another you refer only to Arabic. Is Hebrew displayed correctly in Adobe Reader?

Please make sure that any change you make does not cause a regression in the Chrome PDF viewer or OS X Preview. Thanks for your work!

Maybe explicitly using Unicode bidi control characters can help?
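
To make the suggestion concrete, here is a minimal sketch of what wrapping a line in explicit bidi controls could look like. It is not taken from Tesseract's PDF renderer; the helper names `is_rtl` and `wrap_rtl` are hypothetical, and whether a given PDF viewer honors these characters when copying text is an open question, not something this sketch establishes.

```python
import unicodedata

# Unicode bidirectional control characters (defined in UAX #9).
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def is_rtl(text):
    """Hypothetical helper: True if the first strong character is RTL."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew is "R", Arabic is "AL"
            return True
        if bidi_class == "L":
            return False
    return False  # no strong character found; treat as LTR


def wrap_rtl(line, use_isolates=False):
    """Hypothetical helper: wrap an RTL line in explicit bidi controls.

    LTR lines are returned unchanged.
    """
    if not is_rtl(line):
        return line
    if use_isolates:
        return RLI + line + PDI
    return RLE + line + PDF
```

For example, `wrap_rtl("שלום")` returns the word enclosed in RLE and PDF, while `wrap_rtl("hello")` returns the string unchanged. Isolates (RLI/PDI) are the newer mechanism; embeddings (RLE/PDF) are older and may be more widely supported, which is why the sketch exposes both.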

@jbreiden, any progress? Which approach did you choose? Personally, I care about our Hebrew support.

Hi @JKamlah, Leptonica has some built-in grayscale normalization functions; maybe we can use them too. https://github.com/DanBloomberg/leptonica/blob/0ffbc6822c23725b5b9f6876e2620a22ba3689f4/src/adaptmap.c Here are some examples that demonstrate how to use them to improve thresholding using...