Support Inverted and non-inverted text on same image (pixAutoPhotoinvert)

Open MrChris123 opened this issue 3 years ago • 9 comments

Hi,

I find the results of black text on white background are superb (TIF images at 300dpi). However, if the image also contains some white text on a black background, that text is never read. Can you offer any suggestions for this please?

Other Tesseract articles suggest calling Leptonica to perform some pre-processing that would help, but I am unsure whether your package allows access to Leptonica.

Regards,

Chris

MrChris123 avatar Jul 02 '21 15:07 MrChris123

A subset of Leptonica is exposed via the Pix class. In your case, the first thing I'd try is taking the bitmap and inverting it so you're back to black on white.
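
As a rough illustration of what "inverting" means here, the sketch below flips every grayscale value in a tiny image held as nested lists. It is plain Python for clarity only and does not use the Pix class or Leptonica; the name `invert_gray` and the 0-255 value range are assumptions made for the example.

```python
def invert_gray(pixels, max_value=255):
    """Return a new image with every grayscale value flipped (v -> max_value - v)."""
    return [[max_value - v for v in row] for row in pixels]


# White "text" (255) on a black background (0).
white_on_black = [
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 0, 0, 0],
]

# After inversion the same shape is black "text" (0) on white (255).
black_on_white = invert_gray(white_on_black)
```

Inverting twice returns the original image, so the step loses no information; it only changes polarity.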

charlesw avatar Jul 02 '21 22:07 charlesw

Thanks Charles,

I'll read up on Leptonica and try to work out how to do this. Does this mean running the whole image through again and then doing a manual combination of the two sets of results, or would I get a single set of results?

BTW, I have read that Tesseract 5 would handle this better. Are there any plans to upgrade your package when this is officially available?

MrChris123 avatar Jul 05 '21 08:07 MrChris123

No, I'd do it as a preprocessing step.

Regarding Tesseract 5, it should be supported. Unfortunately, I've got my hands full with other priorities these days, so I can't say how long it'll take after it's released.

charlesw avatar Jul 05 '21 08:07 charlesw

Thanks for such a swift response again Charles.

Looks like I've got a lot of reading to do. I can see there is a Pix.GetData() but that appears to get the whole image data rather than a section. Am I barking up the wrong tree?

MrChris123 avatar Jul 05 '21 12:07 MrChris123

Sorry, I misread/misunderstood your issue. I'd say you'll need to first identify the section, then invert only that part. I'm not sure myself how to do that with Leptonica, though.
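
A minimal sketch of that "identify the section, then invert only that part" idea, in plain Python on nested lists. It assumes the candidate regions are already known as (left, top, width, height) rectangles and uses a simple mean-brightness test to decide which ones are white-on-black. This is a simplistic stand-in, not what Leptonica does internally, and every function name here is hypothetical.

```python
def is_mostly_dark(pixels, left, top, width, height, threshold=128):
    """True if the mean grayscale value inside the rectangle is below threshold."""
    values = [
        pixels[y][x]
        for y in range(top, top + height)
        for x in range(left, left + width)
    ]
    return sum(values) / len(values) < threshold


def invert_region(pixels, left, top, width, height, max_value=255):
    """Return a copy of the image with only the given rectangle inverted."""
    out = [row[:] for row in pixels]
    for y in range(top, top + height):
        for x in range(left, left + width):
            out[y][x] = max_value - out[y][x]
    return out


def normalize_polarity(pixels, regions):
    """Invert every region that looks like light text on a dark background."""
    out = pixels
    for left, top, width, height in regions:
        if is_mostly_dark(out, left, top, width, height):
            out = invert_region(out, left, top, width, height)
    return out


# Left half: dark marks on white. Right half: light marks on black.
page = [
    [255, 0, 255, 0, 255, 0],
    [255, 255, 255, 0, 0, 0],
]
fixed = normalize_polarity(page, [(0, 0, 3, 2), (3, 0, 3, 2)])
```

Only the right-hand rectangle is flipped, so the whole page ends up dark-on-light and can go through OCR in a single pass rather than being run twice and merged.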

charlesw avatar Jul 05 '21 20:07 charlesw

Aah, okay. Someone on a Google group pointed me at https://github.com/DanBloomberg/leptonica/blob/5aaf1c187deeef7f47288c6b0833a07021940da7/src/pageseg.c#L2370-L2391 but I'm not sure how to get Leptonica to call it. I may try to reverse engineer it and do something similar in C# before I get Tesseract involved. Or wait for Tesseract 5 :-)

MrChris123 avatar Jul 06 '21 08:07 MrChris123

Hi Charles,

Turns out Leptonica includes pixAutoPhotoinvert from v1.79, and your package is using v1.80 :-)

Is there an easy way for you to expose it?

Chris

MrChris123 avatar Jul 08 '21 16:07 MrChris123

Hi, it can most certainly be added. Do you feel game to give it a try?

Basically you'll need to:

  • Add signature to LeptonicaApi
  • Add wrapper function to Pix class
  • Add test case to the test project

charlesw avatar Jul 08 '21 20:07 charlesw

Hi. When I get some downtime I'll pull the code and see if I can build it.

As a slight aside, have you tested setting engine variables much? I have tried a few but never noticed any difference in the results.

Chris

MrChris123 avatar Jul 09 '21 08:07 MrChris123