
bug in GetComponentImages

Open zdenop opened this issue 4 years ago • 6 comments

The example code does not work because there is a bug in this code (copy and paste of the wrong code?):

https://github.com/sirfz/tesserocr/blob/6d9d00ff319a3181fb9e5ab79897b6d169432a19/tesserocr.pyx#L1940-L1953

UPDATE: It looks like the problem is related to pixa_to_list. If I replace it with:

            pixa_list = pixa_to_list(pixa)
            return pixa_list

I get no output (actually, it crashes). But if I use:

            boxa_list = boxa_to_list(boxa)
            return boxa_list

I get a list of boxes...

PS: I am trying to use tesserocr on Windows 10 64-bit with Tesseract 4.1.1.
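
To narrow down a failure like this one, it can help to check what GetComponentImages actually returns for each component. The following is a minimal sketch, assuming the return shape shown in the tesserocr README: a list of (image, box, block_id, paragraph_id) tuples, where box is a dict with x, y, w and h keys. The helper names `describe_components` and `run_on_file` are hypothetical and are not part of tesserocr.

```python
def describe_components(api, level, text_only=True):
    """Summarize each component returned by api.GetComponentImages.

    `api` is expected to behave like a tesserocr PyTessBaseAPI instance
    on which SetImage has already been called.
    """
    summaries = []
    components = api.GetComponentImages(level, text_only)
    for index, (image, box, block_id, paragraph_id) in enumerate(components):
        summaries.append({
            "index": index,
            # box is assumed to be a dict with x, y, w and h keys
            "box": (box["x"], box["y"], box["w"], box["h"]),
            # a PIL image has a .size attribute; None signals a missing image
            "image_size": getattr(image, "size", None),
            "block_id": block_id,
            "paragraph_id": paragraph_id,
        })
    return summaries


def run_on_file(path):
    """Hypothetical driver: needs tesserocr and Pillow to be installed."""
    from PIL import Image
    from tesserocr import PyTessBaseAPI, RIL

    with PyTessBaseAPI() as api:
        api.SetImage(Image.open(path))
        return describe_components(api, RIL.TEXTLINE)
```

If the boxes come back but the process crashes or `image_size` is None, that points at the image conversion step rather than at the layout analysis.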

zdenop, Feb 20 '21 18:02

Hi @zdenop,

Thanks for your input. pixa_to_list calls boxa_to_list and zips the results. I'm not sure why it wouldn't work; I'll look into it.

sirfz, Feb 25 '21 17:02

Turning off error information could be dangerous: https://github.com/sirfz/tesserocr/blob/0d24cc4849289b1cdba96c694e1327cd7da95947/tesserocr.pyx#L52-L55 Maybe an option to turn it back on would help.

Anyway this problem will be solved with PR #252

zdenop, Mar 27 '21 15:03

Maybe an option to turn it back on would help.

You can turn them on by calling SetVariable('debug_file', '') after constructing your own TessBaseAPI instance. (So yes, this still prevents you from seeing errors that happen during __cinit__.)
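
As a minimal sketch of that suggestion, the helper below resets debug_file on an existing instance. It assumes `api` is a tesserocr PyTessBaseAPI object whose SetVariable(name, value) returns a bool; the function name `enable_tesseract_messages` is hypothetical.

```python
def enable_tesseract_messages(api):
    """Clear Tesseract's debug_file so messages are no longer redirected.

    Returns whatever SetVariable reports (assumed: True when the
    parameter was found and set).
    """
    # An empty string is the value suggested in this thread.
    return api.SetVariable("debug_file", "")


def open_api_with_messages():
    """Hypothetical usage: needs tesserocr to be installed."""
    from tesserocr import PyTessBaseAPI

    api = PyTessBaseAPI()
    # Anything printed while the constructor ran has already been suppressed.
    enable_tesseract_messages(api)
    return api
```

This only affects Tesseract's own output; Leptonica messages are governed separately.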

bertsky, Jul 02 '21 18:07

Anyway this problem will be solved with PR #252

So @zdenop, was the problem really just that your Leptonica was not compiled against the default libpng?

bertsky, Jul 02 '21 18:07

Yes, the problem was a missing libpng (Leptonica was not linked against it). It was "invisible" because error messages are turned off: https://github.com/sirfz/tesserocr/blob/711cbab544dbb4bd3dcf1f13aad9d0fef20fcac7/tesserocr.pyx#L53-L55

IMO there should be an option to turn them on (with a parameter) for debugging purposes.

zdenop, Jul 08 '21 08:07

IMO there should be an option to turn them on (with a parameter) for debugging purposes.

Ah, so I guess you do not mean Tesseract's debug_file parameter (which you can turn on, see above), but Leptonica's setMsgSeverity, right?

I even think this should be L_SEVERITY_ERROR instead of _NONE by default.

But there's also _EXTERNAL, which lets the user control this dynamically via the environment variable LEPT_MSG_SEVERITY (which we could default to 5 or 6 in tesserocr.pyx)...
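
To illustrate what that could look like from the user's side, here is a minimal sketch that gives LEPT_MSG_SEVERITY a default without overriding a value the user already exported. The numeric levels follow Leptonica's severity enum (5 is L_SEVERITY_ERROR, 6 is L_SEVERITY_NONE). Note the assumption: the variable only has an effect if the library calls setMsgSeverity(L_SEVERITY_EXTERNAL), which is the proposal in this comment, not confirmed behaviour of tesserocr. The helper name `default_leptonica_severity` is hypothetical.

```python
import os

# Leptonica message severity levels (names shortened from L_SEVERITY_*).
LEPT_SEVERITY = {
    "EXTERNAL": 0,  # take the level from the environment
    "ALL": 1,
    "DEBUG": 2,
    "INFO": 3,
    "WARNING": 4,
    "ERROR": 5,
    "NONE": 6,      # print no messages
}


def default_leptonica_severity(level="ERROR", environ=os.environ):
    """Set LEPT_MSG_SEVERITY only if it is not already set.

    Returns the value that ends up in the environment mapping.
    """
    value = str(LEPT_SEVERITY[level])
    # setdefault keeps a value the user exported before starting Python
    return environ.setdefault("LEPT_MSG_SEVERITY", value)
```

Such a default would have to be in place before the library reads the variable, so it would need to run before tesserocr is imported.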

bertsky, Jul 08 '21 10:07