
Issue parsing inverted (white on black) text

Open nhoffman opened this issue 9 months ago • 2 comments

Hi there - I am looking into parsing laboratory test results (unfortunately, results are often received as PDFs). Performance seems to be great except in one very specific context: a report that I'm looking at contains a critical element with white text on a black background. In this case the text is either not detected or read incorrectly. I'm a bit limited in what I can share, so this is lacking context, but here is an example of failure to detect text:

image

Incorrect results:

image image

Any suggestions on settings or pre-processing strategies that might help?
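
To make the question concrete, here is a minimal sketch of one pre-processing idea: flip dark regions to dark-on-light before running OCR. It is only an illustration, not something surya is known to provide or require. The function name `invert_dark_tiles`, the tile size, and the threshold are all hypothetical, and the image is assumed to be 8-bit grayscale stored as a list of rows.

```python
def invert_dark_tiles(pixels, tile=32, threshold=128):
    """Return a copy of an 8-bit grayscale image with dark tiles inverted.

    `pixels` is a list of rows, each a list of ints in 0..255.
    A tile whose mean intensity is below `threshold` is assumed (this is a
    heuristic, not a guarantee) to hold light text on a dark background,
    so its pixels are flipped to dark-on-light.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    out = [row[:] for row in pixels]  # copy so the input is not modified

    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Clip the tile at the image border.
            rows = range(top, min(top + tile, height))
            cols = range(left, min(left + tile, width))

            total = sum(pixels[y][x] for y in rows for x in cols)
            mean = total / (len(rows) * len(cols))

            if mean < threshold:
                for y in rows:
                    for x in cols:
                        out[y][x] = 255 - pixels[y][x]
    return out


# Tiny example: the top-left 2x2 tile is mostly black, the rest is white.
page = [
    [0, 255, 255, 255],
    [0, 0, 255, 255],
    [255, 255, 255, 255],
    [255, 255, 255, 255],
]
fixed = invert_dark_tiles(page, tile=2)
```

In practice the pixel data would come from the rendered PDF page (for example, a Pillow image converted to mode `"L"`), and the result would be converted back to an image before OCR. A tile that straddles the edge of the black box may be inverted only partly or not at all, so the tile size would need tuning against real reports.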

Thanks a lot!

nhoffman, May 24 '24 18:05