
In .NET Core, Tesseract is not able to read text that is inside a box

Open Abhijeet501 opened this issue 5 years ago • 2 comments

Hi Charles, with the help of your .NET code samples I was able to create a searchable PDF in .NET Core. It works fine when I use an image with plain text, but when I use an image that has rectangular boxes with the text inside them, Tesseract is not able to detect the text inside the boxes. What would you recommend?

Thanks in advance for the help.

Abhijeet501, Aug 07 '19 21:08

Once upon a time I did a PR for line removal. It helps with that situation.

tdhintz, Aug 07 '19 21:08

@Abhijeet501 The PR tdhintz is referring to is https://github.com/charlesw/tesseract/pull/369 and should help your situation.

I'm a little hesitant to add it to the wrapper itself as I'd like to keep the wrapper as simple as possible to minimize support requirements due to my own time constraints. This would be a good (3rd party) extension though.
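
For anyone wondering what line removal means in practice, here is a rough sketch of the general idea. It is not the code from the PR and it does not use the wrapper's API. It is written in plain Python so it stays self-contained. The function name `remove_long_lines` and the `min_run` parameter are made up for illustration, and the input is assumed to be an already binarized bitmap (1 = ink, 0 = background). The idea is that box borders show up as very long straight runs of ink, while characters only produce short runs, so erasing the long runs leaves the text behind.

```python
def remove_long_lines(bitmap, min_run):
    """Return a copy of a binary bitmap (1 = ink, 0 = background) in which
    every horizontal or vertical run of ink at least `min_run` pixels long
    has been erased. Hypothetical helper, for illustration only."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    # Runs are detected on the original bitmap and only marked here, so
    # erasing a horizontal line cannot shorten a vertical line it crosses.
    erase = [[False] * width for _ in range(height)]

    # Mark long horizontal runs, row by row.
    for y in range(height):
        x = 0
        while x < width:
            if bitmap[y][x]:
                start = x
                while x < width and bitmap[y][x]:
                    x += 1
                if x - start >= min_run:
                    for i in range(start, x):
                        erase[y][i] = True
            else:
                x += 1

    # Mark long vertical runs, column by column.
    for x in range(width):
        y = 0
        while y < height:
            if bitmap[y][x]:
                start = y
                while y < height and bitmap[y][x]:
                    y += 1
                if y - start >= min_run:
                    for i in range(start, y):
                        erase[i][x] = True
            else:
                y += 1

    # Build the cleaned copy; the input bitmap is left untouched.
    return [
        [0 if erase[y][x] else bitmap[y][x] for x in range(width)]
        for y in range(height)
    ]
```

`min_run` should be chosen larger than the longest stroke in any character, otherwise parts of the text are erased too. Real scans are usually harder than this: lines can be skewed, broken, or several pixels thick, so a production approach would likely need binarization and some tolerance for gaps before handing the cleaned image to the OCR engine.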

charlesw, Oct 12 '19 08:10