Tesseract on .NET Core cannot read text that is inside a box
Hi Charles, I used your .NET code samples in .NET Core to create a searchable PDF. It works fine on an image with plain text, but when the image contains rectangular boxes with the text inside them, Tesseract does not detect the text inside the boxes. What would you recommend?
Thanks in advance for the help.
Once upon a time I did a PR for line removal. It helps with that situation.
@Abhijeet501 The PR tdhintz is referring to is https://github.com/charlesw/tesseract/pull/369 and should help your situation.
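For readers unfamiliar with the idea, here is a rough, language-neutral sketch of what line removal does before OCR. It is not the code from the PR and does not use the wrapper's API. The function name `remove_long_lines`, the `min_run` threshold, and the list-of-lists binary image format are all assumptions made for illustration. It erases any horizontal or vertical run of ink pixels long enough to be a box border and leaves shorter runs (the text strokes) alone.

```python
def remove_long_lines(image, min_run):
    """Erase long straight runs of ink from a binary image.

    image:   list of rows, where 1 is ink (black) and 0 is background.
    min_run: hypothetical threshold; runs at least this long are treated
             as box borders or ruling lines and erased.
    Returns a new image; the input is not modified.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    erase = [[False] * width for _ in range(height)]

    # Mark long horizontal runs.
    for y in range(height):
        x = 0
        while x < width:
            if image[y][x]:
                start = x
                while x < width and image[y][x]:
                    x += 1
                if x - start >= min_run:
                    for i in range(start, x):
                        erase[y][i] = True
            else:
                x += 1

    # Mark long vertical runs.
    for x in range(width):
        y = 0
        while y < height:
            if image[y][x]:
                start = y
                while y < height and image[y][x]:
                    y += 1
                if y - start >= min_run:
                    for i in range(start, y):
                        erase[i][x] = True
            else:
                y += 1

    # Clear every marked pixel, keep the rest unchanged.
    return [
        [0 if erase[y][x] else image[y][x] for x in range(width)]
        for y in range(height)
    ]
```

A plain run-length pass like this also clips any text stroke that touches a border, so a real pipeline would more likely use morphological operations on the scanned image and choose the threshold relative to the expected character size.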
I'm a little hesitant to add it to the wrapper itself as I'd like to keep the wrapper as simple as possible to minimize support requirements due to my own time constraints. This would be a good (3rd party) extension though.