Vladimir Starostenkov

3 comments by Vladimir Starostenkov

@deanmalmgren did you manage to publish the workaround?

@deanmalmgren no worries :) What about textract? It gave me the idea that one can do "pdf -> image -> tesseract -> text", which is a kind of neural...
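
For anyone who wants to try the "pdf -> image -> tesseract -> text" route, here is a rough sketch of how it could look. It is not taken from the comment or from textract; it assumes the `pdf2image` package (which needs poppler installed) for rendering and `pytesseract` (which needs the tesseract binary) for OCR. The function name `pdf_to_text_via_ocr` and the `render` / `ocr` parameters are made up for illustration, so that either step can be swapped out.

```python
from typing import Any, Callable, Iterable, Optional


def pdf_to_text_via_ocr(
    pdf_path: str,
    render: Optional[Callable[[str], Iterable[Any]]] = None,
    ocr: Optional[Callable[[Any], str]] = None,
    dpi: int = 300,
) -> str:
    """Render each PDF page to an image, OCR it, and join the page texts.

    `render` and `ocr` are hypothetical hooks; when omitted, pdf2image and
    pytesseract are used (an assumption, not something the thread confirms).
    """
    if render is None:
        # Imported lazily so the sketch loads even without pdf2image installed.
        from pdf2image import convert_from_path

        def render(path: str) -> Iterable[Any]:
            return convert_from_path(path, dpi=dpi)

    if ocr is None:
        # Imported lazily; requires the tesseract binary on the PATH.
        import pytesseract

        ocr = pytesseract.image_to_string

    # One OCR call per rendered page, joined with newlines.
    return "\n".join(ocr(page) for page in render(pdf_path))
```

Since the text never comes from the PDF's own font encoding, this route sidesteps broken character maps, at the cost of OCR errors and much slower processing.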

@mooncrater31 The short answer would be "yes, for some documents it really is!" In our project we decided to skip documents where cid chars dominate. And skip plenty of...
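
A minimal sketch of how such a skip rule might look, assuming the extracted text contains placeholders of the form `(cid:123)` for glyphs that could not be mapped to characters. The helper names `cid_ratio` and `should_skip` and the 0.5 threshold are illustrative assumptions, not the values used in the project mentioned above.

```python
import re

# Placeholder emitted for a glyph with no usable character mapping.
CID_PATTERN = re.compile(r"\(cid:\d+\)")


def cid_ratio(text: str) -> float:
    """Fraction of non-whitespace glyphs that are unresolved cid placeholders."""
    cid_count = len(CID_PATTERN.findall(text))
    # Strip the placeholders, then count the remaining visible characters.
    remainder = CID_PATTERN.sub("", text)
    normal_count = sum(1 for ch in remainder if not ch.isspace())
    total = cid_count + normal_count
    return cid_count / total if total else 0.0


def should_skip(text: str, threshold: float = 0.5) -> bool:
    """True when cid placeholders make up more than `threshold` of the glyphs.

    The threshold is a hypothetical default; tune it on your own documents.
    """
    return cid_ratio(text) > threshold
```

Each placeholder is counted as one glyph rather than by its string length, so a long run of `(cid:...)` tokens does not inflate the ratio.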