[TODO] `create_features` does not detect rotated images leading to no extractions

Open shabie opened this issue 2 years ago • 6 comments

TODO

I especially like this answer, which uses tesserocr (faster than pytesseract): https://stackoverflow.com/a/69131832/7996306
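
For reference, here is a minimal sketch of the kind of orientation check the issue title asks for. It is not the repository's `create_features` code, and it uses pytesseract's `image_to_osd` rather than the tesserocr approach from the linked answer. The helper names `parse_osd_rotation` and `deskew_by_osd` are hypothetical, and the sketch assumes the `Rotate:` field in Tesseract's OSD output is the clockwise correction in degrees.

```python
import re


def parse_osd_rotation(osd_text: str) -> int:
    """Return the correction angle (0, 90, 180 or 270) from Tesseract OSD text."""
    # The OSD report contains a line such as "Rotate: 90".
    match = re.search(r"Rotate:\s*(\d+)", osd_text)
    if match is None:
        raise ValueError("no 'Rotate:' field found in OSD output")
    return int(match.group(1)) % 360


def deskew_by_osd(image):
    """Rotate a PIL image upright before OCR (hypothetical helper)."""
    # Imported lazily: needs pytesseract plus a Tesseract install with OSD data.
    import pytesseract

    angle = parse_osd_rotation(pytesseract.image_to_osd(image))
    if angle == 0:
        return image
    # Assumption: "Rotate" is a clockwise correction, while PIL's rotate()
    # turns counter-clockwise for positive angles, hence the sign flip.
    return image.rotate(-angle, expand=True)
```

Running something like `deskew_by_osd(Image.open(path))` before the OCR step would be one way to avoid empty extractions on rotated pages. OSD can fail on pages with very little text, so a real integration would likely need a fallback.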

shabie avatar Nov 10 '21 14:11 shabie

Okay, I will try to use that instead of pytesseract. As a side note, did you add the DocFormer implementation to paperswithcode.com? :)

uakarsh avatar Nov 11 '21 13:11 uakarsh

I am trying it; however, I am running into this error: https://githubmemory.com/repo/madmaze/pytesseract/issues/368, and I am unable to fix it right now.

uakarsh avatar Nov 19 '21 08:11 uakarsh

I am trying it; however, I am running into this error: https://githubmemory.com/repo/madmaze/pytesseract/issues/368, and I am unable to fix it right now.

I will try to fix it in the coming days.

uakarsh avatar Nov 19 '21 09:11 uakarsh

I'll try to get to it in the next few days.

shabie avatar Nov 19 '21 09:11 shabie

So, as an update: despite my best attempts, I haven't managed to get the OCR any faster...

shabie avatar Nov 29 '21 22:11 shabie

So, until then, let us keep working with our old, conventional method.

uakarsh avatar Dec 01 '21 12:12 uakarsh