
document scanner rotates the image when it recognizes the document

Open Tavorc opened this issue 10 months ago • 9 comments

When I use the ML Kit document scanner, most of the time (about 95%) the library rotates the image once it recognizes the document. It doesn't matter whether I use automatic or manual mode.

Any idea how to solve this? Has anyone else run into it?

[Screenshot: Screenshot_20240326_095202_Google Play services]

Tavorc avatar Mar 26 '24 07:03 Tavorc

Does the unwanted rotation also happen with documents in LTR scripts? Are all of your docs receipts?

I suspect that ML Kit's document scanner "UI flow" may be loosely tied to the Text Recognition API, which in turn does not support Hebrew or any other RTL script. Even if that turns out to be irrelevant...

As you may have noticed, in real life people who don't know Hebrew often try to read Hebrew documents upside down. I don't know about other RTL scripts, but I guess the fact that almost all Hebrew letters have the same height doesn't help at all.

These are just thoughts; I am not affiliated with Google in any way. I believe it is indeed a bug, since the API seems designed to be text-agnostic, especially in manual mode.

If all of your data are receipts of a similar format, you can probably post-process them at a low level or with Tesseract.

listvin avatar Apr 03 '24 03:04 listvin

Inspired by:

https://github.com/googlesamples/mlkit/issues/784#issuecomment-2004640133

It seems that your Android language is set to English. Can you try switching it to Hebrew?

listvin avatar Apr 03 '24 03:04 listvin

First of all, thank you. Yes, all of the docs are receipts; it's a fintech app. I tried changing the language to Hebrew, but it didn't help.

There is the OpenCV library, which I could use to crop the image, but I didn't want to because ML Kit is more innovative.

Tavorc avatar Apr 03 '24 08:04 Tavorc

Thanks for the feedback.

There is an auto-rotation step in the scanning flow. It is meant to handle the case where holding the phone parallel to the table triggers the phone's and camera's auto-rotation logic, which results in images taken with the wrong orientation. However, apparently the text-based model doesn't work very well in this case.

What do you think would be better behavior for you? Ideally, the model would just handle everything. If that's not possible, would you want an option to turn auto-rotation on or off, or something else you have in mind?

ai-plays avatar Apr 05 '24 21:04 ai-plays

I think you can tell what the device orientation is. For example, the camera shows a "1x" label that represents the zoom; when I rotate the device, the "1x" label rotates too, so you can probably use that.

Tavorc avatar Apr 07 '24 07:04 Tavorc
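
A minimal sketch of the device-orientation idea above, not how ML Kit's scanner actually works. It assumes the app receives an orientation angle in degrees (for example from Android's `OrientationEventListener`, which reports 0 to 359, or -1 when the device is flat and the angle is unknown) and snaps it to the nearest 90° step. The class name `OrientationSnap`, the method `snapToQuadrant`, and the `fallback` parameter are hypothetical names for illustration. The fallback matters because a phone held parallel to the table, the case described earlier in this thread, reports no usable angle.

```java
// Hypothetical helper: snaps a raw device orientation angle to 0, 90, 180, or 270.
final class OrientationSnap {
    // Assumed sentinel for "device is flat, angle unknown"
    // (same value as OrientationEventListener.ORIENTATION_UNKNOWN).
    static final int UNKNOWN = -1;

    /**
     * @param degrees  raw orientation angle in degrees, or UNKNOWN
     * @param fallback value to return when the angle is unknown,
     *                 e.g. the last snapped orientation
     * @return the nearest multiple of 90 in the range [0, 270]
     */
    static int snapToQuadrant(int degrees, int fallback) {
        if (degrees == UNKNOWN) {
            return fallback;
        }
        // Normalize into [0, 359] so negative or large inputs are handled.
        int normalized = ((degrees % 360) + 360) % 360;
        // Adding 45 before the integer division rounds to the nearest quadrant.
        return ((normalized + 45) / 90) % 4 * 90;
    }

    public static void main(String[] args) {
        System.out.println(snapToQuadrant(350, 0));      // prints 0
        System.out.println(snapToQuadrant(100, 0));      // prints 90
        System.out.println(snapToQuadrant(280, 0));      // prints 270
        System.out.println(snapToQuadrant(UNKNOWN, 90)); // prints 90
    }
}
```

Passing the last snapped value as `fallback` keeps the orientation stable while the phone is flat, instead of guessing from the text in the image.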

Any update on this? We're also experiencing the same issue with a Flutter library that uses ML Kit when trying to scan documents in Hebrew:

https://github.com/jachzen/cunning_document_scanner/issues/74

It would be great if it were possible to disable auto-rotation... :)

tamirla avatar Sep 29 '24 11:09 tamirla

Thanks for flagging this issue. We have been working on this and improving our rotation classifier.

@Tavorc @tamirla - is this mostly happening with Hebrew documents, or have you faced the issue with English documents as well?

Would it be possible to share some full image examples (pre-crop, taken from the stock camera) where this issue is clearly reproducible (try the import-from-gallery option)?

mebjas avatar Oct 01 '24 02:10 mebjas

@mebjas Thanks for the update. So far we have seen it only with Hebrew documents and only on Android. See the attached images for an example.

[Attached images: pic1, pic]

tamirla avatar Oct 02 '24 12:10 tamirla