
[Feature]: Feature Request - Use Google Document AI or Vision AI instead of Tesseract

Open epatels opened this issue 1 year ago • 2 comments

Describe the proposed feature

Hi,

I know the Tesseract OCR engine is free. Unfortunately, it is not very good, especially when performing OCR on Indian languages.

This is where Google Document AI and Google Vision AI excel. I understand there is a cost involved in using these services.

I am looking for a solution that performs the underlying OCR step using the Google Document AI or Google Vision AI OCR engine. The rest of the pipeline can remain unmodified, with the output being a searchable PDF (in Indian languages).

epatels avatar Nov 18 '24 12:11 epatels

I have forked and updated kkrell2016’s pre-existing Google Vision OCRmyPDF plugin and am happy to report that it works well. See here: https://github.com/grantbarrett/son-of-ocrmypdf_plugin_GoogleVision. I recognize that using a paid service like Google goes against the spirit of open source, but Google Vision OCR is very, very good, so the compromise is worth it to me.

grantbarrett avatar Apr 24 '25 04:04 grantbarrett

Oh, wow. That's really great. Can't wait to test it ASAP.

Thanks and Regards, Chandrakant Patel Mumbai


epatels avatar Apr 24 '25 05:04 epatels
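
For anyone wanting to try the plugin approach discussed in this thread, here is a minimal sketch of how an external OCR plugin is typically passed to OCRmyPDF through its `--plugin` option. The helper name `build_ocrmypdf_command` and the file names `google_vision_plugin.py`, `scan.pdf`, and `scan_ocr.pdf` are hypothetical placeholders, not taken from the fork; the actual plugin file name, and any Google Cloud credentials it presumably needs, should be checked against that repository's README.

```python
import subprocess


def build_ocrmypdf_command(plugin: str, input_pdf: str, output_pdf: str) -> list[str]:
    """Return the argument list for running OCRmyPDF with an external plugin."""
    # --plugin tells OCRmyPDF to load a plugin module or file,
    # which can replace the default Tesseract OCR engine.
    return ["ocrmypdf", "--plugin", plugin, input_pdf, output_pdf]


if __name__ == "__main__":
    # Hypothetical file names; substitute the real plugin path and PDFs.
    cmd = build_ocrmypdf_command("google_vision_plugin.py", "scan.pdf", "scan_ocr.pdf")
    print(" ".join(cmd))
    # Uncomment to actually run it (requires ocrmypdf and the plugin to be installed):
    # subprocess.run(cmd, check=True)
```

Building the argument list separately from running it makes it easy to inspect the command before any paid API calls are made.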