[Feature]: OCR

Open akoyaxd opened this issue 2 years ago • 1 comments

Feature detail

In addition to object detection, it would be awesome to have images OCR'ed so that the text inside them is searchable and added to the metadata.

Platform

Server

akoyaxd avatar Sep 02 '22 08:09 akoyaxd

Is this something that could be done with a webhook into an ecosystem of ML containers?

I.e., on upload a webhook is triggered; one or more individual ML containers register for it and each do their thing: OCR, face detection, object detection, whatever the individual actually wants or needs.
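A minimal sketch of that idea, to make the flow concrete. Everything here is hypothetical: `UploadEvent`, `WebhookRegistry`, and the handler names are illustrative and not part of Immich, and the handlers run in-process where a real setup would presumably POST the event to each container over HTTP.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class UploadEvent:
    """Hypothetical payload sent when an asset finishes uploading."""
    asset_id: str
    path: str


# A handler stands in for one ML container (OCR, faces, objects, ...).
Handler = Callable[[UploadEvent], dict]


class WebhookRegistry:
    """Keeps track of which handlers want to hear about uploads."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        # Each container registers itself under a unique name.
        self._handlers[name] = handler

    def dispatch(self, event: UploadEvent) -> Dict[str, dict]:
        # Call every registered handler; one failing handler must not
        # prevent the others from running.
        results: Dict[str, dict] = {}
        for name, handler in self._handlers.items():
            try:
                results[name] = handler(event)
            except Exception as exc:
                results[name] = {"error": str(exc)}
        return results
```

Usage would be along the lines of `registry.register("ocr", run_ocr)` followed by `registry.dispatch(UploadEvent("asset-1", "/photos/a.jpg"))`, where `run_ocr` is whatever OCR function the individual chooses to plug in.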

palitu avatar Sep 13 '22 06:09 palitu

This is nice, but it is out of scope for the project.

alextran1502 avatar Dec 23 '22 05:12 alextran1502

I am using PaddleOCR to implement ocr and support retrieval on the app

jasongwq avatar Dec 28 '22 03:12 jasongwq

> I am using PaddleOCR to implement ocr and support retrieval on the app

How?

eagle470 avatar Oct 13 '23 20:10 eagle470

Approaching this from a different angle: the Google Photos Android app saves locally [1] a fairly complete gphotos0.db (GB-large [2] for any sizeable number of assets), a sqlite3 database with a lot of metadata for (all) the Google Photos assets in the account, including, of course, the OCRed strings. If we had an endpoint, or a simple workflow, no matter how hackish, to ingest this into Immich, it would mean a lot for power users coming from Google Photos.

[1] Albeit you'd generally need root to grab it, or an Android emulator with enough on it that you can install Google Photos, log in, let it sync the db, and then open the local disk and access it some way.
[2] This is what shows up as GBs taken by Google Photos even if you have nothing stored locally but many pictures online.
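As a first step towards any such ingest, here is a small sketch for inspecting a copy of the database. It assumes nothing about the gphotos0.db schema (I'm not listing table or column names here, since they aren't documented); it only opens a SQLite file read-only and reports which tables and columns exist. The function name `describe_sqlite_db` is made up for illustration.

```python
import sqlite3
from pathlib import Path
from typing import Dict, List


def describe_sqlite_db(db_path: str) -> Dict[str, List[str]]:
    """Return {table_name: [column names]} for a SQLite database file."""
    # Open read-only via a URI so the copied database is never modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' ORDER BY name"
            )
        ]
        schema: Dict[str, List[str]] = {}
        for table in tables:
            # Quote the identifier in case a table name has odd characters.
            quoted = '"' + table.replace('"', '""') + '"'
            columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            # Column 1 of each PRAGMA table_info row is the column name.
            schema[table] = [col[1] for col in columns]
        return schema
    finally:
        conn.close()
```

Running this against a copy of the file (not the live one on the device) would show where the OCR strings live, which is what a mapping into Immich's metadata would have to start from.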

vb0 avatar May 24 '24 07:05 vb0