Implement new Face Recognition Module

Open LackesLab opened this issue 2 years ago • 8 comments

Dear contributors, I would like to implement a new module that handles face detection and recognition for Immich. My proposal is the first part of the changed README; please give me feedback and let me know if you have any questions.

Best regards, Lukas

LackesLab avatar Sep 08 '22 15:09 LackesLab

This proposal sounds great to me and something I am sure we would all love to see added into Immich!

zackpollard avatar Sep 08 '22 15:09 zackpollard

Immich already has a DB in the stack. Wouldn't you rather want to plug into that one instead of setting up a new one?

bertmelis avatar Sep 08 '22 16:09 bertmelis

As discussed on Discord, I'll first go with a file-based datastore.

LackesLab avatar Sep 08 '22 21:09 LackesLab

I'd gladly see ML-based face recognition in Immich! To find similar face embeddings quickly (to classify a new face), use an approximate nearest neighbor library like hnswlib.
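
To make the idea concrete, here is a minimal sketch of the lookup such a library speeds up. It uses exact brute-force cosine similarity in plain Python rather than hnswlib itself, so it only shows what an approximate index would be approximating. The embeddings, labels, and function names are made up for illustration; real face embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_faces(query, known, k=1):
    """Return the k (label, similarity) pairs most similar to the query."""
    scored = [(label, cosine_similarity(query, emb)) for label, emb in known.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional embeddings, purely illustrative.
known_faces = {
    "person_a": [0.9, 0.1, 0.0],
    "person_b": [0.0, 1.0, 0.1],
    "person_c": [0.1, 0.0, 0.95],
}

best_label, best_score = nearest_faces([0.85, 0.2, 0.05], known_faces, k=1)[0]
print(best_label, round(best_score, 3))
```

The brute-force scan compares the query against every stored embedding, which is fine for a few thousand faces; an approximate index trades a little recall for much faster lookups on large libraries.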

fyfrey avatar Sep 09 '22 07:09 fyfrey

The workflow looks good to me. Thanks for getting started on this @LackesLab.

There's a good chance that the algorithm will occasionally detect the same person as different faces and assign them different IDs. Can we include a workflow for merging two face IDs into one?

The same workflow would probably also cover a photo that is misidentified as a different, pre-existing face ID.
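
As a rough sketch of what I mean, assuming faces are stored as a simple mapping from face ID to person ID (a hypothetical structure, not how Immich stores anything), both operations reduce to reassigning labels:

```python
def merge_people(face_to_person, source_id, target_id):
    """Relabel every face of source_id as target_id; return how many faces moved."""
    moved = 0
    for face_id, person_id in face_to_person.items():
        if person_id == source_id:
            # Overwriting an existing key is safe while iterating.
            face_to_person[face_id] = target_id
            moved += 1
    return moved


def reassign_face(face_to_person, face_id, target_id):
    """Move a single misidentified face to the correct person."""
    face_to_person[face_id] = target_id


# Hypothetical IDs for illustration only.
faces = {
    "face_1": "person_1",
    "face_2": "person_2",  # same human as person_1, detected as someone new
    "face_3": "person_1",
    "face_4": "person_3",
}

moved = merge_people(faces, "person_2", "person_1")
reassign_face(faces, "face_4", "person_1")
print(moved, faces)
```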

mike-lloyd03 avatar Jan 22 '23 03:01 mike-lloyd03

> Can we include a workflow for merging two face IDs into one?

The plan is to apply tags to images listing things like which faces are in them, as well as other information like recognized objects or user-created tags. That functionality is intended to then include support for merging tags as well.

bo0tzz avatar Jan 22 '23 12:01 bo0tzz

Any chance that we could get Edge TPU (e.g. Coral USB accelerator) support for face recognition baked in? I've got a photo lib with something like 60k photos, but the 1U server I host it off of has a fairly anemic EPYC CPU with no GPU installed.

Trying to import my lib into both LibrePhotos and PhotoPrism (both of which have facial detection, but neither of which has TPU support) has melted my server down a few times =D I already have a Coral TPU installed (for Frigate), so I would be happy to test.

eschwim avatar Mar 03 '23 19:03 eschwim

I would love to test it, but I don't have the hardware at the moment.

alextran1502 avatar Mar 04 '23 01:03 alextran1502

I looked into TPU support a little bit. The problem is that Coral TPUs are TensorFlow devices, while we're using PyTorch, so there would have to be a step that converts the models. That would complicate things quite a bit, I think.

bo0tzz avatar Mar 04 '23 09:03 bo0tzz

The Coral TPU is not well suited to powerful machine learning models. 8-bit models are typically outperformed by their higher-precision counterparts.

I tested model inference speed on a MacBook Pro M1 and on an i5-8400.

System    Model       Accuracy (LFW)  Speed
i5-8400   buffalo_sc  99.70%          7.8 img/s
i5-8400   buffalo_l   99.83%          1.83 img/s
M1 Pro    buffalo_sc  99.70%          ~11 img/s
M1 Pro    buffalo_l   99.83%          3.83 img/s

On the i5-8400, buffalo_sc at 7.8 img/s works out to about 28,080 images per hour. That's pretty decent for such a model running on a CPU.
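
Taking the img/s figures from the table at face value, the hourly throughput is a single multiplication. The sketch below just does that conversion for each row; the dictionary layout is only for illustration.

```python
SECONDS_PER_HOUR = 3600

# Images per second, copied from the benchmark table above.
images_per_second = {
    ("i5-8400", "buffalo_sc"): 7.8,
    ("i5-8400", "buffalo_l"): 1.83,
    ("M1 Pro", "buffalo_sc"): 11.0,  # reported as "~11"
    ("M1 Pro", "buffalo_l"): 3.83,
}

# Sustained throughput per hour, assuming the rate stays constant.
images_per_hour = {
    key: round(rate * SECONDS_PER_HOUR) for key, rate in images_per_second.items()
}

for (system, model), per_hour in images_per_hour.items():
    print(f"{system:8} {model:11} {per_hour:>6} img/h")
```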

For comparison: Nextcloud Photos 2.0 uses a facial recognition system that achieves 99.38% accuracy on the LFW dataset. The difference may look small to non-experts, but roughly 0.5 percentage points is huge here.
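
One way to see why the gap matters is to compare error rates instead of accuracies. This small sketch uses only the LFW figures quoted in this comment; by that measure buffalo_sc makes about 2x fewer errors and buffalo_l about 3.6x fewer than the 99.38% model. LFW accuracy is only a proxy, so real-world behaviour may differ.

```python
# LFW accuracy in percent, as quoted in this comment.
lfw_accuracy = {
    "dlib (Nextcloud)": 99.38,
    "buffalo_sc": 99.70,
    "buffalo_l": 99.83,
}

# Error rate is whatever remains up to 100%.
error_rate = {name: 100.0 - acc for name, acc in lfw_accuracy.items()}

# How many times fewer errors each model makes than the dlib baseline.
baseline = error_rate["dlib (Nextcloud)"]
error_reduction = {name: baseline / err for name, err in error_rate.items()}

for name, factor in error_reduction.items():
    print(f"{name:17} error {error_rate[name]:.2f}%  ({factor:.1f}x fewer errors)")
```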

I found an implementation that uses an Edge TPU-compatible neural network. It also only reaches 99.3%, like the dlib model that Nextcloud uses. Reference: https://github.com/zye1996/Mobilefacenet-TF2-coral_tpu

LackesLab avatar Mar 13 '23 20:03 LackesLab

> I would love to test it, but I don't have the hardware at the moment.

@alextran1502 they are in stock now on seeedstudio! I got two today after waiting for several months: https://www.seeedstudio.com/Coral-USB-Accelerator-p-2899.html

jagjordi avatar Mar 31 '23 18:03 jagjordi

I am also available to test Edge TPU-related releases.

jagjordi avatar Mar 31 '23 18:03 jagjordi

Just a heads up, before people start buying Corals for use with Immich: our ML system is currently using PyTorch, and the Coral is only compatible with TensorFlow.

bo0tzz avatar Mar 31 '23 18:03 bo0tzz

Most people in the self-hosted community have a Coral for use with Frigate, so I would say it would be good if the ML system were eventually ported to TensorFlow (the Coral actually only supports TensorFlow Lite).

jagjordi avatar Mar 31 '23 19:03 jagjordi

I too have a Coral for Frigate and would love to use it for more services. But now that I think about it, I'm not sure how big of a difference it will make with Immich. It'll probably make object detection run faster on initial import of a large batch of images. But as I upload 5 or 6 images a day, I doubt the CPU load will be all that significant.

mike-lloyd03 avatar Mar 31 '23 20:03 mike-lloyd03

I agree that normal daily usage is light, but the initial import is a significant part of the user experience of the system. I have around 300k images in my library, and with the current setup I'm getting a tagging throughput of less than 1k images per hour. That means it would take something like 4 months to run the initial tagging. And that's with the CPU at 100% for 4 months! Imagine the power consumption!

In short, I would not dismiss this feature just because "it's only useful for the initial import", since for many users the initial import is a big part of the system.
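
For anyone wanting to estimate their own import, the duration is just library size divided by throughput. In the sketch below, the 100 images per hour rate is a hypothetical value chosen for illustration. At exactly 1k images per hour, 300k images would take about 12.5 days, so an estimate of around 4 months corresponds to a sustained rate closer to 100 images per hour.

```python
def import_days(total_images, images_per_hour):
    """Days of continuous processing needed to tag total_images."""
    return total_images / images_per_hour / 24


LIBRARY_SIZE = 300_000

# Upper bound on the rate mentioned above ("less than 1k images per hour").
days_at_1k = import_days(LIBRARY_SIZE, 1_000)

# Hypothetical slower rate, chosen only to show the sensitivity.
days_at_100 = import_days(LIBRARY_SIZE, 100)

print(f"{days_at_1k:.1f} days at 1000 img/h, {days_at_100:.1f} days at 100 img/h")
```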

jagjordi avatar Mar 31 '23 22:03 jagjordi

I believe SBERT, the technology behind CLIP search, uses PyTorch; that is why we have PyTorch as a dependency.

alextran1502 avatar Mar 31 '23 22:03 alextran1502

Face recognition would be awesome!!! Is this something I can help with by testing it? If so, how can I test :)

normanu avatar Apr 14 '23 12:04 normanu

@LackesLab has finished the initial Python implementation for facial recognition. However, there is still a lot to do with regard to clustering, labeling, merging, and other face/person-related management tasks, which we're planning on building into Immich itself. For example:

  • Faces metadata to be saved in asset.smartInfo
  • Face embeddings to be pushed to Typesense, a vector database, to enable similarity distance queries
  • Auto-link/associate faces to people
  • Track people, with abilities to label, merge, manage, link to a user, etc.
  • Generate and serve optimized face images (ideally from the original images, not the resized thumbnail)
  • Integrate faces into the web/mobile app
  • Search by people/faces
  • New admin job/queue for tracking and regenerating face embeddings
  • etc.
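
To give a feel for the auto-link step in the list above, here is a minimal sketch under stated assumptions: each person is represented by a single reference embedding, distance is Euclidean, and the 0.5 threshold is an arbitrary placeholder. None of these names or values come from the actual implementation, and the real version would query the vector database instead of looping in Python.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def link_face(embedding, people, threshold=0.5):
    """Return the closest person within threshold, or register a new person."""
    best_id, best_dist = None, float("inf")
    for person_id, reference in people.items():
        dist = euclidean(embedding, reference)
        if dist < best_dist:
            best_id, best_dist = person_id, dist
    if best_id is not None and best_dist <= threshold:
        return best_id
    # No existing person is close enough: create one from this face.
    new_id = f"person_{len(people) + 1}"
    people[new_id] = embedding
    return new_id


people = {}  # hypothetical store: person ID -> reference embedding
first = link_face([0.0, 1.0], people)   # nothing known yet, becomes a new person
second = link_face([0.1, 0.9], people)  # close to the first face
third = link_face([1.0, 0.0], people)   # far from everyone known so far
print(first, second, third)
```

A threshold that is too loose merges different people, and one that is too strict splits the same person into several IDs, which is why manual merge and relabel tools are on the list as well.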

To make it easier for the core immich dev team to help contribute to the remaining tasks, we've recreated this PR in the immich repo itself as #2180.

jrasm91 avatar Apr 14 '23 12:04 jrasm91