
Modularization

Open muebau opened this issue 5 years ago • 15 comments

Hi, I planned to build a system/app like this myself, so I am very happy to see this great app.

When I planned my app, it was less coupled to Nextcloud, as I am more into the detection part than into PHP. Nevertheless, it could be great to separate the different parts anyway:

Different Parts for:

  • Detection: If this were a simple part with a well-defined interface (e.g. a REST API), it would be very easy to "connect" it to the app by simply giving it an endpoint URL. The work would be done in this "detection system", which could run on the same system (localhost) or on a more powerful "offloader" system. Using "AI sticks" like the Google Coral would be quite easy, as it would only be an addition to the detection system and would not necessarily interfere with the Nextcloud system.

  • Clustering: This could be a separate system with a REST API too. It could, for example, use lots of RAM on a powerful machine.

  • Different models: Once the app is split into different parts, there is no need to do "just" face detection alone. There could be a wide range of models for object detection or even facial expression detection, etc. The app itself would be very small, as the heavy parts would live in the different detection systems. There would also be no need to explain why the app does not work without installing additional software (CLI tools etc.), as the REST APIs make the need for a separate detection part obvious.
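The detection part above could be sketched, for example, as a tiny HTTP service (Python standard library only; the route, port and response fields below are made up for illustration, not a settled API):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def detect_faces(image_bytes):
    """Placeholder for the real detector (dlib, TF Lite, a Coral TPU, ...).
    Returns faces as bounding boxes with a confidence score."""
    return [{"left": 10, "top": 20, "right": 110, "bottom": 120,
             "confidence": 0.99}]

class DetectionHandler(BaseHTTPRequestHandler):
    """Minimal detection endpoint: the app POSTs an image, the service
    answers with JSON. Route and field names are illustrative only."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        body = json.dumps({"faces": detect_faces(image_bytes)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the sketch quiet
        pass

def serve(port=8108):
    """Run the detection service; the app would only need its URL."""
    HTTPServer(("localhost", port), DetectionHandler).serve_forever()
```

Whether this runs on localhost or on a GPU/TPU "offloader" box, the Nextcloud side would only ever see the endpoint URL.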

I see many advantages in this modularization approach:

  • simple offload to systems with GPU, TPU, compute sticks, more RAM, etc.

  • easier to see and understand the different parts

  • very extensible (eg. support for generic Tensorflow Lite Models for all kind of Attributes)

  • could be done in a batch-processing fashion (e.g. every night, when the big machine wakes up for it)

  • could still be installed on a Raspberry Pi and used via "localhost"

  • could be realized with simple Docker containers, including ready-to-use containers for GPU, TPU, compute sticks, etc.

  • could lead to a generic REST API wrapper for AI tasks that abstracts the underlying models (like Core ML or TF Lite)


muebau avatar Jan 03 '20 10:01 muebau

Hi @muebau It's something I've also been thinking about a lot.

The application was designed with the idea of being able to handle several models, but today we prefer to focus on one. We selected dlib models, and so far we are happy. Another model will be HOG, also using dlib/PDlib. But these were always conceived within the same application.

Supporting external services may be questionable due to the need to send the images to another server, losing the security of our data and all the magic of Nextcloud. Believe me, there are many paranoid people who think like that. haha.. 😅

That said, I personally do not dislike the proposal, as long as it is optional. But it will take a good while. First we must modularize some code and really support several models.

As for the services you propose, at this stage I can accept the detection one.

About clustering: although it (currently chinese whispers clustering) is also under discussion, I do not expect to change it, and although it must be optimized, I prefer it to remain local. Making this part an external service could complicate the application a lot.
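For reference, the chinese whispers algorithm mentioned here (which the app uses via dlib) is simple enough to sketch in pure Python; the threshold and iteration count below are illustrative defaults, not the app's actual settings:

```python
import random
from collections import Counter

def l2(a, b):
    """Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def chinese_whispers(descriptors, threshold=0.6, iterations=20, seed=0):
    """Cluster face descriptors: each face is a graph node, edges connect
    faces whose L2 distance is below the threshold, and each node
    repeatedly adopts the most common label among its neighbors."""
    rng = random.Random(seed)
    n = len(descriptors)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if l2(descriptors[i], descriptors[j]) < threshold:
                neighbors[i].append(j)
                neighbors[j].append(i)
    labels = list(range(n))  # every face starts in its own cluster
    order = list(range(n))
    for _ in range(iterations):
        rng.shuffle(order)
        for i in order:
            if neighbors[i]:
                counts = Counter(labels[j] for j in neighbors[i])
                labels[i] = counts.most_common(1)[0][0]
    return labels
```

The memory pressure comes from the pairwise distance step, which is quadratic in the number of faces; that is also why shipping all descriptors to an external service would mean exchanging a lot of data.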

About different models, I think this would be the biggest advantage, but always focusing on faces. However, I consider the external model to be an individual model. I imagine something like Collabora Online, where the service is configured beyond the Nextcloud instance.

That said, I can accept a new model/service to which we send an image (a PUT method) and which returns a JSON with faces, confidence and descriptor. But I tell you again that we currently have no way of handling it correctly.
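Purely as illustration, such a response could look like the following (the field names are invented for this sketch, and a real descriptor would be a long vector of floats, truncated here):

```json
{
  "faces": [
    {
      "top": 120, "right": 198, "bottom": 284, "left": 34,
      "confidence": 0.993,
      "descriptor": [0.041, -0.118, 0.007]
    }
  ]
}
```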

@stalker314314, feel free to comment.

About object detection, IMHO it should be another application. 😉

matiasdelellis avatar Jan 03 '20 13:01 matiasdelellis

Hi, after some research I found ready-to-use containers with a well-defined API, like https://github.com/robmarkcole/HASS-Deepstack-face I like the approach of this one even more (same API but without the token stuff): https://github.com/robmarkcole/coral-pi-rest-server

If there were an option to use this type of API, there could be a manual explaining how to start local containers (GPU, TPU, CPU, etc.).

It could even be an option to start a container locally.

muebau avatar Jan 14 '20 17:01 muebau

Great @muebau :grimacing:

They are interesting, but remember that for now this has low priority .. :wink:

Thanks again...

matiasdelellis avatar Jan 20 '20 11:01 matiasdelellis

Hi, I just want to add a couple of notes. The FaceRecognition app today has the ability (in theory) to handle multiple models (albeit we have only one today). It was designed from day 1 for that. However, one must understand that we have 3 different and somewhat coupled parts in this "model" (as correctly identified by @muebau in the first post):

  1. Face detection
  2. Face landmarks (getting the face thumbprint)
  3. Face/landmark clustering to persons

In model 1 (the only one), we use dlib to achieve all 3, but with completely different API calls. To create a new model, like the one @muebau proposes, it would have to either provide all 3 parts or somehow plug into the existing "ecosystem". For example, I can see how it would be possible to use Docker and remote API calls to send all landmarks and get back all persons (step 3). However, I fail to understand how https://github.com/robmarkcole/HASS-Deepstack-face or https://github.com/robmarkcole/coral-pi-rest-server could be plugged in. They just do step 1, and we would still have to use dlib for steps 2 and 3. So it is possible, but it is not easier for the user (as the dependencies still need to be installed). So, in a way, we either have to:

  • split the current model framework to support three different models, where the user can pick each step independently (not sure how good this approach is for the end user), or
  • offer a new model number 2, which is not just one of these APIs but the full set of all 3; let's call it "Muebau's container". If the user has that container, he can select model 2 and use it. Ideally, in that case, you would not need dlib at all and would have no hardware requirements.

I guess the latter could be an option for FaceRecognition, but I still think it needs to be a complete package!

Regarding online/remote APIs that calculate the data after you upload your private photos "somewhere", I will not comment much. I don't like that, and the FaceRecognition app has already taken some heat on the forum; face recognition is generally hated, I think for the wrong reasons. But still, who am I to forbid it, especially if it is not the default choice and the user is clearly presented with a warning if he switches to this model.

One final note - before @muebau jumps to work on the "Muebau container" :), I would still like to create the other model, as described in #93, and once we polish model choosing and switching, we will be in a better position to talk about model number 3 :)

stalker314314 avatar Jan 20 '20 21:01 stalker314314

Hi @muebau In the last PR, we did a big modularization of the code, adding 3 models: CNN5Landmarks, CNN68Landmarks and HOG.

An interface is defined that must be implemented by new models. Of course, it is defined according to our needs, but I think it is generic enough to extend to other models.
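The shape of such an interface might look roughly like this (sketched in Python for brevity, although the app itself defines it in PHP; the method names below are hypothetical, not the app's actual interface):

```python
from abc import ABC, abstractmethod

class FaceModel(ABC):
    """Hypothetical sketch of a pluggable face-model interface."""

    @abstractmethod
    def is_installed(self) -> bool:
        """Whether the model's files/dependencies are available."""

    @abstractmethod
    def install(self) -> None:
        """Download or otherwise set up the model."""

    @abstractmethod
    def detect_faces(self, image_path: str) -> list:
        """Return faces found in the image, each with a bounding box,
        a confidence score and a descriptor usable for clustering."""
```

An external-service model would satisfy the same contract as a local dlib model, just by forwarding `detect_faces` over HTTP.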

About the features you requested: clustering is still done within the main application, and that is unlikely to change. As you say, it consumes a lot of memory, but it can be optimized, and in any case it would be a lot of information to exchange with a service, which would be much more inefficient.

Regarding online/remote APIs that calculate the data after you upload your private photos "somewhere", I will not comment much. I don't like that, and the FaceRecognition app has already taken some heat on the forum; face recognition is generally hated, I think for the wrong reasons. But still, who am I to forbid it, especially if it is not the default choice and the user is clearly presented with a warning if he switches to this model.

To clarify, I completely agree with @stalker314314 .. For me, the only reason to accept an external model would be to take advantage of an Nvidia card with CUDA from some friend. On the other hand, when users enable the analysis of their photos, we could add a confirmation with "Terms of Service" for external models.

Clarified this, if someone wants to make this model, welcome :wink:

matiasdelellis avatar Mar 12 '20 13:03 matiasdelellis

Hi @muebau .... Clarified this, if someone wants to make this model, welcome

Hi, I started some time ago with an implementation based on JavaScript (tfjs-core) and another based on the Google Coral.

The JavaScript version is implemented as a Nextcloud app to provide a solution that uses the computing power (JavaScript => Browser => WebWorker => GPU) of the users themselves. This makes sure the privacy concerns are met if needed.

Unfortunately due to private circumstances there was little progress in the last weeks and there will be little progress in the coming weeks.

muebau avatar Mar 14 '20 13:03 muebau

I started some time ago with an implementation based on JavaScript (tfjs-core) and another based on the Google Coral.

Wow. Although I understand that this would be another application, I would love to see the progress .... :smiley:

Unfortunately due to private circumstances there was little progress in the last weeks and there will be little progress in the coming weeks.

Of course, there are always priorities, and I hope everything turns out well. :wink:

Regards

matiasdelellis avatar Mar 14 '20 14:03 matiasdelellis

https://github.com/matiasdelellis/facerecognition-external-model :see_no_evil: :shushing_face:

matiasdelellis avatar Nov 22 '20 19:11 matiasdelellis

Well, after almost a year, it took only a couple of hours to build an external model.. hahaha :sweat_smile:

But that is thanks to a lot of previous background work inspired by this report. :wink:

It is not what you wanted, but it is the best I can offer for now...

matiasdelellis avatar Nov 23 '20 00:11 matiasdelellis

https://github.com/matiasdelellis/facerecognition-external-model :see_no_evil: :shushing_face:

whoooohooo there it is.

Does it do its work asynchronously? I think it sends a request and gets the response directly, right? I hope to find the time to test it, especially with the Coral USB Accelerator. First I have to find some time to set it up anyway. 😂

muebau avatar Nov 28 '20 10:11 muebau

Hi @muebau

Does it do its work asynchronously? I think it sends a request and gets the response directly, right?

I'm not sure I understand you... It is a POST request, and the response is all the data of the obtained faces... So, it is synchronous..

The only important requirement is that the model must return an encoding comparable with the L2 distance since, as I said at the time, clustering will continue to be local.
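In other words, two descriptors end up in the same cluster when their L2 (Euclidean) distance is below some threshold. A minimal sketch (the 0.6 below is a commonly cited illustrative value for dlib-style 128-float descriptors, not a fixed requirement of the app):

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two face descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def same_person(desc_a, desc_b, threshold=0.6):
    """Descriptors closer than the threshold are treated as the same face."""
    return l2_distance(desc_a, desc_b) < threshold
```

Any external model whose descriptors behave this way under the L2 metric can feed the local clustering unchanged.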

I hope to find the time to test it, especially with the Coral USB Accelerator. First I have to find some time to set it up anyway. 😂

You will have to adapt the reference model to use TensorFlow. Some ideas... :wink:

  • https://www.pyimagesearch.com/2019/04/22/getting-started-with-google-corals-tpu-usb-accelerator/
  • https://github.com/goruck/edge-tpu-servers

I would love to see your tests. :grimacing:

matiasdelellis avatar Nov 28 '20 15:11 matiasdelellis

+1 to modularization along the lines of Collabora, but for a different reason: I'm relying on a Nextcloud instance where I can't install additional dependencies. Using your face recognition tool would be awesome for us, so if you could find some kind of setup where the actual face recognition code doesn't have to run on the same machine as the Nextcloud instance, that'd be awesome.

burnoutberni avatar Dec 21 '20 18:12 burnoutberni

Hi @burnoutberni Excuse me, but it is highly unlikely that there will be a 100% external version.

I'm relying on a Nextcloud instance, where I can't install additional dependencies

Just out of curiosity, what do you use? Honestly, if you want to keep adding more features, you can never depend on such a closed service.

matiasdelellis avatar Dec 22 '20 22:12 matiasdelellis

Hi Matias, thanks for your response! Of course that's sad for me, but I completely understand, so fair enough.

Just out of curiosity .. what do you use?

This. It's reliable and really cheap and it's been in use since before I joined the organisation where I'm now kinda responsible for it.

Honestly, if you want to continue adding more features, you can never depend on such a closed service.

Yeah, I understand it's a huge trade-off, but we don't actually want to "continue adding more features", but rather have a reliable shared file storage. Everything else is just a cherry on top. I was excited to see your project, since I've been thinking about pitching the idea to try some software (specifically Photoprism) in order to better organize our photo database and being able to detect and tag faces is really the quintessential feature for us (plus Nextcloud integration would also be cool). OTOH I don't think we're gonna migrate our Nextcloud just for this, so sadly this is a deal breaker for us. I'll continue following your project though, maybe things do change for us at some point. Best of luck!

burnoutberni avatar Dec 23 '20 17:12 burnoutberni

Hi all,

I'm not a programmer or an IT specialist, so first of all, sorry if my suggestion is out of context. Looking at modularization, may I suggest CompreFace? (https://github.com/exadel-inc/CompreFace). It's open and free. I use it alongside Deepstack (from robmarkcole) in my Frigate integration for person and face recognition on my local CCTV system.

jokerigno avatar Jun 23 '21 08:06 jokerigno

Well, I think that today it is sufficiently modularized... I keep inviting people to create new models, using the external model as a reference; based on this, I can add new comparison distances and even clustering algorithms.

Thanks for everything!

matiasdelellis avatar Aug 24 '23 00:08 matiasdelellis