
Search for objects - persons

Open Schdefoon opened this issue 2 years ago • 8 comments

I could not find in the documentation why search for persons is deactivated (see the app screenshot in README.md). Is it not implemented? Or are there things the user has to prepare to activate it?

Schdefoon avatar Dec 10 '22 11:12 Schdefoon

The best pre-trained TensorFlow model I can currently find still has a false negative rate that is too high to be used in production.

scubajeff avatar Dec 10 '22 11:12 scubajeff

I use the recognize app from Nextcloud, which is featured by Nextcloud now. I guess the recognize app does not store the data in the pictures itself, but in a database. But do you know if it would be possible to access this database to get the data for persons and objects and so on?

The advantage would be to let the server do the heavy work, which might have a good GPU, I guess. Moreover, this enables interoperability since all data is located within Nextcloud. Or do you already store data back in Nextcloud?

frederikb96 avatar Mar 13 '23 21:03 frederikb96

Les Pas uses a pre-trained TensorFlow model for object detection (and for face recognition too, I just don't feel confident releasing that feature), and I believe those Nextcloud apps use pretty much the same thing. These models tend to be conservative when drawing inferences, i.e., they all have a high false negative rate, which results in the tedious work of managing the inference results by hand. Even worse, this can't be fixed by re-training, because the models are closed. In self-hosted cases we deal with a small set of objects and a small set of people, so this continual manual work makes AI search seem particularly dumb.

In short, I don't think the current closed pre-trained models can ever be ready for prime time. So those features in Les Pas are best described as nice-to-have, and that's why it doesn't bother saving the results on the server; they are only cached on the phone for the sake of speed.

I haven't spent time with Nextcloud's Recognize app; maybe they save the results in a DB, or maybe in tags. Anyway, no API is provided yet to tap into it.

scubajeff avatar Mar 14 '23 00:03 scubajeff

Ok, thank you for the clarification, makes sense.

frederikb96 avatar Mar 14 '23 20:03 frederikb96

Hello!

Recognize can be tapped into just fine: Memories (the web app) integrates with it in NC 26. You can browse people, places, landmarks, tags, etc. For people, you can create people, merge them, split them, move wrong tags in and out of a person, and so on. Sure, there's still lots of work to do, but even the current results are awesome. Being able to simply sync that data as an album to the phone app would be great.

luxzg avatar May 18 '23 05:05 luxzg

Not sure if Memories saves the inference results in its own DB or exposes them as tags. I might need to check which model Memories is using too.

scubajeff avatar May 18 '23 08:05 scubajeff

I'm not a dev, but from my short experience with Nextcloud + Recognize + Memories, here's what I know.

Recognize uses a system similar to the one you've described. Results are stored in two separate database tables. One contains the file ID, a reference to a person, and the position inside the image file. The other contains people, which are basically groups of the most probable hits (or clusters, as they call them).
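To make the two-table layout concrete, here is a minimal sketch in Python. It only mirrors the description above; the class names, field names, and the `files_by_person` helper are all hypothetical and are not Recognize's actual schema or code.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cluster:
    """One 'person': a group of detections judged to be the same face (hypothetical model)."""
    cluster_id: int
    name: Optional[str]  # None until the user gives the person a name


@dataclass(frozen=True)
class Detection:
    """One face found in one file, with its position inside the image (hypothetical model)."""
    file_id: int
    cluster_id: Optional[int]  # None while the face is not assigned to any person
    x: float
    y: float
    width: float
    height: float


def files_by_person(clusters, detections):
    """Map each person's label to the sorted IDs of the files that person appears in."""
    labels = {c.cluster_id: c.name or f"Unnamed #{c.cluster_id}" for c in clusters}
    grouped = defaultdict(set)
    for d in detections:
        # Skip faces that are unassigned or point at an unknown cluster.
        if d.cluster_id in labels:
            grouped[labels[d.cluster_id]].add(d.file_id)
    return {label: sorted(ids) for label, ids in grouped.items()}
```

A client such as a phone gallery would only need the grouped result (person to file IDs) to show each person as an album; the positions are there in case a face crop is wanted as the cover.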

GitHub page: https://github.com/nextcloud/recognize

Memories is a photo gallery, so in a way Memories is in the web interface what your app is in Android app form ( https://github.com/pulsejet/memories/ ). Memories pulls data not from the database but via Recognize's APIs. Whatever you do in Memories is forwarded in the background to Recognize to execute. This includes moving an image to or from a group (person), creating a new person, renaming a person (or adding a name), and so on.

You can use the included WebDAV endpoint, which is mentioned here: https://github.com/nextcloud/recognize/wiki/Behind-the-scenes
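As a rough idea of what talking to a WebDAV endpoint involves, here is a minimal Python sketch of a standard `PROPFIND` listing. The exact Recognize path is not spelled out in this thread, so the caller passes the URL in; `build_propfind` and `parse_multistatus` are hypothetical helper names, and authentication is left out entirely.

```python
import urllib.request
import xml.etree.ElementTree as ET

DAV_NS = {"d": "DAV:"}

# Ask the server only for each entry's display name.
PROPFIND_BODY = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<d:propfind xmlns:d="DAV:"><d:prop><d:displayname/></d:prop></d:propfind>'
)


def build_propfind(url, depth="1"):
    """Build (but do not send) a PROPFIND request for the given collection URL."""
    return urllib.request.Request(
        url,
        data=PROPFIND_BODY.encode("utf-8"),
        method="PROPFIND",
        headers={"Depth": depth, "Content-Type": "application/xml; charset=utf-8"},
    )


def parse_multistatus(xml_text):
    """Return (href, displayname) pairs from a WebDAV multistatus response body."""
    root = ET.fromstring(xml_text)
    entries = []
    for response in root.findall("d:response", DAV_NS):
        href = response.findtext("d:href", default="", namespaces=DAV_NS)
        name = response.findtext(
            "d:propstat/d:prop/d:displayname", default=None, namespaces=DAV_NS
        )
        entries.append((href, name))
    return entries
```

Sending the request would be `urllib.request.urlopen(build_propfind(url))` with credentials added; which collections and properties Recognize actually exposes has to be taken from the wiki page linked above.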

There's a pretty good explanation of how the system works here: https://help.nextcloud.com/t/ai-and-photos-2-0-in-depth-explanation-of-nextcloud-recognize-and-how-it-works/146767

Unfortunately, I did not see a list of endpoints anywhere. But I did a quick search in the Memories repo, and I can see plenty of Recognize paths in the code there ( https://github.com/search?q=repo%3Apulsejet%2Fmemories%20%20recognize&type=code )

Hopefully that makes it easier to get started with this topic :)

luxzg avatar May 18 '23 17:05 luxzg

The Recognize WebDAV endpoint looks very interesting. I love the idea of apps cooperating. Will look into it ASAP.

scubajeff avatar May 19 '23 00:05 scubajeff