
Move image metadata extraction to own app

Open tacruc opened this issue 3 years ago • 6 comments

Many of our issues are due to the extraction of image metadata. Therefore I propose making this part more modular, as it is independent from the main purpose and focus of the maps app. Additionally, an independent app makes it easier to change the implementation used to gather the metadata.

What I would like to see is an app image_metadata which:

  • uses the preview generation process to trigger the extraction process
  • saves the metadata in one big table, or in a table layout inspired by the metadata handling of e.g. digikam
  • provides a (DAV) endpoint to query/search for certain images, by location or date
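
To make the storage and query idea concrete, here is a minimal sketch in Python (not Nextcloud PHP code) of the "one big table" variant with a search by location and date. The table name image_metadata comes from the proposal above; the columns, the indexes and the helper names create_store, save_metadata and find_images are assumptions made for illustration only. A real app would expose the query through a (DAV) endpoint rather than a function call.

```python
import sqlite3


def create_store():
    """Create an in-memory database holding one metadata row per image."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE image_metadata (
               file_id  INTEGER PRIMARY KEY,
               lat      REAL,   -- NULL if the image has no geodata
               lon      REAL,
               taken_at TEXT    -- ISO 8601 timestamp, sorts lexicographically
           )"""
    )
    # Indexes so that location and date searches do not scan the whole table.
    conn.execute("CREATE INDEX idx_meta_geo ON image_metadata (lat, lon)")
    conn.execute("CREATE INDEX idx_meta_date ON image_metadata (taken_at)")
    return conn


def save_metadata(conn, file_id, lat, lon, taken_at):
    """Insert or overwrite the metadata row of one file."""
    conn.execute(
        "INSERT OR REPLACE INTO image_metadata VALUES (?, ?, ?, ?)",
        (file_id, lat, lon, taken_at),
    )


def find_images(conn, south, west, north, east, start=None, end=None):
    """Return file ids inside a bounding box, optionally limited by date."""
    sql = (
        "SELECT file_id FROM image_metadata "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?"
    )
    params = [south, north, west, east]
    if start is not None:
        sql += " AND taken_at >= ?"
        params.append(start)
    if end is not None:
        sql += " AND taken_at < ?"
        params.append(end)
    sql += " ORDER BY file_id"
    return [row[0] for row in conn.execute(sql, params)]
```

Images without geodata are stored with NULL coordinates and therefore never match a bounding-box search, while still being findable by date if a date-only query were added.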

Most of the required methods should already exist in our current extraction process or in the https://github.com/gino0631/nextcloud-metadata app.

Related maps Issues:

  • https://github.com/nextcloud/maps/issues/550
  • https://github.com/nextcloud/maps/issues/478
  • https://github.com/nextcloud/maps/issues/468
  • https://github.com/nextcloud/maps/issues/449
  • https://github.com/nextcloud/maps/issues/447
  • https://github.com/nextcloud/maps/issues/443
  • https://github.com/nextcloud/maps/issues/274
  • https://github.com/nextcloud/maps/issues/270
  • https://github.com/nextcloud/maps/issues/231
  • https://github.com/nextcloud/maps/issues/225
  • https://github.com/nextcloud/maps/issues/206
  • https://github.com/nextcloud/maps/issues/184
  • https://github.com/nextcloud/maps/issues/161
  • https://github.com/nextcloud/maps/issues/143
  • https://github.com/nextcloud/maps/issues/123
  • https://github.com/nextcloud/maps/issues/527
  • https://github.com/nextcloud/maps/issues/525
  • https://github.com/nextcloud/maps/issues/516
  • https://github.com/nextcloud/maps/issues/547

External Issues, where this app might be helpful:

  • https://github.com/nextcloud/server/issues/20839
  • https://github.com/nextcloud/android/issues/7245
  • https://github.com/nextcloud/photos/issues/87
  • https://github.com/nextcloud/photos/issues/445

Any PR is welcome, as I have decided to focus first on the outstanding PRs for the core features of maps.

tacruc avatar Mar 04 '21 13:03 tacruc

It seems like there already was an attempt for this: https://help.nextcloud.com/t/mediametadata-app-to-extract-and-store-meta-data-from-media-files/1601

Also, there is this app: https://github.com/gino0631/nextcloud-metadata Maybe we could ask the maintainer whether he is interested in opening up his efforts to other apps? See: https://github.com/gino0631/nextcloud-metadata/issues/69

UweKrause avatar Mar 09 '21 22:03 UweKrause

I have already had a look at nextcloud-metadata, and most of its methods for extracting metadata are quite similar to ours. The problem is not really extracting the metadata of a single file. This can be done on request, and that is what, to my knowledge, nextcloud-metadata is doing.

But this is not sufficient to query for all pictures that have geodata. For that, the metadata has to be extracted before the request is made and stored in some sorted way. As the metadata has to be extracted before the request starts, it has to be done in some kind of background job. But this is the point where the questions and problems start:

  • One background job for all pictures, or one background job per picture

    • One background job for all pictures: Memory leaks in the libraries used accumulate, and at some point the process crashes. Additionally, the libraries tend to crash on broken images. All the remaining pictures are then not scanned and are missing on the map.
    • One background job per picture: On the initial scan millions of background jobs might be created, so that other background jobs fall far behind in the queue and are only executed after a long time. This might get worse if cron is not configured correctly.
    • Additionally, crashing background jobs delay processing by the interval at which cron is executed.
  • Which process schedules and creates the background jobs

    • For now we have two ways to create background jobs
    • File change events
      • Unfortunately these events so far seem to miss file share events and changes in groupfolders and external storages, and maybe even more; I have lost track of everything that is not working.
    • maps:scan-photos
      • Creates one background job for each picture, which again leads to problems if users run this command in a cron job or execute it regularly. Scanning multiple terabytes of pictures will take a while. During this time no other background job is executed.

As far as I remember, any issue with pictures not shown on the map is related to one of the problems above. To summarize: for the maps app the hard part is not the extraction of the metadata itself, but executing the extraction once and only once for all pictures in a reliable way (once, that is, as long as the picture is not changed).
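
The "once and only once" requirement can be sketched roughly as follows, in Python rather than Nextcloud PHP. This is only an illustration under assumptions: scan_batch, the state dictionary (path to last processed modification time) and the extract callback are hypothetical names, not existing maps code. The sketch bounds the work per run and keeps one broken image from blocking the rest; it cannot catch a hard crash or a memory leak inside a native library, which would need a separate worker process.

```python
def scan_batch(files, state, extract, limit=100):
    """Process at most `limit` new or changed files in one run.

    files:   iterable of (path, mtime) pairs
    state:   dict mapping path -> mtime of the last processed version
    extract: callable doing the actual metadata extraction for one path
    Returns (number of files handled, list of paths whose extraction failed).
    """
    handled = 0
    failed = []
    for path, mtime in files:
        if handled >= limit:
            break  # leave the rest for the next run
        if state.get(path) == mtime:
            continue  # unchanged since it was last processed: skip
        try:
            extract(path)
        except Exception:
            # A broken image must not stop the scan of the remaining files.
            failed.append(path)
        # Record the file even on failure so it is not retried until it changes.
        state[path] = mtime
        handled += 1
    return handled, failed
```

Calling scan_batch repeatedly with the same state eventually covers all files, and a file is only looked at again once its modification time differs from the recorded one.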

tacruc avatar Mar 10 '21 09:03 tacruc

The preview generation app has a background job that generates previews incrementally. It somehow keeps track of new/changed files since the last execution. Maybe the background job for pictures/tracks could do something similar?

Galbar avatar Apr 06 '21 10:04 Galbar

One idea of mine was to just use the preview generation process and create a preview provider that extracts the metadata. It's a little hacky, but it might be worth investigating in this direction.

tacruc avatar Apr 06 '21 10:04 tacruc

How about improving the existing command? E.g. Face Recognition has a timeout option, occ face:background_job [-t|--timeout TIMEOUT], that is usually set lower than the cron job execution period. This makes it easy for the admin to control the background job behavior.
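
A minimal sketch of such a time-budgeted run, written in Python for illustration (run_with_timeout, handle and the injectable clock are hypothetical names, not part of maps or Face Recognition): the job works through a queue until either the queue is empty or the timeout has elapsed, and leaves the remainder for the next cron run.

```python
import time
from collections import deque


def run_with_timeout(queue, handle, timeout, clock=time.monotonic):
    """Handle queued items until the queue is empty or `timeout` seconds passed.

    The deadline is only checked between items, so a single slow item can
    overrun the budget. Returns the number of items handled.
    """
    deadline = clock() + timeout
    done = 0
    while queue and clock() < deadline:
        handle(queue.popleft())
        done += 1
    return done


if __name__ == "__main__":
    pending = deque(["a.jpg", "b.jpg", "c.jpg"])
    # With a generous budget all three items are handled in one run.
    print(run_with_timeout(pending, print, timeout=5.0))
```

Setting the timeout below the cron interval means a run always finishes before the next one starts, so other background jobs still get their turn.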

GAS85 avatar Oct 01 '21 22:10 GAS85

More related maps Issues: #645 and #655

killi199 avatar Jan 03 '22 00:01 killi199

Apparently some kind of metadata feature is currently being added to the DAV API. Therefore this can be closed.

tacruc avatar Mar 03 '23 23:03 tacruc

That sounds great! Any issues/PRs you can point us to so that we can follow?

pklampros avatar Mar 04 '23 01:03 pklampros

Yes, for GPS extraction: https://github.com/nextcloud/server/pull/33511 Documentation: https://github.com/nextcloud/documentation/issues/9659

Improvement ideas and current limitations: https://github.com/nextcloud/server/issues/36809

But there is even more; just search for "metadata" in the server pull requests.

tacruc avatar Mar 04 '23 06:03 tacruc