Store the original sha256 into EXIF when importing
I'm greatly enjoying all the features that elodie has to offer, especially the deduplication functionality. However, I realise that the only place the original sha256 is kept is in the hash.json file. This becomes especially complicated because elodie (by design) updates the EXIF information of images as they're imported, thus changing the sha256. If hash.json is lost or corrupted, I have no way of rebuilding it. This means that any duplicate photos I try to import in the future will actually be imported.
Would it make sense, during the import process, to store the original sha256 in addition to the original file name?
This would permit hash.json to be rebuilt by scanning all the imported files. Plus, it keeps all the metadata contained within the image itself (which is one of the greatest features of elodie).
Thank you for creating such a great application.
Hi @rkr-kununu, you can regenerate the hash.json by running the following command: elodie.py generate-db --source=/path/to/your/library
This is viewable in the Readme as well. https://github.com/jmathai/elodie#regenerate-checksum-database
Thanks for the response, but I think you're missing my point.
Since Elodie modifies the EXIF information, a new sha256 will be generated when generate-db is run. This means that if you import an image, add a geolocation (for example), lose your hash.json, rerun generate-db, and then try to import a duplicate copy of the original photo, both images will end up imported: the one with the geolocation and the original one.
By embedding the original sha256 in the EXIF data, we can then make any modifications to that image and still reject any unmodified duplicate images that happen to be imported.
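To make the proposal concrete, here is a minimal sketch of the idea, not elodie's actual code. The `library` dict stands in for files on disk, the `original_sha256` key stands in for a metadata tag written into the file, and `import_file` and `rebuild_hash_db` are hypothetical names used only for illustration.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Checksum of the file exactly as it arrives, before any metadata edits."""
    return hashlib.sha256(data).hexdigest()


def import_file(library: dict, hash_db: dict, name: str, data: bytes) -> bool:
    """Import a file unless its original checksum is already known.

    `library` and `hash_db` are in-memory stand-ins (assumptions) for the
    photo library on disk and for hash.json.
    """
    original = sha256_of(data)
    if original in hash_db:
        return False  # duplicate of something already imported
    # Hypothetical: keep the original checksum next to the file contents,
    # the way a tag embedded in the file's metadata would travel with it.
    library[name] = {"data": data, "original_sha256": original}
    hash_db[original] = name
    return True


def rebuild_hash_db(library: dict) -> dict:
    """Rebuild the checksum database from the embedded original checksums,
    not from the current (possibly modified) file contents."""
    return {entry["original_sha256"]: name for name, entry in library.items()}
```

With this layout, a file whose bytes change after import (for example because a geolocation was added) is still recognised when the unmodified original is imported again, even after the database has been rebuilt from scratch.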
Thanks for clarifying. That makes sense. We are doing something similar with the original file name.
In order to skip importing the photo again we'd also need to store the original hash in one of the databases (hash.json or similar). I'd like to capture all of the use cases and incorporate them back into the original description of the issue.
Reopening.
Hi, I remember we had some discussion around this when I was doing the fixes for https://github.com/jmathai/elodie/pull/272
There are some comments in there:
https://github.com/jmathai/elodie/pull/272#discussion_r183207608
I did start to go down the road of storing the original hash in that pull request, but we backed it out as we decided to keep things as they are for that particular bug fix.
I am just starting to evaluate this interesting piece of software. My thought on this SHA issue: how about checking the SHA of the image data itself, without the EXIF data?
That’s an idea which could possibly be added as a plugin. #318 is nearly complete and should support this type of use case.
The library adds the original file name to EXIF. For the same reasons I wonder if there are benefits of storing the original checksum in EXIF. I will have to think about that more.
Either way utilizing a hash of the image data without EXIF is an interesting idea.
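As a rough illustration of hashing the image data without EXIF, here is a sketch for JPEG files only, using just the standard library. It is not part of elodie, and `sha256_without_exif` is a hypothetical name. It assumes a well-formed JPEG in which metadata edits only touch APP1 segments (where EXIF and XMP live); a tool that rewrites other segments would still change the result.

```python
import hashlib
import struct


def sha256_without_exif(data: bytes) -> str:
    """Hash a JPEG byte stream while skipping APP1 (EXIF/XMP) segments.

    Sketch only: assumes every marker before the scan data carries a
    two-byte big-endian length, which holds for typical camera JPEGs.
    """
    if data[:2] != b"\xff\xd8":  # SOI marker
        raise ValueError("not a JPEG stream")
    digest = hashlib.sha256()
    digest.update(data[:2])
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("malformed segment marker")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: everything from here on is image data.
            digest.update(data[pos:])
            return digest.hexdigest()
        # The length field counts itself but not the two marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if marker != 0xE1:  # keep everything except APP1
            digest.update(data[pos:end])
        pos = end
    raise ValueError("no start-of-scan marker found")
```

Two copies of a photo that differ only in their APP1 contents produce the same digest here, while a plain sha256 of the whole file differs. Other formats (PNG, HEIC, video) would need their own handling.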
On Sun, Jun 30, 2019 at 1:26 AM Andrew Tolmie [email protected] wrote:
I am just starting to evaluate this interesting piece of software. My thoughts on this SHA issue was; how about checking the SHA of the image data itself without the EXIF data?