apps-android-commons
[Bug]: Two checksums for the same image - upload via upload wizard and commons app
Summary
The same image can be uploaded twice to Commons, once using the Upload Wizard and once using the commons app (duplicate). Presumably, different checksums are created depending on the upload tool.
Steps to reproduce
- Upload via Upload Wizard
- Upload the same image (under a different file name) with the Commons app
Expected behaviour
The Upload Wizard or the Commons app informs me that there is a duplicate
Actual behaviour
See:
Category:Two checksums for the same image - upload via upload wizard and commons app.screenshots
https://commons.wikimedia.org/wiki/Category:Two_checksums_for_the_same_image_-_upload_via_upload_wizard_and_commons_app.screenshots
Device name
Google Pixel 7 Pro
Android version
Android 14
Commons app version
5.0.2~05ffd123e
Device logs
No response
Screen-shots
See:
Category:Two checksums for the same image - upload via upload wizard and commons app.screenshots
https://commons.wikimedia.org/wiki/Category:Two_checksums_for_the_same_image_-_upload_via_upload_wizard_and_commons_app.screenshots
Would you like to work on the issue?
None
See also: https://github.com/commons-app/apps-android-commons/issues/5798
See also: https://commons.wikimedia.org/wiki/Commons:Village_pump/Proposals/Archive/2022/07#Duplikat-Erkennung:_Commons_Hochlade-Assistent_und_Commons-App
Anyone willing to try and fix this bug?
Ideally, the Commons app should check whether either of these two checksums gets a match using the Wikimedia server's search API:
- The checksum of the image that the user is about to upload.
- The checksum of the image that the user is about to upload after applying the user's configured EXIF anonymizations.
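A minimal sketch of what such a lookup could look like, not the app's actual code. It assumes the checksum in question is SHA-1 and that the lookup goes through the MediaWiki `list=allimages` query with its `aisha1` parameter; the class name `DuplicateLookup` and its methods are hypothetical and only build the request URL, without sending it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only.
final class DuplicateLookup {
    // Assumption: the public Commons MediaWiki API endpoint.
    static final String API = "https://commons.wikimedia.org/w/api.php";

    // Streams the content through SHA-1 and returns the lowercase hex digest.
    static String sha1Hex(InputStream in) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a query URL asking for files whose SHA-1 equals the given digest.
    static String queryUrl(String sha1Hex) {
        return API + "?action=query&format=json&list=allimages&aisha1="
                + URLEncoder.encode(sha1Hex, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in bytes; a real caller would open the image file instead.
        byte[] bytes = "abc".getBytes(StandardCharsets.UTF_8);
        String sha = sha1Hex(new ByteArrayInputStream(bytes));
        System.out.println(queryUrl(sha));
    }
}
```

The same helper would be run once on the untouched file and once on the file after EXIF anonymization, giving the two checksums listed above.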
I want to try :hand:
Thanks Parneet! First, would you mind posting here a link to the code that checks whether an image is already on Commons or not? :-)
https://github.com/commons-app/apps-android-commons/blob/64fd10d00e8ba5bd220db32740247c6b00e3cd8e/app/src/main/java/fr/free/nrw/commons/media/MediaClient.kt#L51
The function above is called from some logic in UploadPresenter.
PR #5570 removed a piece of code that I suspect might be the cause, but I can't say for sure.
Great find!
Now you will have to call that function a second time, with the SHA of the original file (before EXIF modification).
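As a rough illustration of the "call it a second time" idea, the sketch below checks both the original bytes and the EXIF-processed bytes against an existence lookup. Everything here is hypothetical: `TwoPassDuplicateCheck`, `isDuplicate`, and the `existsOnCommons` predicate stand in for whatever the app really uses, and the in-memory set replaces the server call so the example runs on its own.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch, not the app's UploadPresenter logic.
final class TwoPassDuplicateCheck {
    // Returns the lowercase hex SHA-1 digest of the given bytes.
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if either the original file or the processed file is already known.
    // existsOnCommons is an assumed stand-in for the server-side SHA lookup.
    static boolean isDuplicate(byte[] original, byte[] processed,
                               Predicate<String> existsOnCommons) {
        return existsOnCommons.test(sha1Hex(original))
                || existsOnCommons.test(sha1Hex(processed));
    }

    public static void main(String[] args) {
        // Stand-in data: the "processed" bytes differ because metadata was stripped.
        byte[] original = "pixels+exif".getBytes(StandardCharsets.UTF_8);
        byte[] processed = "pixels".getBytes(StandardCharsets.UTF_8);

        // Pretend the original file was uploaded unmodified via the website.
        Set<String> uploaded = Set.of(sha1Hex(original));

        // Checking only the processed file misses the duplicate.
        System.out.println(uploaded.contains(sha1Hex(processed))); // prints false
        // Checking both hashes catches it.
        System.out.println(isDuplicate(original, processed, uploaded::contains)); // prints true
    }
}
```

Passing the lookup in as a predicate keeps the sketch independent of how the app actually queries the server.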
Just to confirm: we want the checksum that the app generates to match the one from the web Upload Wizard, so that the app can properly warn the user about a duplicate? And this has nothing to do with blocking duplicate uploads? (By the way, is uploading duplicates allowed?)
The website does not modify files, so the checksum of a file uploaded via the website is the same as the original file's checksum.
The API does not prevent uploading duplicates, but our app should do whatever it can to prevent this from happening.
Thanks! I found the issue. When checking for duplicates before upload, the checksum that the app generates differs from the one the web generates. When the actual upload happens, the checksum is generated again, and this time it matches the web one. The difference is that we pass two different android.net.Uri instances in these two cases. I'm going to find out why and make the changes.
> Now you will have to call that function a second time, with the SHA of the original file (before EXIF modification).
And where should the first call to this function happen?