django-s3file icon indicating copy to clipboard operation
django-s3file copied to clipboard

Files with identical base names cause storage file collision

Open tunecrew opened this issue 2 years ago • 5 comments

Uncovered an interesting behaviour/bug? when uploading multiple files that have the same filename. Behaves as follows:

  • From a file upload form with multiple set to True select 2 or more files with the same exact file name (this is possible on MacOs at least if the files are in different folders, but the file chooser modal is set to a parent folder of these folders - see screenshot for example).
  • Only one of the files is actually uploaded to the tmp folder.
  • If you select n different files, then n copies of the single uploaded file are copied from the tmp folder to their final destination, so instead of n different files with the same name, you end up with n copies of just one of the files.

I haven't dug in deeper yet.

Screenshot 2022-04-10 at 8 59 47 PM Screenshot 2022-04-10 at 9 00 50 PM

tunecrew avatar Apr 11 '22 02:04 tunecrew

Hi @tunecrew,

Thank you for reaching out. I am curious, is that happening on S3 or in local development? If it's happening on S3, is you DEBUG setting on or off?

Meanwhile, I will try reproducing this.

Best Joe

codingjoe avatar Apr 11 '22 07:04 codingjoe

Hm… so, I did some investigation. HTTP allows multiple files with the same name in a single request. However, since we don't rename files, this approach does not work currently. We'd need to introduce subfolders for each file, to mitigate this issue.

I will work on this, since I found something related to this, that also needs my attention.

codingjoe avatar Apr 11 '22 08:04 codingjoe

Hi - it happens locally (haven't tried it in production yet) - but my local dev environment is dockerised and uses Minio for S3, so I think the behaviour should mimic a production environment.

A couple thoughts based on your response:

  • Interesting that it is copying the same file n times to the final destination.
  • Could this be fixed by adding a suffix (timestamp + random) to each filename as it is saved to tmp - that way each folder would still represent the logical grouping of a single upload user action.

tunecrew avatar Apr 11 '22 21:04 tunecrew

Hi @tunecrew, I never took the time to properly thank you. Your bug report helped uncover a pretty substantial security bug. I credited you in the CVE, so you've probably seen it. Still, thank you again for. That was really helpful. However, I believe the patch did not solve your particular issue. Did you happen to make any progress on this meanwhile, or do you want me to have a got at it? Cheers, Joe

codingjoe avatar Sep 13 '22 08:09 codingjoe

You're very welcome! I haven't revisited it in a bit as I got sidetracked on another project, but I will be back on it this fall so I'll let you know.

tunecrew avatar Sep 13 '22 14:09 tunecrew