
MD5 option for use with overwrite choice / sync

Open Krobar opened this issue 4 years ago • 4 comments

Would be great if this option could be added. I know it requires a custom metadata addition but it would be really useful.

Use Case: Using for copy of static site generator output to S3. S5cmd is way faster than alternatives but unfortunately copies files that don't need updating which makes it more expensive.

Krobar avatar Apr 07 '20 17:04 Krobar

Since we use multipart upload, object ETag changes if user changes part-size of a file. Relevant package: https://github.com/peak/s3hash/
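To illustrate why the part size matters, here is a minimal Python sketch of the commonly observed multipart ETag format: the MD5 of the concatenated per-part MD5 digests, followed by a dash and the part count. This is an assumption based on observed S3 behaviour for unencrypted uploads, not s5cmd's or s3hash's actual code. The function name `multipart_etag` is hypothetical, and the fallback to a plain MD5 for a single part assumes the tool uses an ordinary PUT in that case.

```python
import hashlib


def multipart_etag(data: bytes, part_size: int) -> str:
    """Return an ETag-style string for `data` split into `part_size`-byte parts.

    Assumption: the object is unencrypted and the ETag follows the commonly
    observed "md5-of-part-md5s, dash, part count" layout.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")

    # Split the payload the way a multipart uploader would.
    parts = [data[i:i + part_size] for i in range(0, len(data), part_size)]

    # Assumption: payloads that fit in one part go through a plain PUT,
    # whose ETag is simply the hex MD5 of the content.
    if len(parts) <= 1:
        return hashlib.md5(data).hexdigest()

    # Hash the concatenation of the binary (not hex) per-part digests.
    joined = b"".join(hashlib.md5(part).digest() for part in parts)
    return f"{hashlib.md5(joined).hexdigest()}-{len(parts)}"


if __name__ == "__main__":
    payload = b"a" * 20
    # Same bytes, different part sizes, different ETag-style values.
    print(multipart_etag(payload, 5))
    print(multipart_etag(payload, 10))
```

The same bytes produce different values for different part sizes, so a remote ETag can only be compared against a local file if the part size used for the upload is known.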

It's not as safe as hash control, but cp -n -s practically does the same job for use-cases like this.

Duplicate of #43

igungor avatar Apr 12 '20 09:04 igungor

Thank you for the reply. I tried -s and it doesn't quite work for this use case. The reason is that if I make a minor change to the page output (e.g. capitalise a letter), the size does not change and the file is not uploaded. -n is not appropriate for this use case either, as the generated files always have a newer modified date than the previous files.

I don't think the ETag is reliable these days, as it no longer contains an MD5 hash of the upload. Some other (much slower) S3 utilities add a custom MD5 tag and check for this; this is not perfect but would work well for this use case. Would be good if it could be considered.
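The custom MD5 tag idea could look roughly like the Python sketch below. It is not an existing s5cmd feature or the code of any particular tool. It assumes the uploader stored the file's hex MD5 under a user-defined metadata key when the object was written, and that the caller has already fetched that metadata as a plain dict. The key name `md5` and the helpers `file_md5` and `needs_upload` are hypothetical.

```python
import hashlib

# Hypothetical metadata key; a real tool would pick and document its own.
MD5_METADATA_KEY = "md5"


def file_md5(path, chunk_size=1024 * 1024):
    """Hex MD5 of a local file, read in chunks so large files stay cheap."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_upload(path, remote_metadata):
    """Decide whether a local file should be uploaded.

    `remote_metadata` is the object's user metadata as a dict, or None if
    the object does not exist yet. A missing or different MD5 entry means
    the file is uploaded; a matching one means it is skipped.
    """
    if remote_metadata is None:
        return True
    return remote_metadata.get(MD5_METADATA_KEY) != file_md5(path)
```

Unlike a size comparison, this catches a same-size edit such as changing "hello" to "Hello", and unlike a timestamp comparison it skips regenerated files whose bytes did not change. The cost is reading every local file and fetching metadata for every remote object.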

Krobar avatar Apr 17 '20 14:04 Krobar

ETag isn't reliable. aws s3 sync has reportedly been broken for years, as it doesn't guarantee an actual sync. See https://github.com/aws/aws-cli/issues/3273

Nowaker avatar Jun 08 '22 19:06 Nowaker

Can the approach taken by s4cmd not be used here? https://github.com/bloomreach/s4cmd#additional-technical-notes

kishaningithub avatar Apr 10 '23 13:04 kishaningithub