Use of MD5 hash for file paths

Open KushalP opened this issue 5 years ago • 2 comments
It looks like MD5 hashes are being used for file paths.

Given the known collision weaknesses of MD5, is it worth using something more collision-resistant, like SHA-256 or SHA-3?

KushalP, Nov 19 '20 20:11

Something else that might be useful, if this is going in the direction I think it is: https://github.com/benlaurie/objecthash

KushalP, Nov 19 '20 21:11

The MD5 hash is being used to check whether a file has changed, not for file paths. Currently, paths / IDs are just randomly generated.

Both S3 and GCS use MD5 hashes, so it's like that in the disk implementation merely for consistency. Maybe we could use SHA hashes though.
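
For illustration, here is a minimal sketch of a change check where the hash algorithm is a parameter, so swapping MD5 for SHA-256 is a one-word change. The names `file_digest` and `has_changed` are hypothetical and are not taken from the Keepsake codebase; only the standard `hashlib` module is assumed.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    h = hashlib.new(algorithm)  # e.g. "md5" or "sha256"
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def has_changed(path, recorded_digest, algorithm="sha256"):
    """Hypothetical helper: True if the file no longer matches the recorded digest."""
    return file_digest(path, algorithm) != recorded_digest
```

Passing `algorithm="md5"` would keep the digest comparable with the MD5 values that S3 and GCS report, which is the consistency argument above.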

We would like to do content-addressable hashes at some point though. See #307
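
As a rough sketch of what content addressing could look like (not what #307 specifies): the storage path is derived from a digest of the content itself, which is the case where collision resistance actually matters, since two different files with the same digest would map to the same path. The function name `content_address` and the `sha256/<prefix>/<rest>` layout are assumptions made for this example.

```python
import hashlib
from pathlib import PurePosixPath


def content_address(data: bytes, prefix_len: int = 2) -> PurePosixPath:
    """Map content to a path derived from its SHA-256 digest.

    The sha256/<first chars>/<remaining chars> layout is a hypothetical
    choice; splitting on a short prefix keeps directories from growing huge.
    """
    digest = hashlib.sha256(data).hexdigest()
    return PurePosixPath("sha256") / digest[:prefix_len] / digest[prefix_len:]
```

Identical content always maps to the same path, so deduplication falls out of the addressing scheme, unlike with randomly generated IDs.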

bfirsh, Nov 19 '20 21:11