keepsake
Use of MD5 hash for file paths
It looks like MD5 hashes are being used for file paths.
Given MD5's known collision weaknesses, is it worth using something more collision resistant, like SHA256 or SHA3?
Something else that might be useful if this is going in a direction that I think it's going in: https://github.com/benlaurie/objecthash
The MD5 hash is being used to check whether a file has changed, not for file paths. Currently file paths / IDs are just randomly generated.
Both S3 and GCS use MD5 hashes, so the disk implementation uses MD5 too, purely for consistency. We could switch to SHA hashes, though.
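For illustration, here is a rough sketch of what a change check with a swappable hash algorithm might look like. This is not Keepsake's actual code: `file_digest` and `has_changed` are hypothetical names, and the default of SHA-256 is just an assumption for the example.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks.

    Hypothetical helper; the algorithm name is passed to hashlib.new,
    so "md5" or "sha3_256" would also work where hashlib supports them.
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def has_changed(path, recorded_digest, algorithm="sha256"):
    """Compare the file's current digest against a previously recorded one."""
    return file_digest(path, algorithm) != recorded_digest
```

Since the algorithm is only a parameter here, the disk implementation could in principle use a SHA digest while the S3 and GCS paths keep relying on the MD5 values those services report.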
We would like to move to content-addressable hashes at some point, though. See #307
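To make "content addressable" concrete, a minimal sketch follows, assuming SHA-256 and a two-character fan-out directory. The `content_address` function and the `objects` root are made up for this example and are not taken from #307 or from Keepsake itself.

```python
import hashlib
from pathlib import PurePosixPath


def content_address(data: bytes, root: str = "objects") -> str:
    """Derive a storage path from the content itself (hypothetical layout).

    The first two hex characters become a subdirectory so that a single
    directory does not end up holding every object.
    """
    digest = hashlib.sha256(data).hexdigest()
    return str(PurePosixPath(root) / digest[:2] / digest[2:])
```

With a scheme like this, identical content always maps to the same path, so the path itself doubles as the change check. That is also why collision resistance would matter much more here than it does for randomly generated IDs.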