`hash` aspect of license clarification is not secure enough
Consider this:
[[licenses.clarify]]
name = "ring"
expression = "LicenseRef-ring"
license-files = [
{ path = "LICENSE", hash = 0xbd0eed23 },
]
Just from the length of the hash value, we can see that this isn't a secure hash. It seems like it wouldn't be too surprising for there to be even accidental collisions in the hash value. Also, reading the source code I see a non-secure hash algorithm is used. It could be important for a user who is trying to satisfy legal requirements for license compliance to use a secure hash, even though it might otherwise be considered overkill. Plus, I think it is just good practice.
I suggest the following improvement: 0. Implement a new style of hash that is a full SHA-384 of the license file.
- Change the documentation and all examples to use these new SHA-384 hashes.
- (Optional) Deprecate the old style hashes. Warn when they are encountered and suggest the new style of hash with the appropriate value, but still report success. Provide a command-line flag to force the use of the strong hashes.
- At some point, perhaps immediately, remove support for the old style hashes.
I would be happy to help review a change of this sort.
Thanks @briansmith ! Agreed this needs to be a proper secure hash and longer. It doesn't make it harder to use and for those that do rely on it it for compliance it would massively reduce the chance of collisions and give confidence.
@Jake-Shadle how feasible do you think it would be to introduce a new hash and deprecate the old one over a few versions or remove the old? if we fully remove the old it would be a breaking change for existing configurations and would require a new GitHub action major version also. But first step could be to implement, deprecate and warn. Do like @briansmith 's idea of a command-line to enforce/verify the usage of the new flag for those that need it
Could this field just accept a prefix to make it clear? e.g.
{ ... license = "sha256:3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b" }
This would allow support for multiple algorithms as well