script
script copied to clipboard
Generic HashSum and HashSums
This would replace #113/#114 and "depreciate" SHA256Sum and SHA256Sums by adding more generic functions called HashSum and HashSums.
One could call p.HashSum("SHA-256") or p.HashSums("SHA3-512") as opposed to adding specific functions for each hash algorithm desired.
This uses the crypto package and the official blessed algorithms within.
There are probably some extra tests (specifically passing an invalid hash name) and some docs to add - but I wanted to make sure I was on the right path here.....
SHA256Sum and SHA256Sums are replaced internally with calls to HashSum("SHA-256") and HashSums("SHA-256").
Finally, the internals are condensed and HashSum and HashSums use a common hash function to which they pass their io.Reader
Thanks, @wishdev! This is certainly more flexible than what we have at the moment.
I'm not sure if specifying the desired hash algorithm as a string is the right way to go. It would make sense to define these as constants:
script.Echo("hello").HashSum(script.SHA-256).Stdout()
However, this isn't valid Go, because identifiers can only contain letters. This is presumably why the constants defined in crypto
are spelled the way they are: https://pkg.go.dev/crypto#Hash
It also sets up a potentially irksome maintainer chore of keeping the list up to date with Go's crypto
library. So maybe we should just use its constants:
script.Echo("secret password").Hash(crypto.SHA256).Stdout()
Sounds perfectly reasonable. Updated...
Nice! Shall we now think about an example that demonstrates the need for HashSums
with a non-trivial script
pipeline?
'HashSums' isn't new functionality. It replaces SHA256Sums which is used for hashng files. You pass it a slice of file names and it will return the hash of each file. There is a test already setup for SHA256Sums within the test file.
'HashSums' isn't new functionality. It replaces SHA256Sums which is used for hashng files
Well, that's exactly the point of my question: why replace it?
I guess I'm just confused - it "replaces" the function in the same way HashSum replaces SHA256Sum - it allows one to select a different hash algorithm for the task at hand.
Both SHA256Sum and SHA256Sums still exist - they are just single line functions to HashSum(crypto.SHA256) or HashSums(crypto.SHA256).
I did move SHA256Sums down in the code base to get all the hashing functionality a little closer together - maybe that's leading to the confusion?
it allows one to select a different hash algorithm for the task at hand.
And in what kind of non-trivial script
pipeline would this be necessary? In other words, what we're looking for now is an example of a program that we couldn't write without the existence of HashSum
or HashSums
. We can use that to see if the API makes sense, or if it needs tweaking a little.
I also want to point out what an API concern with using crytpo.Hash
that I laid out in https://github.com/bitfield/script/issues/113#issuecomment-1111470432, which essentially means that we can use traditional hashes but not necessarily newer hash functions with parameters like scrypt
and others. I still think that using a generic hash like @wishdev has laid out is far more flexible and the use case function would be almost exactly like here https://github.com/bitfield/script/issues/113#issuecomment-1111470374 but with Hash(crypto.SHA512)
instead of my SHA512Sums()
. I can attest to regularly utilizing different hash functions depending on their availability more than necessarily a decision to select one.