proposal: crypto: support 3rd-party hash algorithms
Proposal Details
crypto/rsa.SignPSS() and similar packages take a crypto.Hash, which is hard-coded to only support known-to-Go hash algorithms.
This is a total show-stopper for me, because I need to be able to support new, not-yet-existing hash algorithms.
I propose a new implementation for src/crypto/crypto.go:
// ...
type customImplementation struct {
name string
// int is used rather than uint8 of digestSizes in order to support future
// hash functions with large digests.
digestSize int
f func() hash.Hash
}
var customImplementations = make(map[Hash]customImplementation)
func (h Hash) String() string {
switch h {
// ...
default:
if implementation, exists := customImplementations[h]; exists {
return implementation.name
}
return "unknown hash value " + strconv.Itoa(int(h))
}
}
// ...
// Defines the limit of 1st-party hash functions.
const boundaryFirstParty Hash = 101
// ...
// Size returns the length, in bytes, of a digest resulting from the given hash
// function. It doesn't require that the hash function in question be linked
// into the program for hash functions known to Go authors. It does require
// that the hash function in question be linked into the program for 3rd-party
// hash functions.
func (h Hash) Size() int {
if h < 1 || h == maxHash {
panic("crypto: Size of unknown hash function")
}
if h < maxHash {
return int(digestSizes[h])
}
if implementation, exists := customImplementations[h]; exists {
return implementation.digestSize
}
panic("crypto: Size of unknown hash function")
}
// ...
// New returns a new hash.Hash calculating the given hash function. New panics
// if the hash function is not linked into the binary.
func (h Hash) New() hash.Hash {
if h < 1 || h == maxHash {
panic("crypto: requested hash function #" + strconv.Itoa(int(h)) + " is unavailable")
}
if h < maxHash {
f := hashes[h]
if f != nil {
return f()
}
}
if implementation, exists := customImplementations[h]; exists {
return implementation.f()
}
panic("crypto: requested hash function #" + strconv.Itoa(int(h)) + " is unavailable")
}
// Available reports whether the given hash function is linked into the binary.
func (h Hash) Available() bool {
if h < maxHash && hashes[h] != nil {
return true
}
if _, exists := customImplementations[h]; exists {
return true
}
return false
}
// ...
// RegisterCustomHash registers a function that returns a new instance of the given
// 3rd-party hash function. This is intended to be called from the init function in
// packages that implement hash functions. Registering hash functions outside of an
// init function can cause thread-unsafe behaviors.
func RegisterCustomHash(h Hash, name string, digestSize int, f func() hash.Hash) error {
if h <= boundaryFirstParty {
return errors.New("crypto: RegisterCustomHash within 1st-party region")
}
if len(name) == 0 {
return errors.New("crypto: RegisterCustomHash of hash function without a name")
}
if digestSize < 1 {
return errors.New("crypto: RegisterCustomHash of hash function with empty digests")
}
if f == nil {
return errors.New("crypto: RegisterCustomHash of hash function without an allocator")
}
if _, exists := customImplementations[h]; exists {
return errors.New("crypto: RegisterCustomHash of an existing hash function")
}
customImplementations[h] = customImplementation{
name: name,
digestSize: digestSize,
f: f,
}
return nil
}
// ...
(note that I would submit a pull request, but I am currently on a very shoddy hotspot internet connection that prevents me from performing the necessary tasks to do so.)
There is of course an issue with this proposal: 3rd-party hash algorithms might use the same value for crypto.Hash, which would lead to errors.
Ideally we would change the type definition of crypto.Hash to something more sensible than a uint, but this would break packages that use the uint to identify hash algorithms.
Perhaps a checksum of the hash algorithm's name to produce a uint would be better?
Related Issues and Documentation
(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)
CC @golang/security
From your docs it seems like the RegisterCustomHash accepts a Hash parametr, so the caller has to decide on the value. What if two packages use the same one? I don't think that returning an error is okay, we should return a new Hash instead, from an auto-incrementing global variable.
From your docs it seems like the
RegisterCustomHashaccepts a Hash parametr, so the caller has to decide on the value. What if two packages use the same one? I don't think that returning an error is okay, we should return a new Hash instead, from an auto-incrementing global variable.
I mentioned the possibility of this issue at the end of my proposal.
An auto-incrementing counter would guarantee uniqueness but would cause issues across machines with differing loaded 3rd-party hash algorithms when using the uint as an identifier.
Perhaps crypto.Hash could be made into a uint64 and 3rd-party hash algorithms would have their names be checked to be ASCII-encoded; the first seven bytes of the name could be used to create a uint64 (with zerod-bytes being used if all seven bytes are not available), with one byte being reserved for 1st-party hash algorithms.
I do not know how much this would affect backwards-compatibility.
Can you remove the parts that are unchanged? it makes it difficult to identify the actual proposed changes.
Can you remove the parts that are unchanged? it makes it difficult to identify the actual proposed changes.
Done, // ... is used to indicate unchanged content.
Note, I edited the original post:
-```
+```go
This adds syntax highlighting.
This is an extremely powerful feature, which seems extremely easy to misuse if you aren't really sure what you're doing, and seems likely to be of value to a very small number of people. As such I'm not sure it's something we should consider adding.
SignPSS is generally hash agnostic, but there are a large number of places where we do more complex things based on which hash was used, and expect it to be one of our predefined hashes. Documenting where this would work, and wouldn't, seems like it'd be rather complicated, and add additional overhead, with questionable value to the vast majority of Go users.
You say you "need to be able to support new, not-yet-existing hash algorithms", can you provide some more context for this? Are you experimenting with proposed hashes, or is this necessary for production systems? It seems like this is likely to be a case where simply working off a fork, with your custom hashes implemented, is probably the best option.
which seems extremely easy to misuse if you aren't really sure what you're doing
crypto.RegisterHash() is also easy to misuse (the documentation does not mention that it is only intended to grant access to the defined hash algorithms).
seems likely to be of value to a very small number of people
I agree, but Go has been regarded as a crypto-friendly language, and having limitations like this can be off-putting, even if done with good intentions.
but there are a large number of places where we do more complex things based on which hash was used, and expect it to be one of our predefined hashes
I understand there are places where custom hash algorithms cannot work, but there are also places where they can, and not supporting those using standardized interfaces raises the difficulty of building certain things in Go.
can you provide some more context for this?
Mostly experimentation with a virtual machine that can receive crypto-related algorithm updates validated via already-known signature algorithms. The idea is to support extreme longevity when traditional system updates are not possible.
I understand this is very exotic and that a custom implementation of RSA or whatever could be written off-the-bat, but bootstrapping with crypto/rsa is a bit easier; if a new hash algorithm comes out and using it with RSA is desired, then interaction with the bootstrapped crypto/rsa demands being able to customize crypto.Hash.
It seems like this is likely to be a case where simply working off a fork, with your custom hashes implemented, is probably the best option.
I guess this makes sense.
but Go has been regarded as a crypto-friendly language
I believe the reputation comes from making it difficult to misuse, not being from extensible.