js-multiformats
js-multiformats copied to clipboard
Proposal: Update MultihashHasher interface so it could support multiple hashing algorithms
Goal
As per https://github.com/ipld/js-car/issues/123 we do need a solution for verifying hashes in CAR files without shipping all the MultihashHasher
-s with the CAR library.
Proposal
- Introduce backwards compatible change to the
MultihashHasher
by introducing second OPTIONALcode
argument to thedigest
method.- If omitted it would default to own
code
. - If different code is passed it would error.
- If omitted it would default to own
- Rename
MultihashHasher
interface toMultihashHasherCase
and repurpose it for single algorithm use cases. - Introduce multi algorithm
MultihashHasherVariant
interface with the samedigest
method and a method that returns map ofMultihashHasherCase
it's comprised of. - Define
MultihashHasher
as union type discriminated bycode
field.
Here is the the sketch:
interface MultihashHasherCase<Code extends number = number> {
code: Code
digest(bytes: Uint8Array, code?: Code): Digest<Code>
}
interface MultihashHasherVariant<Code extends number = number> {
// We define `code` as optional which can never have a value. This will allow
// us to `code` as a discriminant in the union
code?: never
cases(): Record<Code, MultihashHasherCase<Code>>
digest <CodeCase extends Code> (bytes: Uint8Array, code?: CodeCase): Digest<CodeCase>
or <Case extends number> (hasher: MultihashHasher<Case>): MultihashHasherVariant<Code|Case>
}
type MultihashHasher<Code extends number = number> =
| MultihashHasherCase<Code>
| MultihashHasherVariant<Code>
Note that above makes current MultihashHasher
-s compatible with proposed MultihashHasher
type that is to say code that will require new type will accept all existing hashers without changes to them.
- With such a type in place
js-car
library would be able to requireMultihashHasher
parameter in order to be able to verify digests. - Added
MultihashHasherVariant
would allow composing many hashers into one.
OK, I like the direction, the details are worth discussing though:
- I don't think this is actually backward compatible, is it? Current
MultihashHasher
looks like this:
export interface MultihashHasher<Code extends number = number> {
digest: (input: Uint8Array) => Promise<MultihashDigest<Code>> | MultihashDigest<Code>
name: string
code: Code
}
export interface SyncMultihashHasher<Code extends number = number> extends MultihashHasher<Code> {
digest: (input: Uint8Array) => MultihashDigest<Code>
}
Did you just leave off the Promise
return and name
for brevity, would they be in the final form? And would MultihashHasherVariant
get a name
?
-
Why do we really need the two separate forms? Could we just make them all composable with an
or
? How would this work when you're starting off with nothing, you have to have an empty Variant that you canor(hasher)
on? And what do you see as the utility ofcases()
? Do we need that? -
Could we work on the naming a bit? "Case" is a bit too out of place for a JS/TS code isn't it and is likely to be something that folks trip over. If we could come up with something a bit more idiomatic and less Rusty then it might be easier for outsiders to wrangle. Even "Variant" is a bit odd. I reach for words like "Impl", "Alg", "Concrete", "Actual", "Single" for what you're calling "Case". And "Variant" is more of a "Bundle", "Collection", "Group", "Multi". I'm having a hard time seeing documentation explain this in a straightforward way with the current naming, without having to also explain the naming.
-
Sync version, do we need to work on one as well to cover all of this? Is that going to end up multiplying the pain of implementing hashers?
Did you just leave off the Promise return
That was an accident, I didn’t mean to change return type
and name for brevity
We can keep name
field, but I do find it on the interface unnecessary and would suggest dropping unless we find it’s actually needed.
Why do we really need the two separate forms?
Every time I try to hide distinction I find myself regretting it for one reason or the other. I think having two distinct when you care and third unified (type union) when you don’t is the best compromise.
?Could we just make them all composable with an or?
We could impose or
on MultihashHasherCase
but that would imply every hasher alg now needs to implement that as well or import utility. I think exposing empty
variant from this lib is a better option. That way external defs only need to implement one function and allow user to import this lib if they need to compose. If they want to implement variant as well they’re free to do so as well
How would this work when you're starting off with nothing, you have to have an empty Variant that you can or(hasher) on? And what do you see as the utility of cases()? Do we need that?
so you could decompose it into individual hashers. Also makes it possible to have or
static function that is variant implementation agnostic
Could we work on the naming a bit? "Case" is a bit too out of place for a JS/TS code isn't it and is likely to be something that folks trip over. If we could come up with something a bit more idiomatic and less Rusty then it might be easier for outsiders to wrangle. Even "Variant" is a bit odd. I reach for words like "Impl", "Alg", "Concrete", "Actual", "Single" for what you're calling "Case". And "Variant" is more of a "Bundle", "Collection", "Group", "Multi". I'm having a hard time seeing documentation explain this in a straightforward way with the current naming, without having to also explain the naming.
Sure I’m fine with whatever names. Out of proposed ones I’d pick
ConcreteMultihashHasher
and CompositeMultihashHasher
P.S.: I would use Multi prefix if it was not overloaded in this context
Sync version, do we need to work on one as well to cover all of this?
Yes, I just left out to reduce noise
Is that going to end up multiplying the pain of implementing hashers?
I don’t think it would even though it would multiply number of types.