multicodec
multicodec copied to clipboard
feat: add aes cipher multicodecs
Followup from today’s IPLD call.
Working on an implementation of block encryption and I need to assign multicodecs for the main AES ciphers.
I think this is a good range to start adding ciphers in but if anyone has a better idea I’m open to it.
So you're making a generic "cipher" type with entries that could be used in a variety of ways, but you're specifically using them as IPLD codecs to encode and decode specific forms, iv+bytes
. Maybe these entries should just be specific ipld entries with ipld
in that middle column and names that indicate their use as codecs?
I wouldn’t consider them ipld
types. When they are using in CID’s, then IPLD will apply specific block codec semantics, but that’s sort of IPLD’s decision.
If someone wants to refer to these ciphers in some other context, outside of CID’s, and needs an int, hex or varint, then they should be able to reuse this code.
Tag in the codec table
It seems bizarre to me to encourage the reuse of 0x1400
to mean both aes-gcm
generally and also decide that there is a particular serialization codec that is referred to by 0x1400
. There is more than one way to encode an aes-gcm encrypted piece of data and this just claims one of them as "the" one.
I understand the claim that this codec is so simple that it in a sense "must" be correct, but why bother taking the risk and dealing with the associated fallout when we could just call it an IPLD (or serialization since there are no links) codec and if in the future anyone finds a use for an aes-gcm
abstract value they can claim it on the repo and we can talk about it then.
Overall
I've already expressed my concern for adding these codecs at all offline (and summarized in https://github.com/ipld/specs/pull/349/files#r559963156). It's a 🤷 from me as my understanding of the general policy of the code table is to not object if someone makes a request for a code as long as the number of codes being requested isn't large and the code size is sufficiently large for the request. Instead we just try and discuss the options and guide people down paths we think will be helpful to them. I feel like I've done that and so the rest is up to you 😄.
Ok, I’ve updated this PR to be in line with my most recent comment.
https://github.com/ipld/specs/pull/349#issuecomment-763858165
This should address the comments here as it collapses the use case into a single block format. There are still multicodec entries for each cipher but thats for general use (and use within the block format itself) meaning that the table entries for AES variants no longer correspond to a block format and do not have a meaning when used as a the codec identifier in a CID.
should be split into aes, gcm, cdc, ctr
One might be able to make the argument that aes
and aes-gcm
are both valid ciphers, it's just that one is better for encrypting multiple chunks/blocks with the same key. I'm fine with having them specified the way that they are. However, maybe we do need to clarify key size in here! AES can do 128, 192, and 256 bit key sizes. 128 and 256 being the most common (both are specified as standard in TLS v1.3 w/ GCM).
The way it's been implemented at the moment @ https://github.com/multiformats/js-multiformats/pull/59/files#diff-d971711275e038fa06255a059bdaa381debc468ec6473d31cba5d230beef949aR50 leaves it up to the user, whether they supply a 16-byte or 32-byte Uint8Array
key (see), it'll switch between the two. Maybe it's nice to leave it completely optional for the user to work out? But maybe also it would be better to be explicit about what cipher is being used (I'm in favour of defining "cipher" in a way that makes distinctions between the key sizes since they usually have major internal differences - for AES I think it's the number of rounds being the main difference).
aes-256-gcm
, aes-128-gcm
, etc.?
aes-256-gcm
,aes-128-gcm
, etc.?
This is really a rabbit hole. I'm not deep enough (and hopefully won't get any deeper) to have a strong opinion about it. Though having three independent items that can be combined together in several different ways sounds a bit like different Multicodec codes/size information. But that may well be overkill/bloat things.
Another option could be to define them as bit flags and then use the resulting values as Multicodec code. I haven't really thought this idea through, hence I'm unsure if this would be viable.
aes-256-gcm, aes-128-gcm, etc.?
This is really a rabbit hole. I'm not deep enough (and hopefully won't get any deeper) to have a strong opinion about it. Though
TL;DR
In case this is helpful, I think that this PR as it stands is good to go 👍
Ciphertexts
As an end user, the main thing that I'm interested in knowing is that a ciphertext was encrypted with AES in a particular mode (CTR, GCM, etc). I don't really care about the key length; presumably I know that if I have the key.
It would be nice if this format also defined where the IV sat, and possibly its length (usually 96-bits, but I've seen 128 in the wild).
Keys
If you want to describe the keys themselves (sounds dangerous so not sure why you'd want to), I don't need to know what mode it will be used in. I only care about the key type, and I suppose the length (though it's usually so short that I could just take the length)
It would be nice if this format also defined where the IV sat
How do you propose that's done here? If the length, and key is entirely a user concern, shouldn't the IV be too?