interchain-security Rethink byte prefixing scheme

Protocol Change Proposal

EDIT: Use the solution propose in https://github.com/cosmos/interchain-security/issues/939#issuecomment-1549162175.

Summary

We currently use Go's iota operator to choose byte prefixes for our database keys. This is clean looking, but it can be brittle. If someone changes the order of the keys in the file, or adds or removes a key, it changes the prefixes of all following keys. This means that simple refactors can require database migrations. There are many alternatives but all of them have tradeoffs. I will discuss them here and we can decide on whether to change what we do.

Proposal

Key prefixing techniques:

iota
- Pros
  - Easy to write and looks clean in the code
- Cons
  - Changing order of keys or removing keys requires a DB migration
Write out bytes, e.g. []byte{0x04}
- Pros
  - Can move keys around and remove keys without requiring a migration
- Cons
  - Once you move the keys, it will be hard to see which bytes have been used, resulting in a similar problem. At that point it's very easy to accidentally reuse a byte. Not much better than iota in this respect, and messier.
Use strings, e.g. KeyChannelEndPrefix = "channelEnds"
- Pros
  - Very easy, looks very clean, no issues rearranging keys
- Cons
  - Is vulnerable to an injection attack in certain circumstances if the attacker can choose the key segment that follows the prefix and cause a collision with a different prefix.
Use strings with a separator
- Pros
  - Solves the collision/injection attack issue
- Cons
  - Need to sanitize and escape any user supplied values that could end up in a key, otherwise they could insert the separator for an injection attack. This requires some more boilerplate and complexity.
Use strings hashed to a fixed length
- Pros:
  - This requires a little bit of boilerplate for the hashing function but is much less complex than the separator sanitization boilerplate required by the solution above.
  - Completely solves collision, injection and accidental reordering concerns.
- Cons:
  - Has a bit more boilerplate. To completely avoid any possibility of collision the hashes should probably be 20+ characters which uses more disk space than any of the byte solutions, but not really that much more than the string based solutions.
Use iota, but simply have a rule to only ever append to the list. Might help to have the keys in a separate file to prevent people from being tempted to refactor. Having a separate file might also facilitate an automated tool that would run on CI and error if keys had been rearranged.
- Pros:
  - Keeps the keys small with 1 byte prefixes
  - No migration will be necessary right now because this is basically the same thing as we already have.
- Cons:
  - Doesn't exactly solve the problem, the prefixes file could get huge and messy, and if we use 1 byte prefixes we will run out after 256 keys (but this isn't any different than the other byte prefixing methods)

For Admin Use

[ ] Not duplicate issue
[ ] Appropriate labels applied
[ ] Appropriate contributors tagged
[ ] Contributor assigned/self-assigned
[ ] Is a spike necessary to map out how the issue should be approached?

May 10 '23 15:05 jtremback

Either of the last two options seem solid to me

May 11 '23 17:05 shaspitz

Use iota, but simply have a rule to only ever append to the list. Might help to have the keys in a separate file to prevent people from being tempted to refactor. Having a separate file might also facilitate an automated tool that would run on CI and error if keys had been rearranged.

I like this solution as it's the simplest to adopt. Basically, we need better testing around keys. Any idea of a test ensures that keys are only appended?

May 16 '23 07:05 mpoke

Write out bytes, e.g. []byte{0x04}

Having a single keys.go file with all the keys can enable also this solution. We could check then that all the keys are unique. However, if somebody adds a key to keys.go, but not to keys_test.go, then we have a problem.

May 16 '23 07:05 mpoke

What about using a constant map?

func getKeyPrefix() map[string]byte {
    return map[string]byte{
        "PortByteKey": 0,
        "MaturedUnbondingOpsByteKey": 1,
        "ValidatorSetUpdateIdByteKey": 2,
    }
}

func PortKey() []byte {
	return []byte{getKey("PortByteKey")}
}

Like this we can easily test that the map contains only unique values. Also, it's only a refactor.

May 16 '23 07:05 mpoke

@mpoke this does make a possible catastrophic key rearrangement impossible. It doesn't solve the problem 100%, since there's still the potential for the numbers to become scrambled and have gaps as refactors happen. But that will just cause compile errors and the worst case scenario is someone having to try again with a different number as they're writing the code and get an error.

To me, either this approach, or the one I tried here would be satisfactory. I'll leave it to you to make the final call.

May 22 '23 22:05 jtremback

Reopening issue to deal with consumer changes.

Aug 05 '24 12:08 mpoke