cosmos-sdk
cosmos-sdk copied to clipboard
Can we change ImportGenesis & ExportGenesis to return a proto.Message interface
Summary
Something thats always confused me is why does ImportGenesis and ExportGenesis in every module take in a JSON interface.
We can definitely change ExportGenesis to return a proto.Message interface, that the SDK then serializes into a JSON? Seems odd from an API and complexity perspective that we have to serialize this in our module.go right now.
Something I'm less sure about: Can we change ImportGenesis to take in a proto.Message, and then ImportGenesis does a type cast? We likely still need the codec, to deal with unpacking proto any types.
Is this an acceptable API break? (It could technically be done via versioned extension interfaces, and making modules one of the new or old init genesis methods) Theres not really a way to do this thats not an API break, so it may be annoying for folks.
Problem Definition
Just an idea for reducing the JSON serialization code in each module, and getting serialization complexity handled at a higher layer of abstraction.
Proposal
Make ImportGenesis take in a proto.Message instead of json.RawMessage, and ExportGenesis return a proto.Message
Rework module interfaces, to push this to an extension interface, and allow modules to implement this or the legacy one.
cref #12295 - this would also allow that to occur, with no further complexity to the module writer
For Admin Use
- [ ] Not duplicate issue
- [ ] Appropriate labels applied
- [ ] Appropriate contributors tagged
- [ ] Contributor assigned/self-assigned
Yeah I think this is just a legacy thing. Meaning, we used json.RawMessage for ages now and then introduced Proto later and never really revised this API surface area. Totally agree that we should use proto.Message for these APIs.
@julienrbrt or @facundomedica is this something you find interesting to work on?
I'm not at all in favor of these API "improvements" that introduce breaking changes without solving longstanding issues like streaming genesis. This in particular breaks the ORM
I don't think streaming genesis ever made sense as a priority feature, given a profiler of where InitGenesis time is.
The entire focus should be around IAVL commit & db interaction semantics. All unmarshalling from the genesis file is ~irrelevant in time from profiles I've done. What matters is improving the write to IAVL + LevelDb on commit, which is where optimization cyclers should go.
Can you expand a bit on why this breaks ORM? Does ORM not use proto for genesis?
The ORM uses proto for genesis but it wraps it in a map which can't be serialized as proto. So it needs RawJSON. The ORM also does support streaming genesis (although the SDK currently has no way of leveraging this)
We agreed that we'd get this feature in via additive APIs so it's not breaking ORM.
Rework module interfaces, to push this to an extension interface, and allow modules to implement this or the legacy one.
If the ORM requires json.RawMessage, and we're planning to further develop the ORM, doesn't that mean we'd never abandon the "legacy" json.RawMessage API? Having 2 APIs for Genesis seems more confusing to me than one.
I feel like ORM should be changed to not depend on json.rawmessage structure for init genesis tho, this feels like it should be fixable
We already agreed to support streaming genesis for both ORM and collections and that's the API that's been adopted for genesis in cosmossdk.ui/core. I'm wondering though whether we should make it a stream of proto messages so that the underlying data can be either json or proto binary. That might address both memory and performance concerns
The advantage of JSON is that it's easy to patch using jq or other tools (eg to change block time etc...). With good tooling we can do the same with proto.