hermes
hermes copied to clipboard
Investigate new event structure of tendermint-go v0.35
Context
This issue is about a certain breaking change in one of our dependencies. For a full picture of Hermes dependencies, refer to https://github.com/informalsystems/ibc-rs/discussions/2030.
Problem Definition
Tendermint-go has changed the structure of the events they emit through the websocket. Ref: https://github.com/tendermint/tendermint/pull/6634
This feature is released in the tendermint-go 0.35.x line (changelog), which will appear in SDK v0.46.x line (milestone). This probably means ~~late Q1 2022~~ late Q2 2022.
Further context on the upstream changes, thanks to @creachadair:
The Event and EventAttribute structures that Tendermint uses are defined as protobuf messages, e.g., [1]. Those messages can be encoded either in the protobuf wire format (binary) or JSON (text). The binary encoding does not change as a result of the schema change, but the text encoding does change. [1]: https://github.com/tendermint/spec/blob/master/proto/tendermint/abci/types.proto#L276
.. if the IBC code uses the websocket, then yes, it will get events in JSON representation. And in that case, it will be affected by the schema change.
More context: the events
field structure changes from a hash-map to a vector:
https://github.com/informalsystems/tendermint-rs/blob/master/rpc/src/event.rs#L21
Proposal
We need to run Hermes against a chain with SDK 0.46 (once that is released) and make sure any breaking changes in the event system of tendermint-go do not affect Hermes. If those changes break event parsing in Hermes, we have to adapt our code to the upstream changes.
Currently, marking this issue as blocked until an SDK 0.46 appears.
Acceptance Criteria
- [ ] test Hermes against chains built on SDK 0.46.x
- [ ] upgrade Hermes event parsing as necessary to maintain compatibility with tm-go 0.35
- [X] ~~ensure backwards compatibility with SDK pre-0.46.x~~
- ~~AZ: This should be fine: we should be backwards compat.~~
For Admin Use
- [X] Not duplicate issue
- [X] Appropriate labels applied
- [X] Appropriate milestone (priority) applied
- [X] Appropriate contributors tagged
- [X] Contributor assigned/self-assigned
No opportunity to check this with a build of Gaia yet, but from the changes referenced above it looks as though the difference will be limited to the format of the event structure, with the events retrieved explicitly as Vec<abci::Event>
. We can explore building against the changes for tendermint-rs 0.35 in a branch.
Two updates we received from upstream dependencies:
- Carlos is in the process of building ibc-go (simapp in there) with cosmos-sdk 0.45 + tm 0.34
- https://github.com/cosmos/ibc-go/tree/carlos/upgrade-pr-branch
- Billy has
okwme/rho-base-main
branch in gaia is now compiling and can be used in tests- https://github.com/cosmos/gaia/pull/1436
Unclear which is more appropriate for our tests between the WIP gaia and the WIP simapp from ibc-go, we should investigate and try with both probably.
As cosmos-sdk has switched to buf for code generation, it would make sense to investigate usage of protoc-gen-prost, as prost-build can no longer easily catch up to the details of the build process for the .proto files.
There is also cosmos-sdk-proto, and I think we should replace ibc-proto
with using that crate as a dependency, to save us the pain and put a stop to proliferation of redundant proto-generated modules (tendermint-proto I'm looking at you). However, it does not yet have a branch or release compatible with tendermint-rs 0.24.x, so it's not currently useful for solving this issue.
After negotiating with Mikhail, we agreed on this battle plan:
- Check feasibility of adapting our
ibc-*
crates to usingcosmos-sdk-rust
- Check if we can do buf-based generation in
cosmos-sdk-rust
- Maybe we won’t even need to do buf-based generation there, maybe old-fashioned clone+compile can do the job
- What we’re after is sdk files from tags
0.46.*
- Switch
ibc-*
crates away fromibc-proto
to using cosmos-sdk-proto- Deprecate
ibc-proto
crate, eventually delete - Deprecate
ibc-proto-compiler
and eventually delete - Find workaround to maintain
ibc.mock.proto
files around, probably in our own repo
- Deprecate
-
Backup plan: in case point (1) is too costly: investigate buf-based conversion in our own repo, abandon use of
ibc-proto-compiler
and abandon the idea of switching tocosmos-sdk-proto
(at least for the moment)
we agreed on this battle plan:
I have failed early on step 1 as described in https://github.com/informalsystems/ibc-rs/issues/425#issuecomment-1117967281, so proceeding with step 4.
I have failed early on step 1 as described in #425 (comment), so proceeding with step 4.
I see, the derive tags seem to be the main issue. Thank you for keeping us in the loop Mikhail, sounds like a plan!
Working on informalsystems/hermes#2213, still can't build the relayer due to more work needed on tendermint-rs 0.24 API changes.
However, https://github.com/informalsystems/ibc-rs/pull/2213/commits/9236ffbafb433b06b0060e36be2313b32e1feeca has the adaptation to the new event API. As the commit shows, it allows us to simplify the code for processing RPC events, getting rid of the extract_attribute
/RawObject
infrastructure.
Adaptations to tendermint-rpc
0.24.x (implementing Tendermint RPC 0.35), have been explored as part of informalsystems/hermes#2213. That PR will not go ahead in the present form due to changes in upcoming Cosmos-SDK, reverting to Tendermint 0.34.
An attempt to accommodate RPC nodes using 0.34 and 0.35 side-by-side has been started with https://github.com/informalsystems/tendermint-rs/pull/1143, but we also have a choice to emulate 0.35 subscription API for all RPC clients, as described in https://github.com/informalsystems/tendermint-rs/pull/1143#issuecomment-1155354118.
Thanks for the wrap-up Mikhail!
but we also have a choice to emulate 0.35 subscription API for all RPC clients
What would be your recommendation between the side-by-side approach vs. emulate 0.35 API approach? I read the https://github.com/informalsystems/tendermint-rs/pull/1143#issuecomment-1155354118 but it's unclear to me if the "emulate v0.35 API" solution is the better.
I would favour the 0.35 emulation approach because it simplifies code of the relayer (as well as any other application using tendermint-rpc
). The old Subscription
/Event
API is the only substantial difference, and it necessitates different processing code to extract event data from the subscription stream: the RawObject
infrastructure that was shown to be removable in informalsystems/hermes#2213.
This, however, rests on the assumption that nodes producing 0.34 subscription events always encode them from a source list of uniform ABCI-ish events (where each event of a particular type always carries tags with the same set of names), so they can be recovered without corruption or data loss. As far as I can see, the relayer already relies on this assumption here: https://github.com/informalsystems/ibc-rs/blob/0afaea644f69819e6f53cbc35c9a651c50a7ffa9/relayer/src/event/rpc.rs#L181.
This, however, rests on the assumption that nodes producing 0.34 subscription events always encode them from a source list of uniform ABCI-ish events (where each event of a particular type always carries tags with the same set of names)
I don't think the spec requires any particular order or separation between types and attribute names, but I believe in practice applications generally are at least self-consistent.
The change of representation doesn't add (or remove) any constraints, so unless an app developer intentionally changes the ABCI metadata scheme the app uses the results should be the same.
@thanethomson Will this still be needed now that 0.35 is going to be yanked?
Will this still be needed now that 0.35 is going to be yanked?
As per https://github.com/tendermint/tendermint/issues/9091, @cmwaters recommended that we look at reintroducing this at some point in future (specifically this PR: https://github.com/tendermint/tendermint/pull/6634). There's no clear timeline for this at present.
Given more recent conversations around this event structure, @romac and @adizere does it perhaps make sense to first have a design session around this with the Tendermint Core team to see if there's a more appropriate/useful way to represent events for IBC?
Given more recent conversations around this event structure, @romac and @adizere does it perhaps make sense to first have a design session around this with the Tendermint Core team to see if there's a more appropriate/useful way to represent events for IBC?
Yeah that would be great!
We will discuss this synchronously as part of a design session.