ibc
ibc copied to clipboard
ICS-004: Add weakly ordered packet semantics
ICS-004: Add weakly ordered packet semantics
Currently channels must be one of ordered and unordered. The current ordered channel, is really a "Totally ordered channel". Censorship resistance motivates a use case for a 'weakly ordered' channel. A weak ordering is a total ordering, but admits "ties". https://en.wikipedia.org/wiki/Weak_ordering
I've been using the verbage "partial ordering" for this idea in numerous irl discussions of the idea. The correct term is weak ordering, which can be understood as a partial ordering where "incomparable items" are all treated as equivalent for ordering purposes ('tied').
Motivating use case
The idea of weak ordering has use cases for both censorship resistance and IBC app correctness. For brevity, we focus on the censorship resistance use case.
A user believes that their tx T is being censored on chain B, and that they are not censored on chain A. Furthermore, assume there is a significant channel C between chain A and chain B, in which chain B would be negatively impacted if it was halted.
To increase the incentive for chain B to include T, the user would like to make an IBC packet P_T from A to B on channel C, which leads to the execution of T on chain B. This action should place a further constraint on the channel, that chain B (by state machine validity) will reject processing any channel C packet created after P_T from chain A, until P_T has been delivered to chain B. So any new packet on this channel is forced to come after P_T. Thus for chain B to continue to censor T, they must also censor every new packet from A to B on C, which by construction has high impact.
Thus chain B is sanctioned such that it is in a position where it must pick between:
- executes tx
T - Delay tx
Tfurther, along with all other interchain communiction from channelC - Commits a state machine validity violation (high cost, not an honest-but-curious fault)
Proposed channel communication semantics
The way this would look like at the communication layer, is we assign every packet a sequence_group. (Or equivalently, we can imagine parsing a sequence number S into (sequence_group, intra_group_sequence), and communicating necessary information to facilitate this). Suppose chain A's sequence_group for channel C is currently G. Then Chain A has the correctness guarantee that every subsequent outbound packet for this channel has sequence group >= G.
Whenever A increments the sequence group from G to G+1, it also communicates an integer k (typically 0 or 1), which sets a validity condition on chain B that "In order to process any tx in group G+1, we must have already processed at least k txs from group G".
We would need to maintain the number of packets processed from the last sequence group. For chain B to count a packet, it must get delivered to chain B (even if timed out), to prove that it got sent.
Both chain A and chain B maintain sequence groups for their inbound and outbound packets, just as they currently do for sequence numbers.
Satisfying the censorship resistance problem
To satisfy the censorship resistance issue, chain A has functionality that allows a user to atomically:
- increment the sequence group to
G+1, withk=0(G+1 has no req'ts on number ofGpackets) - make 1 outbound tx (
P_T), with massive timeout. (years) - increment the sequence group to
G+2, withk=1(G+2txs can only be delivered after 1G+1packet is delivered).
Recall that every new tx created on this channel will be at a group >= G+2.
Thus the set of txs from A to B are:
{
G: "packets created before P_T"
G + 1: "P_T"
G + 2: "packets created after P_T"
}
There are no restrictions on Chain B's ability to process sequence group G txs, just by construction they all predate P_T. And then by the restriction, G+2 can only be processed after the G+1 packet is made, satisfying all of our desired guarantees.
Suggested App details
App considerations, that are not within the realm of the channel layer:
- Chain A should require a notable charge for a user to create a sequence group increment with
k != 1on the primary channel, as this can reduce liveness even when theres no censorship, due to introduction of this blocking behavior. - Chain A should require that the
G+1packet has a notable chain-B acceptable fee attached. (Fee tips mean this is always fixable due to the channel being blocked, but this would reduce adversarial channel blocking). This requires reasoning about chain B's current fee rate, so not really doable until cosmos gets to more complex fee-markets. - Chain B's liveness for the channel requires allowing receiving IBC relays that only contain inbound packets that would time out.
Future discussion, that should be non-blocking
We would like more IBC-apps to opt-in get themselves onto the primary channel that will block / get this censorship resistance guarantee, while maintaining their own total-ordering within their sub-system semantics. For instance, we would want a given ICA account to be able to choose to get multi-plexed onto the main channel, and re-use the weak ordering for achieving its own total-ordering-for-an-account guarantee, while not forcing blocking behavior for other packets.
This is a feature that should not at all block the initial deployment of weak ordering. I view this (and allowing apps to be multi-plexed with their own weak ordering guarantees) as fully achievable via allowing more complex predicates for sequence group validity. The initial proposal is at least k txs included from last group. I believe you can achieve the ICA total-ordering guarantee by relaxing this predicate to Sequence group A requires at least k txs included from sequence group B. So if a particular ICA accounts last message was in group S, and we are currently in group G we would make the next ICA tx by atomically doing:
- increment the current sequence group to
G+1, whose restriction isk=1from sequence group S - Make 1 packet for ICA outbound
- increment the current sequence group to
G+2, with no restriction on prior sequence groups. (k=0)
So there are no constraints with going to G+2 immediately. The only constraint checked here is on packets to G+1, which then depend on the S constraint being satisfied. In general, we can optimize s.t. we only need to check sequence group constraints until they have been satisfied once. This is because we can assume once a sequence group has been satisfied, it remains satisfied forever.
We should be able to as time goes on, be able to protocol-upgrade allow more complex predicates for sequence group validity than at least k txs included from last group .
Aside: Weak ordering use case for IBC app correctness
The IBC app correctness properties are relevant when:
- app on chain A relies on getting communication to chain B in a particular order, over existing channels (necessary if token transfers are involved)
- app on chain A itself requires a weakly ordered communication to chain B.
We can imagine an app that requests oracle updates from a provider, and then wants to make a swap that could only execute after its returned an oracle update. But we don't want any of this communication to block the rest of the communication channels.
Thanks @cwgoes @sunnya97 @AdityaSripal @nicolaslara for all the discussions / ideation leading to this proposal!
Thanks @ValarDragon for the detailed issue writeup!
I want to first double check my understanding of your proposals and then I have some questions on the basis of my understanding.
So it seems to me like the proposed semantics are as follows.
Given a sending sequence at: (G, n)
The next sending sequence may either be: (G, n+1) or (G+1, 1) with k <= n
Given a receiving sequence at (G, n):
It may receive (G, n+1)
Or it may receive the next packet at any previous sequence group:
(G', n'+1) where G' < G and n'+1 is the next sequence for that group.
And of course, a new sequence group (G+1, 1) can only be received if the communicated k values for all prior groups has been received.
i.e. we have at least (G, k_g), (G-1, k_{g-1}), etc.
So it seems to me from the above, that we still have strict total ordering on the sending side. Since a packet cannot be sent from a prior group once the group number is incremented, and packet sequences within a group are sent in order.
What this construction does is slightly relax the receiving side's ordering.
i.e. The packets with sequences (G, n) and (G+1, 1) can be received in either order so long as n > k_g.
Please correct me if my understanding is wrong
Where this helps with censorship resistance is that you can enforce that the receiving chain process your packet before it can process any future packet sent on the channel, but you do not enforce any ordering between your packet and current packets in-flight.
This latter part seems to be the main benefit over using the current ORDERED channel. Can you explain why this benefit is necessary or beneficial enough to warrant a new channel type?
Another point I'd want to clarify. Of course the receiving chain could just write an ACK for the censored packet without processing it. But this would be a state machine violation that would serve as on-chain proof of censorship. I imagine this is what you mean by choosing: Commits a state machine validity violation (high cost, not an honest-but-curious fault)
I guess my main question would be why can't this censorship resistance tactic work just as well with ORDERED channels.
Suppose there is a high-value ICA channel between chainA (controller) and chainB (host).
ChainA could allow a high charge for including a user tx into a packet on the ICA channel. The packet goes in at sequence n, it must now be received before n+1 can be received. But also, it can only be received after any prior in-flight packets are received.
This seems to offer the same censorship resistance principle tho it would require a change to the ica-host module so that it allowed messages from accounts other than the ica account if they have a valid signature.
The only downside is that you have to wait for prior packets to be received, but you still put the same pressure on the censoring chain. Are there other downsides I'm not aware of?
Lastly, I like the idea of offering a new channel ordering but it seems the one you have proposed is too restrictive so as to offer only a slight difference from the current ORDERED channel.
It seems to me that a different primitive would be more useful. We can retain the (sequence_group, intra_group_sequence) concept but relax the ordering on the sending side as well.
Suppose we allowed packets (G, n) to be sent so long as n == getNextSeqSend(portID, channelID, sequence_group) for all sequence groups G.
Similarly, we can receive (G, n) so long as n == getNextSeqRecv(portID, channelID, sequence_group for all sequence groups G.
This effectively multiplexes many independent, strictly ORDERED channels on the same IBC channel that can run independent of each other. This would not solve the censorship resistance issue (i believe that can be solved within a strictly ordered stream of packets). However, it can allow multiple actors to use the same channel and have ordering for their own packet stream without being blocked by anyone else.
E.g. usecase, allowing regular users or smart contracts to create independent ICAs over the same channel
It also solves your oracle use case since the oracle data and swap can be on the same independent stream.
What do you think?
Understanding confirmation:
Yes, sending side is totally ordered (as it is today, even for unordered channels!), but this is merely because IBC today increments sequence numbers to achieve a nonce. We can imagine this being replaced by a randomized nonce, rather than an incrementing sequence, and then were at weak ordering on the send queue as well. (I presume IBC is not designed like this right now for unordered channels, because we don't have effective SDK parallelism)
Receiving side: No the understanding is not right. Namely:
Given a receiving sequence at (G, n):
It may receive (G, n+1)
Or it may receive the next packet at any previous sequence group:
(G', n'+1) where G' < G and n'+1 is the next sequence for that group.
And of course, a new sequence group (G+1, 1) can only be received if the communicated k values for all prior groups has been received.
This should instead be:
Given a receiving sequence at (G, n):
It may receive (G, x), for any x not already delivered
Or it may receive a previously unsent packet at any previous sequence group:
(G', x) where G' < G and x is a sequence for that group that has not already been delivered.
And of course, a new sequence group (G+1, 1) can only be received if the communicated k values for all prior groups has been received.
(Where delivery is checked by packet acks, as it is today)
We recover a totally ordered channel by always incrementing G, and setting k=1. We're at a totally unordered channel if we never increment G (or only increment with k=0)
This implies that intra-sequence order is not respected on the receiving end? Is that correct?
Whenever A increments the sequence group from G to G+1, it also communicates an integer k (typically 0 or 1), which sets a validity condition on chain B that "In order to process any tx in group G+1, we must have already processed at least k txs from group G".
I may be misunderstanding the guarantees here, but shouldn't this be: "In order to process any tx in group G+2, we must have already processed at least k txs from group G+1"? Otherwise chain B can keep censoring the packet sent on G+1 as long as there is at least one other packet in that group
I guess my main question would be why can't this censorship resistance tactic work just as well with ORDERED channels. Suppose there is a high-value ICA channel between chainA (controller) and chainB (host).
I think we can't assume high-value on the ICA channel. Yes, this would give the same censorship-resistance properties for the ICA channel, but the cost of having the ICA channel block is much lower than that of having the "main" channel between A and B block. IIUC, the advantages here come from having most (all?) traffic between through chains flow through the same channel (a channel that can represent both ordered and unordered traffic at the same time)
To increase the incentive for chain B to include T, the user would like to make an IBC packet P_T from A to B on channel C, which leads to the execution of T on chain B. This action should place a further constraint on the channel, that chain B (by state machine validity) will reject processing any channel C packet created after P_T from chain A, until P_T has been delivered to chain B. So any new packet on this channel is forced to come after P_T. Thus for chain B to continue to censor T, they must also censor every new packet from A to B on C, which by construction has high impact.
Landed here again after a while. New thought: Is there a risk here that the data for packet T is no longer available when P_T is submitted? Consider:
- User A sends ibc tx T(G=g) on chain A
- Chain B is censoring user A, so it does not include T
- Time passes (a long time)
- Another user (or user A using an alias) sends a new ibc tx T2(G=g+1)
- Chain B must now include T or block the channel on T2
- What happens if the data for T is no longer available?
I see two potential cases here:
- The data is actually available in some archive node and it just has to be obtained, in which case the channel will be temporarily blocked
- The data is actually not available. Can we recover that channel? How can this be distinguished from censoring?
Great point!
We should always be in the case that:
- The data is actually available in some archive node and it just has to be obtained, in which case the channel will be temporarily blocked
Whats happening if thats not the case? Then that means that chain A is creating blocks without publishing them -- which also just poses risks to IBC. Being able to IBC out of chain A depends on data availability of the IBC subtree. So we already assume this to ever be able to IBC out. What we've done now is add blocking behavior to the "main" channel, whereas this only existed for ICA's thus far, not token sends. However, lack of DA here is still problematic.
To recover the channel even if chain A did lose data seems like a slightly odd design goal -- but we have no strong guarantees outside of the unbonding period. So we could make this logic for requiring inclusion of tx T "expire" after an unbonding period has elapsed. Not sure if its worth the extra code complexity though