ICS7: Allow consensus state updates with older headers
In the presence of multiple relayers, it can happen that an updateClient transaction fails if its height is smaller than the on-chain client state height.
It is possible that two relayers scanning for different items, X vs Y, require consensus state (CS) updates for the same client on the destination chain but at different heights. This can result in one of the relayers' transactions failing, as a clientUpdate for height h fails if the latest client height is H >= h.
Some analysis of the relayer logic, given the current ICS07 spec and module implementation, is attempted in https://github.com/informalsystems/ibc-rs/issues/61
This issue is to investigate the possibility of allowing CS updates with heights smaller than or equal to the current client state height, and to make the required changes in the ICS02 (e.g. checkValidityAndUpdateState, line 1678) and ICS07 specs, the Cosmos SDK, and the relayer implementations.
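For concreteness, a minimal sketch of the guard that produces this race (hypothetical types, not the actual ICS02 handler or SDK code): once one relayer's update lands at height H, any other relayer's update at h <= H is rejected.

```go
package main

import "fmt"

type ClientState struct{ LatestHeight uint64 }

func updateClient(cs *ClientState, headerHeight uint64) error {
    // Current behaviour: any header that is not strictly newer is rejected.
    if headerHeight <= cs.LatestHeight {
        return fmt.Errorf("rejected: header height %d <= latest client height %d",
            headerHeight, cs.LatestHeight)
    }
    cs.LatestHeight = headerHeight
    return nil
}

func main() {
    cs := &ClientState{LatestHeight: 50}
    fmt.Println(updateClient(cs, 60)) // relayer X wins the race: <nil>
    fmt.Println(updateClient(cs, 55)) // relayer Y's older update is rejected
}
```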
I think it should be safe to allow Tendermint clients to be updated from older headers, given that we have the "would-have-been-fooled" misbehaviour checks in place (which we do). Have we specifically checked that doing so is safe from the specification / model checking side though?
"would-have-been-fooled" misbehaviour checks in place
Could you point me to these checks? Just to make sure we are talking about the same thing :)
We are currently building the models and I think we are close to catching the issue described here. Then we are going to modify the handlers to allow older/smaller heights.
Could you point me to these checks? Just to make sure we are talking about the same thing :)
Aye - see this section in ICS 7. The reason I think this logic is relevant is that without it, it might be possible to submit a conflicting update from an old header without being able to freeze the client.
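For illustration, a rough sketch of the "would-have-been-fooled" idea referenced above (own simplified types, not the ICS 7 spec pseudocode): if two conflicting headers at the same height would both have been accepted from some stored trusted state, the client can be frozen.

```go
package main

import "fmt"

type Header struct {
    Height  uint64
    AppHash string // stand-in for the fields that must agree
}

type Client struct {
    Frozen  bool
    Trusted map[uint64]Header // stored, previously verified headers
}

// verifiableFrom is a placeholder for light-client verification of h against
// the trusted header at trustedHeight (signatures, trust level, trusting period).
func (c *Client) verifiableFrom(trustedHeight uint64, h Header) bool {
    _, ok := c.Trusted[trustedHeight]
    return ok // real verification elided in this sketch
}

// checkMisbehaviour freezes the client if the two submitted headers share a
// height, differ in content, and each verifies from some stored trusted state:
// the client "would have been fooled" by whichever one is not on the real chain.
func (c *Client) checkMisbehaviour(h1, h2 Header) {
    if h1.Height != h2.Height || h1 == h2 {
        return
    }
    for trustedHeight := range c.Trusted {
        if trustedHeight < h1.Height &&
            c.verifiableFrom(trustedHeight, h1) && c.verifiableFrom(trustedHeight, h2) {
            c.Frozen = true
            return
        }
    }
}

func main() {
    c := &Client{Trusted: map[uint64]Header{10: {Height: 10, AppHash: "aa"}}}
    c.checkMisbehaviour(Header{Height: 20, AppHash: "bb"}, Header{Height: 20, AppHash: "cc"})
    fmt.Println("frozen:", c.Frozen) // frozen: true
}
```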
IIUC, this is about evidence handling and I believe it's a good building block for allowing client updates with a lower height header. Let its height be k. Assume {..., h_i, h_j, ..., h_n} are present and i < k <= j.
- if k == j, the client should run the same logic as for misbehavior, and yes, it will run against h_i. If there is proof of misbehavior the client is frozen
  - if h_k == h_j (exact same headers) the client will not be frozen (currently the case)
- if k < j then the client verification of h_k should be invoked against h_i. Currently it is done against h_n (since we reject the k <= n case).
- the question now is what to do about h_j? Should we verify h_j against the new h_k? What is the implication if this fails?
Currently there is a valid verification "chain" for the stored client states, every header verifies against the previous one.
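A sketch of the case analysis above (hypothetical types, not ibc-go code): the reference point for installing a header at height k is the closest stored header below k, an exact duplicate at k == j is a no-op, and a conflicting header at k == j is handed to the misbehaviour logic.

```go
package main

import "fmt"

type Header struct {
    Height uint64
    Hash   string
}

type Client struct {
    Stored map[uint64]Header // previously verified headers, keyed by height
}

// verifiedAgainst stands in for full light-client verification (signatures,
// trust level, trusting period) of h from the reference header ref.
func verifiedAgainst(ref, h Header) bool { return h.Height > ref.Height }

func (c *Client) install(h Header) error {
    if existing, ok := c.Stored[h.Height]; ok { // the k == j case
        if existing.Hash == h.Hash {
            return nil // exact same header: nothing to do
        }
        return fmt.Errorf("conflicting header at %d: run misbehaviour logic", h.Height)
    }

    // The k < j case: pick the highest stored height strictly below k as h_i.
    var ref Header
    found := false
    for _, s := range c.Stored {
        if s.Height < h.Height && (!found || s.Height > ref.Height) {
            ref, found = s, true
        }
    }
    if !found {
        return fmt.Errorf("no stored header below height %d", h.Height)
    }
    if !verifiedAgainst(ref, h) {
        return fmt.Errorf("header %d not verifiable from stored height %d", h.Height, ref.Height)
    }
    c.Stored[h.Height] = h
    return nil
}

func main() {
    c := &Client{Stored: map[uint64]Header{10: {10, "a"}, 30: {30, "b"}, 50: {50, "c"}}}
    fmt.Println(c.install(Header{20, "x"}))  // verified against height 10, stored
    fmt.Println(c.install(Header{30, "b"}))  // duplicate of stored header: <nil>
    fmt.Println(c.install(Header{30, "zz"})) // conflicting header: misbehaviour path
}
```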
the question now is what to do about h_j? Should we verify h_j against the new h_k? What is the implication if this fails?
This is a good question.
I think it depends on the properties of the bisection algorithm, namely whether it is ever possible for bisection starting from a later header to fail while bisection starting from an earlier header succeeds. I think that this is possible since validator set updates could have been made after the first header which cause more than a 1/3 divergence with the latest header (which are subsequently "reverted" in time for the final header); however, this does not necessarily imply misbehaviour.
English alone might be confusing, so to phrase it in terms of heights & validator sets:
- Let the validator set at h_i be v_i, at h_j be v_j, at h_k be v_k, with i < k <= j
- The light client has stored h_i and h_j, and is now validating h_k
- Suppose that:
  - v_i and v_j have more than 2/3 overlap (less than 1/3 difference) (this is necessarily true since the update succeeded previously)
  - v_i and v_k have more than 2/3 overlap (less than 1/3 difference) (this is required for the update from h_i to h_k to succeed)
  - v_k and v_j have less than 2/3 overlap (more than 1/3 difference) (this is possible, we are just dealing with coincidence of sets, so v_k and v_j may have as little as 1/3 overlap - although they must have that much by the above two constraints)
Then the update from v_k to v_j will fail (without additional intermediary headers), even though no misbehaviour has (necessarily) occurred. It should certainly be the case that there exist intermediary headers such that if they were also submitted the update would succeed, however.
Is my understanding correct here? cc @ebuchman @melekes as well.
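A small worked example of the overlap scenario (simplified: equal voting power and validator sets compared directly, ignoring the commit vs. NextValidators distinction raised later in the thread): the updates i -> k and i -> j each clear a 1/3 trust threshold, but a direct k -> j check does not.

```go
package main

import "fmt"

// overlap returns the fraction of the trusted set that also appears in the
// new signing set (all validators are assumed to have equal voting power).
func overlap(trusted, signers []string) float64 {
    in := make(map[string]bool, len(signers))
    for _, v := range signers {
        in[v] = true
    }
    common := 0
    for _, v := range trusted {
        if in[v] {
            common++
        }
    }
    return float64(common) / float64(len(trusted))
}

func main() {
    vi := []string{"A", "B", "C"}
    vk := []string{"A", "B", "X"}
    vj := []string{"B", "C", "Y"}

    const trustLevel = 1.0 / 3.0
    fmt.Printf("i->k overlap %.2f > 1/3? %v\n", overlap(vi, vk), overlap(vi, vk) > trustLevel) // 0.67 true
    fmt.Printf("i->j overlap %.2f > 1/3? %v\n", overlap(vi, vj), overlap(vi, vj) > trustLevel) // 0.67 true
    fmt.Printf("k->j overlap %.2f > 1/3? %v\n", overlap(vk, vj), overlap(vk, vj) > trustLevel) // 0.33 false
}
```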
Suppose that:
- v_i and v_j have more than 2/3 overlap (less than 1/3 difference) (this is necessarily true since the update succeeded previously)
- v_i and v_k have more than 2/3 overlap (less than 1/3 difference) (this is required for the update from h_i to h_k to succeed)
- v_k and v_j have less than 2/3 overlap (more than 1/3 difference) (this is possible, we are just dealing with coincidence of sets, so v_k and v_j may have as little as 1/3 overlap - although they must have that much by the above two constraints)
I believe the default overlap required between the trusted validator set and the validator set that signed a header is > 1/3: https://github.com/tendermint/tendermint/blob/d1d33057dcbb52c3ef6cecaff0808d65946cbff3/lite2/verifier.go#L16. The current misbehavior check also assumes a > 1/3 trust level, and a client will be frozen although there is no guarantee of identifying punishable behavior (i.e. two headers at the same height may have disjoint signing validator sets). I think we need to change the default trust level to 2/3, @ebuchman @melekes what do you think?
Then the update from v_k to v_j will fail (without additional intermediary headers), even though no misbehaviour has (necessarily) occurred. It should certainly be the case that there exist intermediary headers such that if they were also submitted the update would succeed, however.
This is correct, and in this case the client will not be frozen but updateClient will fail. Now I am wondering about the usefulness of allowing older headers from the relayer perspective. The relayer can perform the checks against v_i but not against v_j (as it is not aware of it at the time it creates the updateClient), and we are back to the situation where the update with h_k will fail. At this point I believe we can leave things the way they are, as the mitigation is the same: the relayer gets notified of a client update at h_j and can redo the datagrams taking the proofs at height j-1.
cc @milosevic
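A sketch of that mitigation from the relayer side (hypothetical logic, not the actual ibc-rs relayer code): if updateClient(h_k) is rejected because the client is already at a newer height j, skip the update and resend the datagrams with proofs taken at height j-1, which verify against the consensus state already installed at height j.

```go
package main

import "fmt"

type destChain struct{ clientHeight uint64 }

func (d *destChain) updateClient(h uint64) error {
    if h <= d.clientHeight {
        return fmt.Errorf("update rejected: client height %d >= %d", d.clientHeight, h)
    }
    d.clientHeight = h
    return nil
}

// submitPackets stands in for sending the packet datagrams with proofs
// generated at proofHeight; in this sketch they verify against the consensus
// state at proofHeight+1, which must already be installed on the client.
func (d *destChain) submitPackets(proofHeight uint64) error {
    if proofHeight+1 > d.clientHeight {
        return fmt.Errorf("no consensus state at height %d", proofHeight+1)
    }
    return nil
}

func relay(d *destChain, targetHeight uint64) error {
    proofHeight := targetHeight - 1
    if err := d.updateClient(targetHeight); err != nil {
        // Another relayer won the race: reuse its newer consensus state and
        // regenerate the proofs at the installed height minus one.
        fmt.Println(err)
        proofHeight = d.clientHeight - 1
    }
    return d.submitPackets(proofHeight)
}

func main() {
    d := &destChain{clientHeight: 120} // already updated by the other relayer
    fmt.Println("relay result:", relay(d, 115))
}
```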
I believe the default overlap between trusted validator set and validator set that signed a header is > 1/3rd.
Ah yes, you are right, my mistake. The reasoning still holds, anyways. We certainly want a guarantee of identifying punishable behaviour; however, as I recall, we had reasoned through this at one point & thought we had one with > 1/3 (with additional slashing for amnesia & counterfactual signing). Has that reasoning changed?
Now I am wondering about the usefulness of allowing older headers from the relayer perspective.
It's worth noting that the situation I mention seems unlikely to be frequent in practice, it's just possible in theory - and it's also not really possible to exploit via DoS, since it depends on the changes in the validator set.
however, as I recall, we had reasoned through this at one point & thought we had one with > 1/3 (with additional slashing for amnesia & counterfactual signing). Has that reasoning changed?
That is what I recall but right now I can't seem to make this work in my mind, hope @milosevic will help :)
It's worth noting that the situation I mention seems unlikely to be frequent in practice, it's just possible in theory - and it's also not really possible to exploit via DoS, since it depends on the changes in the validator set.
Agreed but we still have to think this through and code it in.
One last case:
Assume: h'_n, h_n - headers at height n.
There is still the case for the updateClient(h'_n) with an h_n already stored as latest header. Should the IBC client handler check if h'_n != h_n and freeze the client in case of misbehavior?
Also the relayer would benefit from client freezing notifications.
Should the IBC client handler check if h'_n != h_n and freeze the client in case of misbehavior?
This should already be supported by "would-have-been-fooled", since there should be some past h_m with m < n from which both h'_n and h_n verify, so h'_n /= h_n is detected as misbehaviour.
We need to put the reasoning in the context of the light client security model. In case assumptions hold (there are no forks), allowing smaller heights to be installed should not create any risks (if we are within trusted period) as we are still within light client security model. So the example where inserting smaller header leads to break of the transitivity chain, shouldn't be possible under light client security model. Only in case there is a fork (attack) this can happen. But in case of fork, faulty header can also come in the current implementation. So I don't see anything fundamentally changing there. My understanding is that we just need to be explicit about reference points when we install headers: this allows us to be more flexible and support installing smaller headers, and keeping the same security guarantees.
v_i and v_j have more than 2/3 overlap (less than 1/3 difference) (this is necessarily true since the update succeeded previously)
In principle the reasoning is OK. But to be precise, what should be said here, I think, is that v_j is the commit for block j, and v_i is the NextValidator set of block i. As a result, v_k should once be a commit and once a next-validator set in the example...
However, the argument of @cwgoes is right, of course. A simpler example is that at even heights the next-validator set could be E and at odd heights it could be O, with E and O having an empty intersection. Since these sets alternate, whether bisection is successful without intermediate headers depends on whether you check, e.g., an odd header against an odd header, etc.
I would just note that https://godoc.org/github.com/tendermint/tendermint/lite2 switches to sequential verification if h_k == h_j - 1. If not, there have to be some intermediate headers between h_k and h_j for verification to succeed.
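A tiny numeric illustration of the alternating-set example (assuming equal voting power and a 1/3 trust threshold): skipping verification only succeeds against a stored header of the same parity, so whether the skip succeeds depends on which reference point is used.

```go
package main

import "fmt"

// overlap returns the fraction of the trusted set that also signed the new header.
func overlap(trusted, signers []string) float64 {
    in := map[string]bool{}
    for _, v := range signers {
        in[v] = true
    }
    n := 0
    for _, v := range trusted {
        if in[v] {
            n++
        }
    }
    return float64(n) / float64(len(trusted))
}

func main() {
    E := []string{"E1", "E2", "E3"} // validator set at even heights
    O := []string{"O1", "O2", "O3"} // validator set at odd heights

    fmt.Println("even -> even overlap:", overlap(E, E)) // 1.00: skipping verification succeeds
    fmt.Println("even -> odd overlap: ", overlap(E, O)) // 0.00: fails without intermediate headers
}
```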
We certainly want a guarantee of identifying punishable behaviour; however, as I recall, we had reasoned through this at one point & thought we had one with > 1/3 (with additional slashing for amnesia & counterfactual signing). Has that reasoning changed?
Not sure, but maybe this is in reference to Lunatic Validators, which in the past at h_i were >1/3 of the val set, at current height h_j in block b_j they're < 1/3 of the val set, and they sign an invalid block b_j' where they're >2/3 of the val set. This could fool a light client without equivocation/amnesia/counterfactual since they had +1/3 at h_i and less than 1/3 now, but it's prevented by punishing validators for signing invalid state (which if I'm not mistaken still needs to be implemented in tendermint). Some recent discussion about this here https://github.com/tendermint/tendermint/issues/4694#issuecomment-639897804 (lunatic is F5). I think it might also be prevented by a +2/3 trust threshold but need to think about that more.
I think it depends on the properties of the bisection algorithm, namely whether it is ever possible for bisection starting from a later header to fail while bisection starting from an earlier header succeeds.
Josef's example makes it clear that bisection does not preserve this if there is no limit on validator turnover, i.e. it is possible for bisection starting from a later header to fail while bisection starting from an earlier header succeeds. Is that a problem? Anyone trying to justify the latest header can always check prior headers if it can't be verified against the second latest. Just need to fully think through the misbehaviour detection.
Also can't this happen from normal operation anyways even without the relayer race? It's possible the shortest bisection trace to justify a new header requires a header which is not in the store and which is older than the latest one. Is that already supported? Is it different from this?
At this point I believe we can leave things the way they are
Right, it's not clear how bad this is for relayers, and if I'm not mistaken it can all be made inexpensive (i.e. the relayer wouldn't be charged the tx fee), so it's not clear to me if it's really necessary to do this, though I don't know if my previous point requires us to support it anyways.
Also can't this happen from normal operation anyways even without the relayer race? It's possible the shortest bisection trace to justify a new header requires a header which is not in the store and which is older than the latest one. Is that already supported? Is it different from this?
At present, this is not supported, though I think we could support it safely. It does seem unlikely to arise in practice, though (there always would be a valid transition from the latest header, even if it weren't the cheapest bisection).
Closing as this has already been added to the spec and implemented in ibc-go.