interchain-security icon indicating copy to clipboard operation
interchain-security copied to clipboard

Consumer client created with initial consensus state with wrong timestamp

Open ancazamfir opened this issue 2 years ago • 13 comments

Problem

[cc: @MSalopek , thanks for helping with this!] We were debugging an issue with expired consumer client while setting up a connection using hermes CLI. The handshake was failing with this error message:

failed building header with error: light client verification error for chain id consumer-1: trusted state outside of trusting period

It indicated that the consumer client on the provider chain could not be updated because it was expired. The relevant proposal and genesis parameters that were used are:

"spawn_time": "2023-01-26T15:00:00.000000Z", and
"genesis_time": "2023-01-24T21:07:05.657521033Z",

And the client parameters are:

  trusting_period: 57024s --> ~16hrs
  unbonding_period: 86400s --> 24hrs

Here is the summary of the findings: The consumer client on the provider is created here: https://github.com/cosmos/interchain-security/blob/main/x/ccv/provider/keeper/proposal.go#L87-L94

Not clear where all the params are taken from but from the client state output, it seems that it is created with a consensus state at height 1 but a timestamp ( after spawn_time?) that it is not the one of the block at height 1 on the consumer chain (genesis_time).

When hermes tries to update a client, it first checks that the client state is not expired. For this it gets the timestamp of the latest consensus state (height 1 in our case) from provider chain (from the IBC client module store), i.e. the one the client was created with. As mentioned above this timestamp is different than the timestamp at height 1 as seen on the consumer chain. So this check passes on hermes. Then hermes tries to verify the new header (let’s say height 100). The error we see comes from this step.

This check is part of the light client code that is in tendermint-rs repository. The API takes a trusted height (1) and the target height (100). It then fetches the blocks from the consumer chain. First it checks that the block at trusted height (1) is still recent enough (not older than the trusting period). This part of the code generates the error message that is seen above.

Closing criteria

Ideally the client created on the provider should be done with a consensus state with same timestamp as the corresponding block on the consumer chain (genesis_time for height 1). In addition, something should ensure that spawn_time - genesis_time < trusting period and the consumer client on the provider is updated within trusting_period of the genesis_time. This should be clearly documented.

Problem details

Please describe the problem in all detail.

TODOs

  • [ ] Labels have been added for issue
  • [ ] Issue has been added to the ICS project

ancazamfir avatar Jan 31 '23 11:01 ancazamfir

Thank you for investigating and opening this issue, this is great feedback!

To mitigate this issue in running testnets we can instruct participants to always set their genesis time so it walls within the spawn_time - genesis_time < trusting period window.

MSalopek avatar Jan 31 '23 11:01 MSalopek

Ideally the client created on the provider should be done with a consensus state with same timestamp as the corresponding block on the consumer chain (genesis_time for height 1).

Not sure that this is possible as the provider doesn't know when the consumer will start. I think the best we can do it to clearly document this.

mpoke avatar Jan 31 '23 18:01 mpoke

Not sure if the genesis and proposal used in the scenario are typical. But if initial_height is (always??) 1 then the time should be the consumer's genesis_time. And could be included in the proposal? But I agree, first step should be documentation.

ancazamfir avatar Jan 31 '23 18:01 ancazamfir

I'm not that familiar with bootstrapping chains. Is there a way to set a chain's genesis time in advance, regardless of when the chain actually starts?

mpoke avatar Jan 31 '23 18:01 mpoke

Yes, it is part of genesis.json file. The first lines are something like (see the example used in the test):

{
  "genesis_time": "2023-01-24T21:07:05.657521033Z",
  "chain_id": "consumer-1",
  "initial_height": "1",
...

The genesis_time is the timestamp of block at initial_height.

The proposal (example from test) includes an initial height but not a time:

    "initial_height": {
        "revision_height": 1,
        "revision_number": 1
    },

It does have however a spawn_time, not sure what it is used for other than the timestamp of the consensus state with which the client is created (still need to confirm this, it is not obvious from the first look at the code). So maybe matching spawn_time in the proposal to genesis_time in the consumer consensus would work (??). This would be similar to the fact that it seems to be required that the revision_height shall be the same in the two files, 1 in this example.

There is also a question around revision_number. In IBC we derive the revision_number from the chain_id. If it has the format <name>-n then revision_number = n. I am wondering what happens here if initial_height.revision_number in the proposal is set to something else, let's say 2. And then the same question about historical_entries, what happens if they are different in the proposal vs consumer genesis.

Haven't looked at the other parameters in the proposal (on my todo, together with a review of the code).

ancazamfir avatar Jan 31 '23 21:01 ancazamfir

It does have however a spawn_time, not sure what it is used for other than the timestamp of the consensus state with which the client is created (still need to confirm this, it is not obvious from the first look at the code). So maybe matching spawn_time in the proposal to genesis_time in the consumer consensus would work (??).

Looked at the code more. Seems that the timestamp of the consensus state used when client is created is the provider chain time when proposal is executed, i.e. neither spawn_time nor genesis_time in our test).

High level, iiuc, when the proposal is executed, in addition to creating the consumer client, the consumer genesis parameters are determined. These can then be exported and some tool might be available to generate the consumer chain genesis file (Is there one?). Then validators are expected to start the nodes with this, ideally correct, consumer genesis file.

Could this maybe work?

  • when the consumer client is created on the provider, it should have an initial consensus state with time spawn_time (here are the diffs to illustrate, also some changes for initial height, see below)
  • spawn_time should become genesis_time in the consumer genesis file (ensured by generation tools if they exist and also documented)
  • i don't think initial_height in the proposal should include revision_number as this should be derived from chain_id (see same diffs). Didn't make full changes but the proposal could just include:
    "chain_id": "consumer-1",
    "initial_height": 1
    ...

Note: I made other changes so the tests pass, however the ones related to how the revision number is determined are not perfect as the tests create a chain with revision number 0 (extracted from chain with ID ChainID) but expect headers with different revision numbers. I couldn't find a way to properly pass a chain ID (e.g. ChainID-2) to the test setup.

ancazamfir avatar Feb 01 '23 10:02 ancazamfir

The current implementation guarantees the following invariant:

spawn_time <= consumer_client_creation_time <= genesis_time 

The spawn_time in the proposal can be in the past, e.g., 1 year ago, which means that gov proposal would be handled as soon as governance passes. That's why we use consumer_client_creation_time instead. We could set genesis_time to consumer_client_creation_time. However, this is part of the consumer genesis file, which the provider doesn't have control of. The best way to deal with this is to make consumer_client_creation_time queryable so that it can be added to the genesis file.

mpoke avatar Feb 01 '23 14:02 mpoke

i don't think initial_height in the proposal should include revision_number as this should be derived from chain_id (see same diffs).

I agree. That would simplify the UX.

mpoke avatar Feb 01 '23 14:02 mpoke

The current implementation guarantees the following invariant:

spawn_time <= consumer_client_creation_time <= genesis_time 

The spawn_time in the proposal can be in the past, e.g., 1 year ago, which means that gov proposal would be handled as soon as governance passes. That's why we use consumer_client_creation_time instead. We could set genesis_time to consumer_client_creation_time. However, this is part of the consumer genesis file, which the provider doesn't have control of. The best way to deal with this is to make consumer_client_creation_time queryable so that it can be added to the genesis file.

This would definitely work. But where is consumer_client_creation_time defined, I don't see it.

ancazamfir avatar Feb 02 '23 11:02 ancazamfir

But where is consumer_client_creation_time defined, I don't see it.

It's the block time on the provider when the proposal is handled, see https://github.com/cosmos/interchain-security/blob/b3533f63cf7be2df1bd3b5279d7914a72f1e4de9/x/ccv/provider/keeper/proposal.go#L89

mpoke avatar Feb 02 '23 11:02 mpoke

Ah, I saw that one. But thought it is explicitly defined someplace. So yes as you siad, exporting that time for use in consumer genesis would work.

ancazamfir avatar Feb 02 '23 12:02 ancazamfir

Another question is about upgrade from sovereign to consumer. Will the new consumer client be also created during proposal execution? Probably to be clarified or addressed elsewhere but was curious about it :)

ancazamfir avatar Feb 02 '23 12:02 ancazamfir

Another question is about upgrade from sovereign to consumer. Will the new consumer client be also created during proposal execution? Probably to be clarified or addressed elsewhere but was curious about it :)

Sorry for the late reply. For sovereign to consumer, the CCV channel will be built on already existing clients. Actually, my preference would be to build on top of an already existing connection as the two chains most likely are already connected via IBC.

mpoke avatar Feb 23 '23 13:02 mpoke

Closing as https://github.com/cosmos/interchain-security/issues/2280 better explains the change needed.

mpoke avatar Sep 17 '24 10:09 mpoke