Interop Timestamp Invariant - Intermittent Round-trip Denial

Open clabby opened this issue 10 months ago • 4 comments

Overview

The current Interop Messaging Spec's timestamp invariant states that:

The timestamp at the time of inclusion of the initiating message MUST be less than or equal to the timestamp of the executing message as well as greater than or equal to the Interop Start Timestamp.

When the dependency set of the superchain contains chains with varying block times, this can create situations where round-trips between chains are disallowed within some superchain snapshots (but not all).

Superchain STF

The superchain STF is defined as the transition from one SuperRoot to the next SuperRoot, every one second. In this transition, chains within the dependency set may or may not have actually progressed, depending on their blocktime.

Example Case

For this example, consider two chains in the dependency set:

  • Chain A (1s blocktime)
  • Chain B (2s blocktime)

At t: 3, for example, Chain A's block timestamp in the superchain snapshot would be 3, but Chain B's would be 2. This means that with the current timestamp invariant (initiating message timestamp MUST be <= executing message timestamp), an actor would not be allowed to relay an initiating message from Chain A's block #3 within Chain B's block #1. However, in the very next superchain snapshot @ t: 4, the timestamps of the blocks within the snapshot will have equalized, and round-trip messages will once again be allowed.
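To make the intermittent behaviour concrete, here is a minimal sketch of the example. It assumes both chains share a genesis timestamp of 0 (so block timestamps are multiples of the block time) and it leaves out the Interop Start Timestamp bound. The function names are hypothetical and not taken from the spec.

```python
# Assumption: both chains share a genesis timestamp of 0, so block
# timestamps are multiples of the block time.
BLOCK_TIMES = {"A": 1, "B": 2}  # seconds, from the example above


def snapshot_timestamp(t: int, block_time: int) -> int:
    """Timestamp of the latest block sealed at or before snapshot time t."""
    return (t // block_time) * block_time


def invariant_holds(initiating_ts: int, executing_ts: int) -> bool:
    """Current rule: initiating timestamp <= executing timestamp."""
    return initiating_ts <= executing_ts


def round_trip_allowed(t: int) -> bool:
    """True if A -> B and B -> A are both valid within the snapshot at t."""
    a = snapshot_timestamp(t, BLOCK_TIMES["A"])
    b = snapshot_timestamp(t, BLOCK_TIMES["B"])
    return invariant_holds(a, b) and invariant_holds(b, a)


for t in range(2, 6):
    print(f"t={t}: round-trip allowed = {round_trip_allowed(t)}")
```

Under these assumptions the round-trip is allowed at even t and denied at odd t, which is the "sometimes it works" behaviour described above.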

This can create a non-trivial challenge for relaying applications in a post-interop world, where landing intra-block message bundles across multiple chains becomes a more difficult task (it requires targeting specific superchain snapshots, and grows in complexity as block times in the dependency set become more varied). From a user's point of view, their bundles will sometimes fail and sometimes succeed.

Suggestion

To allow for round-trip messaging within any superchain snapshot, we should find a way to lift the timestamp invariant into a non-local check. A possible route is changing the invariant to:

The timestamp at the time of inclusion of the initiating message MUST be less than or equal to the largest timestamp observed within the superchain snapshot as well as greater than or equal to the Interop Start Timestamp.

But this has some downsides, requiring more data to be fed into the verification function for individual message ends. Open to discussion in this issue on other methods to loosen this check.
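A rough sketch of the suggested check, to show where the extra input comes from. The function name and parameters are hypothetical; the point is only that the check now needs every timestamp in the snapshot rather than just the two message ends.

```python
from typing import Iterable


def relaxed_invariant_holds(
    initiating_ts: int,
    snapshot_timestamps: Iterable[int],
    interop_start: int,
) -> bool:
    """Suggested rule: interop_start <= initiating_ts <= max snapshot timestamp."""
    horizon = max(snapshot_timestamps)  # needs the whole snapshot, not one block
    return interop_start <= initiating_ts <= horizon


# Snapshot at t: 3 from the example: Chain A is at 3, Chain B is at 2.
snapshot_at_t3 = [3, 2]
print(relaxed_invariant_holds(3, snapshot_at_t3, interop_start=0))
```

With the original rule, a message initiated in Chain A's block at timestamp 3 could not be executed in Chain B's block at timestamp 2; with this variant it passes because the horizon of the snapshot is 3.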

clabby avatar Feb 12 '25 16:02 clabby

I believe the Fault-Proof super-root snapshot is not ideal for reasoning about this invariant, since the snapshot composition depends on what was last cross-safe and what optimistic block directly follows.

If one part of the snapshot is 10 minutes behind, e.g. because an L2 hasn't batch-submitted for a while, that shouldn't mean that the next optimistic block is allowed to include data from the more recent entries in the snapshot (assuming "horizon" is defined as the maximum timestamp). Nor should it mean the opposite: that chains which batch-submit frequently cannot include things because other chains are not batch-submitting.

The timestamp invariant should be independent of the snapshot construction.

I do agree that the lack of round-trips between chains with unaligned block timestamps or different block time increments is a problem, though, and one we should fix.

Say for example, we have two chains:

  • Chain A with 1 second blocks
  • Chain B with 2 second blocks
  • block a0 with timestamp X
  • block b0 with timestamp X
  • block a1 with timestamp X+1
  • block b1 with timestamp X+2
  • block a2 with timestamp X+2
  • block b2 with timestamp X+4
  • block a3 with timestamp X+3

Then we want a1 and b1 to be able to round-trip messages, because a1 cannot roundtrip with b0 (because b0 is already sealed and published), and we don't want to hold back on cross-rollup comms.

Also with flashblocks, smaller increments of chains should be able to communicate. E.g. a0_flash3/8 and b0_flash6/16 should both be equal at the same millisecond level, so should be able to interop. In the protocol we don't have to enshrine the flashblock boundaries, since we can check the relative relationships between messages are directional, but do need to check some invariants to limit the scope of messages to verify.

Does this match your understanding of the problem?

I think we can change the timestamp invariant to be more relaxed, based on the next expected block time, to fix this.

protolambda avatar Feb 12 '25 17:02 protolambda

The timestamp invariant should be independent of the snapshot construction.

I buy this. In the proof, we have context of the snapshot construction, but I can see how leaking that into the check creates issues for the simplicity of an implementation in the context of the supervisor.

In the protocol we don't have to enshrine the flashblock boundaries, since we can check the relative relationships between messages are directional, but do need to check some invariants to limit the scope of messages to verify.

Does this match your understanding of the problem?

Yep, exactly.

I think we can change the timestamp invariant to be more relaxed, based on the next expected block time, to fix this.

The next expected block timestamp of which chain, though (initiating, executing, or a mixture)? In your example, a2 and b2 wouldn't be able to round-trip with some interpretations of this rule. This also gets a bit more complex to think about as we expand the dependency set, with more than just 2 differing blocktimes among the set of chains.

w/ existing rule (initiating_timestamp <= executing_timestamp):

  • a2 (initiating) -> b2 (executing) ✅ (X+2 <= X+4)
  • b2 (initiating) -> a2 (executing) ❌ (X+4 > X+2)

w/ "next expected time" rule (initiating_timestamp <= next_expected_time(executing_timestamp, executing_chain_id)):

  • a2 (initiating) -> b2 (executing) ✅ (X+2 <= X+4+2)
  • b2 (initiating) -> a2 (executing) ❌ (X+4 > X+2+1)

w/ "next expected time" rule (next_expected_time(initiating_timestamp, initiating_chain_id) <= executing_timestamp):

  • a2 (initiating) -> b2 (executing) ✅ (X+2+1 <= X+4)
  • b2 (initiating) -> a2 (executing) ❌ (X+4+2 > X+2)

w/ "next expected time" rule (initiating_timestamp <= next_expected_time(executing_timestamp, initiating_chain_id)):

Note: Adds the initiating chain's blocktime to the executing message's timestamp.

  • a2 (initiating) -> b2 (executing) ✅ (X+2 <= X+4+1)
  • b2 (initiating) -> a2 (executing) ✅ (X+4 <= X+2+2)

w/ "next expected time" rule (next_expected_time(initiating_timestamp, executing_chain_id) <= executing_timestamp):

Note: Adds the executing chain's blocktime to the initiating message's timestamp.

  • a2 (initiating) -> b2 (executing) ✅ (X+2+2 <= X+4)
  • b2 (initiating) -> a2 (executing) ❌ (X+4+1 > X+2)
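The five interpretations above can be checked mechanically. This is a sketch only: `next_expected_time` is modelled as "timestamp plus that chain's block time", X is an arbitrary base value, and the rule labels are made up for readability.

```python
from typing import NamedTuple

X = 1_000  # arbitrary base timestamp (assumption; any value works)
BLOCK_TIME = {"A": 1, "B": 2}  # seconds


class Msg(NamedTuple):
    chain: str
    ts: int


def next_expected_time(ts: int, chain: str) -> int:
    """Modelled as: the given timestamp plus the given chain's block time."""
    return ts + BLOCK_TIME[chain]


# Each rule takes (initiating, executing) and returns whether it is allowed.
RULES = {
    "existing": lambda i, e: i.ts <= e.ts,
    "exec_ts + exec blocktime": lambda i, e: i.ts <= next_expected_time(e.ts, e.chain),
    "init_ts + init blocktime": lambda i, e: next_expected_time(i.ts, i.chain) <= e.ts,
    "exec_ts + init blocktime": lambda i, e: i.ts <= next_expected_time(e.ts, i.chain),
    "init_ts + exec blocktime": lambda i, e: next_expected_time(i.ts, e.chain) <= e.ts,
}

a2, b2 = Msg("A", X + 2), Msg("B", X + 4)

# A round-trip needs both directions to pass.
round_trip = {name: rule(a2, b2) and rule(b2, a2) for name, rule in RULES.items()}
for name, ok in round_trip.items():
    print(f"{name}: a2 <-> b2 round-trip = {ok}")
```

Only the variant that adds the initiating chain's blocktime to the executing timestamp lets a2 and b2 round-trip, matching the lists above.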

clabby avatar Feb 12 '25 17:02 clabby

I think the right invariant to use may be: initiating_msg.emitted_in.parent.timestamp < executing_msg.included_in.timestamp

This basically changes a <= b into a - 1 < b, but in terms of blocks instead of timestamps, to accommodate the different block times.

E.g. if chain A has block time 4, and chain B 1, then 4 blocks of B can all interact with 1 block of A, after passing the point where both chains are sealed at the same time.

a2 and b2 shouldn't be able to round-trip. The blocks a2 and b1 are sealed at the same wallclock time, so those should round-trip instead. The a2 and b2 block numbers being the same is just an unfortunate coincidence.

b2 should still be able to round-trip with chain A. It should do so with the next block after this equal-timestamp roundtrip of b1 and a2. So that would be a3.

With the chain data from my previous comment:

valid:

  • a0 (init) -> b0 (exec) X-1 < X
  • b0 (init) -> a0 (exec) X-2 < X
  • a0 (init) -> b1 (exec) X-1 < X+2
  • a1 (init) -> b1 (exec) X < X+2
  • b1 (init) -> a1 (exec) X < X+1
  • a2 (init) -> b1 (exec) X+1 < X+2
  • b1 (init) -> a2 (exec) X < X+2
  • a3 (init) -> b2 (exec) X+2 < X+4
  • b2 (init) -> a3 (exec) X+2 < X+3

invalid as desired:

  • a1 (init) -> a0 (exec) X !< X
  • b1 (init) -> a0 (exec) X !< X
  • b2 (init) -> a2 (exec) X+2 !< X+2 (i.e. a2 is sealed at the same time as b1; a3 can be used for the round-trip with b2)
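The cases above can be reproduced with a small sketch of the parent-timestamp rule. It assumes block n of a chain has timestamp X + n * blocktime (as in the earlier listing), takes the parent timestamp as "own timestamp minus blocktime", and ignores the Interop Start Timestamp bound. Helper names are hypothetical.

```python
X = 1_000  # arbitrary base timestamp (assumption)
BLOCK_TIME = {"a": 1, "b": 2}  # seconds


def timestamp(block: str) -> int:
    """Timestamp of a block named like 'a3' or 'b1'."""
    chain, number = block[0], int(block[1:])
    return X + number * BLOCK_TIME[chain]


def parent_timestamp(block: str) -> int:
    """Timestamp of the block's parent on the same chain."""
    return timestamp(block) - BLOCK_TIME[block[0]]


def can_execute(init_block: str, exec_block: str) -> bool:
    """Proposed rule: init.parent.timestamp < exec.timestamp."""
    return parent_timestamp(init_block) < timestamp(exec_block)


def can_round_trip(p: str, q: str) -> bool:
    """Both directions must satisfy the rule."""
    return can_execute(p, q) and can_execute(q, p)


for p, q in [("a1", "b1"), ("a2", "b1"), ("a3", "b2"), ("a2", "b2")]:
    print(f"{p} <-> {q}: round-trip = {can_round_trip(p, q)}")
```

In this model a1/b1, a2/b1 and a3/b2 can round-trip, while a2/b2 cannot, because b2's parent (b1) has the same timestamp as a2.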

protolambda avatar Feb 13 '25 13:02 protolambda

If one part of the snapshot is 10 minutes behind, e.g. because an L2 hasn't batch-submitted for a while

This can't happen. A super root must include the highest possible block at or before the super root's timestamp. So if you have a combination of 1 and 2 second block times, there can be at most 1s difference between any of the output roots in the super root and every second super root would have all output roots from the same timestamp.

Importantly, the blocks in the super root are the chain head for fault proofs. So at timestamp X+1, we will have a super root containing (a1, b0). If a1 references anything in b1 it will have to be found to be invalid by the proof system because b1 doesn't exist yet and it may not even have batch data available on L1 yet.
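As a quick illustration of the two points above, here is a sketch of super root composition for the 1s/2s case. It assumes both chains start at timestamp 0 (i.e. X = 0) and models a super root as a mapping from chain to (block number, block timestamp); the names are hypothetical.

```python
BLOCK_TIME = {"a": 1, "b": 2}  # seconds


def super_root(t: int) -> dict:
    """Highest block per chain at or before t, as (number, timestamp)."""
    return {
        chain: (t // bt, (t // bt) * bt)
        for chain, bt in BLOCK_TIME.items()
    }


def skew(root: dict) -> int:
    """Largest timestamp difference between output roots in a super root."""
    stamps = [ts for _, ts in root.values()]
    return max(stamps) - min(stamps)


for t in range(5):
    root = super_root(t)
    print(f"t={t}: {root}, skew={skew(root)}s")
```

At t = 1 the super root holds a1 and b0. Block b1 has timestamp 2, so it is not part of the proof's world view at that point, and the skew never exceeds 1s in this model.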

I struggle to see how it could ever be safe to have executing messages that reference a block with a later timestamp because the world view of proofs is limited by a timestamp and nothing after that proposal timestamp exists.

ajsutton avatar Feb 14 '25 03:02 ajsutton