Interop: Block Builder Denial of Service Protection

Open tynes opened this issue 1 year ago • 5 comments

Since the validity of an interop block depends on the invariant that all executing messages have valid initiating messages, and discovering the executing messages requires executing the transactions, this creates a denial of service risk for the sequencer/block builder. The best way to fight denial of service attacks is with horizontal scalability. An approach was proposed in https://github.com/ethereum-optimism/design-docs/pull/7 but it was ultimately decided that we do not want to introduce more microservices into the architecture.

Instead we can take the approach of adding some logic directly into the op-geth mempool. If op-geth is configured with the op-supervisor RPC URL, then the mempool will execute the inbound transaction on the latest state and then check against the supervisor that each executing message has a valid initiating message before adding the transaction to the mempool. This allows a simple cloud architecture where sentry nodes can be deployed in front of the sequencer and provide the denial of service protection.

tynes avatar Jun 19 '24 19:06 tynes

May I work on this issue?

ShubhSensei avatar Jun 20 '24 12:06 ShubhSensei

> May I work on this issue?

Sure - can we work on a small design doc here before starting? It should include specifics around the implementation, like an overview of what the diff would look like. Just to make sure we know the full scope of the problem.

tynes avatar Jun 20 '24 16:06 tynes

Cool, will work on it. Can you please share your Telegram ID for smooth discussion?

ShubhSensei avatar Jun 20 '24 20:06 ShubhSensei

Please follow up with a design doc directly in this thread, we are not at a stage where we can provide individual support

tynes avatar Jun 20 '24 21:06 tynes

This is a duplicate of https://github.com/ethereum-optimism/optimism/issues/10890

tynes avatar Sep 17 '24 01:09 tynes

https://github.com/ethereum-optimism/op-geth/pull/422

Here's an implementation which rejects transactions as they enter the mempool if any "Ingress Filter" denies the tx. The one Ingress Filter I wrote takes the logs from execution and passes them through the Supervisor.

axelKingsley avatar Oct 30 '24 16:10 axelKingsley

Besides checking at ingress, we should re-evaluate a tx that has been in the mempool for a while. The block-building code path will also invalidate a tx, but as a verifier node we need to drop these from the mempool more proactively, so that final block building doesn't get hit with this work.

protolambda avatar Nov 25 '24 15:11 protolambda

While I agree that it would be ideal for the filtering to check frequently, I don't know that prioritizing that additional feature is the right call until we know it's needed. I'd like to come up with a way to determine if this is needed.

There are two ways I can think of that a transaction which previously passed the filter check would change to failing:

  • The chain which holds the initiating event experienced a reorg, so the referenced data is no longer in the expected location.
  • The transaction in question changes which executing message it creates, based on differences in the EVM state since the last time it was checked.

The reorg case seems very unlikely to be an effective reason to check the filter again, as we would expect these to be very rare, and in that case the block builder can handle those transactions.

The changing-tx case is more valuable, but I still think we shouldn't over-optimize the filter until we see that this sort of thing actually happens. The filter works by simulating each transaction on a throw-away state, which is not trivial: as a block's worth of transactions enters the mempool, the filter effectively does the work of evaluating a whole block, resetting the state between every tx. Re-evaluating the transactions already in the mempool would create even more execution pressure.

In summary: I'm not opposed to expanding the filter behavior to include re-evaluating transactions over time, but I want to see that we need it before we add the complexity, considering we have no production data and it would directly multiply the amount of work our Executing Engines do.

axelKingsley avatar Nov 25 '24 16:11 axelKingsley

Closing due to the access list design

tynes avatar Mar 18 '25 02:03 tynes