Better recovery from `preconfBlock` timeout

Open linoscope opened this issue 7 months ago • 2 comments

Right now, when we get `Failed to call driver RPC for API 'preconfBlocks' within the duration (1000ms)`, we remove the block whose `preconfBlocks` call failed, like this:

        match self
            .taiko
            .advance_head_to_new_l2_block(
                l2_block,
                anchor_block_id,
                l2_slot_info,
                end_of_sequencing,
            )
            .await
        {
            Ok(preconfed_block) => Ok(preconfed_block),
            Err(err) => {
                error!("Failed to advance head to new L2 block: {}", err);
                self.remove_last_l2_block().await?;
                Ok(None)
            }
        }

However, this sometimes leaves the node stuck: the `preconfBlock` request may actually have been applied in Taiko Geth even though the call timed out, while the block has been removed from the sidecar. The sidecar then ends up with `highestUnsafeL2PayloadBlockID: 407`, which differs from Taiko Geth Height: 408.

We should handle recovery from such timeouts in a better way.
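
One possible direction, as a rough sketch only: after a timeout, check the Taiko Geth height before deciding whether to drop the block, instead of always calling `remove_last_l2_block`. Everything named below (`TimeoutRecovery`, `decide_timeout_recovery`, and the idea that the caller can fetch the current geth height) is hypothetical and not part of the existing code. It also assumes that comparing heights is enough; a real fix would probably need to compare block hashes as well, and would have to account for the timed-out request still landing after the height was read.

```rust
/// Hypothetical: what the sidecar should do with a block whose
/// `preconfBlocks` call timed out.
#[derive(Debug, PartialEq, Eq)]
pub enum TimeoutRecovery {
    /// Geth already has the block, so the sidecar should keep it.
    KeepBlock,
    /// Geth is still at the parent, so removing the block is consistent.
    RemoveBlock,
    /// Heights disagree in some other way; a full resync is needed.
    Resync { geth_height: u64 },
}

/// Hypothetical helper: decide how to recover from a timeout.
///
/// `attempted_block_id` is the ID of the L2 block we tried to preconfirm.
/// `geth_height` is the height reported by Taiko Geth after the timeout
/// (how it is fetched is left to the caller and is an assumption here).
pub fn decide_timeout_recovery(attempted_block_id: u64, geth_height: u64) -> TimeoutRecovery {
    if geth_height == attempted_block_id {
        // The call timed out on our side but was applied in geth.
        TimeoutRecovery::KeepBlock
    } else if attempted_block_id.checked_sub(1) == Some(geth_height) {
        // Geth never advanced; dropping the block keeps both sides in sync.
        TimeoutRecovery::RemoveBlock
    } else {
        // Anything else means the two views have diverged.
        TimeoutRecovery::Resync { geth_height }
    }
}
```

With the numbers from this issue, the attempted block was 408 and geth reports height 408, so this sketch would return `KeepBlock` rather than removing the block and leaving the sidecar at 407.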

May 26 '25 10:05 linoscope