optimism icon indicating copy to clipboard operation
optimism copied to clipboard

RunNextSequencerAction considers everything an unclassified error

Open axelKingsley opened this issue 1 year ago • 2 comments

While working on our recent Mainnet Unsafe Head Stall debug, I noticed that I could not find any logs for "sequencer failed temporarily" nor "requiring derivation reset"

https://optimistic.grafana.net/goto/X5rbj92IR?orgId=1 https://optimistic.grafana.net/goto/_G5JCr2Sg?orgId=1

I believe it's because the emitted error looks like:

t=2024-02-15T04:41:17+0000 lvl=eror msg="sequencer failed to seal block with unclassified error"
err="failed to complete building block: error (1): 
failed to complete building on top of L2 chain 0x7fd0c053bf2ee546ce05f9bd1bb7fb30dbcab8707009aa34fa598cdb7b6fed9b:116186647, id: 0xc4582b2f37b293ef, error (1): 
failed to insert execution payload: 
failed to execute payload: 
context deadline exceeded"

(Newlines added for clarity)

and the TemporaryErr status is encoded on line two as error (1), which is due to

return nil, errTyp, fmt.Errorf("failed to complete building on top of L2 chain %s, id: %s, error (%d): %w", e.buildingOnto, e.buildingInfo.ID, errTyp, err)

https://github.com/ethereum-optimism/optimism/blob/08921d5c1df39c329f9ab82447ea939bc61a8784/op-node/rollup/derive/engine_controller.go#L196

And the error check looks like

} else if errors.Is(err, derive.ErrTemporary) {

https://github.com/ethereum-optimism/optimism/blob/7ec23bb8583c1042aa88046244b83b85d7de5738/op-node/rollup/driver/sequencer.go#L227

I'm not an expert on errors.Is, but I don't expect it'll figure out that (1) means Temporary error when it's so deeply embedded in all these other errors.

axelKingsley avatar Feb 16 '24 23:02 axelKingsley

errors.Is should walk the entire chain of errors to find any that match. So long as the parent error is always added with %w (not %v) it should work fine.

ajsutton avatar Feb 18 '24 21:02 ajsutton

@axelKingsley do we plan to pick this one up in the near future or we can close it as a s stale issue for now?

BlocksOnAChain avatar Jun 21 '24 09:06 BlocksOnAChain

This was refactored a bunch during the events changes in May/June last year. This issue seems stale now. Closing.

protolambda avatar Feb 04 '25 13:02 protolambda