
Catching panics

Open Stebalien opened this issue 3 years ago • 1 comments

(moving a discussion from a private conversation to somewhere more public)

Libp2p performs quite a bit of complex parsing, which has occasionally led to panics at runtime. When uncaught, these panics crash the entire node.

Proposal: Catch panics at "failure boundaries". E.g.:

  • If we have some form of "connection" worker, catch panics in the worker and kill the entire connection if the worker panics. Same for streams.
  • Catch panics in per-peer stream handlers, cleaning up all state related to the peer.
  • Catch panics in low-level parsing logic. Parsing tends to be pretty self-contained but also pretty error-prone.
  • Stretch: Where possible, catch service-level panics, cancel all current requests, close all resources, and restart. But we do need to be a bit careful to not continue running in a corrupted state.
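To make the first and third bullets concrete, here is a minimal sketch of what a failure boundary could look like in Go. It is not taken from go-libp2p: `conn`, `catchPanic`, `parseFrame`, and `handleConn` are hypothetical names, and the frame format (one length byte followed by a payload) is invented for illustration. The only real APIs used are `recover` and `runtime/debug.Stack`.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// conn is a hypothetical stand-in for a connection; only Close matters here.
type conn struct{ closed bool }

func (c *conn) Close() error {
	c.closed = true
	return nil
}

// catchPanic turns a recovered panic into an error stored in *errp.
// It must be invoked directly by defer, otherwise recover() returns nil.
func catchPanic(where string, errp *error) {
	if r := recover(); r != nil {
		*errp = fmt.Errorf("panic in %s: %v\n%s", where, r, debug.Stack())
	}
}

// parseFrame is a hypothetical parser for an invented format: one length
// byte followed by that many payload bytes. Malformed input makes the
// indexing below panic, and the deferred catchPanic reports it as an error.
func parseFrame(b []byte) (n int, err error) {
	defer catchPanic("parseFrame", &err)
	length := int(b[0]) // panics on empty input
	_ = b[length]       // last payload byte; panics if the frame is truncated
	return length, nil
}

// handleConn is a hypothetical per-connection worker. Any panic inside it
// is contained, and the whole connection is closed whenever it fails.
func handleConn(c *conn, frames [][]byte, handle func(payloadLen int)) (err error) {
	// Deferred calls run last-in first-out: catchPanic sets err first,
	// then this closure sees the error and closes the connection.
	defer func() {
		if err != nil {
			c.Close()
		}
	}()
	defer catchPanic("connection worker", &err)

	for _, f := range frames {
		n, perr := parseFrame(f)
		if perr != nil {
			return perr
		}
		handle(n) // may panic; caught by the worker-level boundary
	}
	return nil
}

func main() {
	c := &conn{}
	// The second frame claims 5 payload bytes but carries only 1.
	err := handleConn(c, [][]byte{{1, 0xff}, {5, 0x01}}, func(int) {})
	fmt.Println("connection closed:", c.closed, "error returned:", err != nil)
}
```

In this sketch the panic never leaves the goroutine that owns the connection, so only that connection is torn down. Note that `recover` only catches panics raised on the same goroutine, so each worker goroutine would need its own deferred guard.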

Stebalien, Apr 11 '22 11:04

  • https://github.com/libp2p/go-libp2p/pull/1376
  • https://github.com/libp2p/go-libp2p-transport-upgrader/pull/107

Stebalien, Apr 11 '22 11:04