Connect the dots between Sui and Narwhal for each message

Open lxfind opened this issue 2 years ago • 4 comments

For every message (either a shared object transaction or a checkpoint fragment) that Sui sends to Narwhal (NW) and that is sent back once sequenced, we should find a way to log it so that we can track its lifecycle through the entire process: i.e. for each message, we should know when it is sent to NW, when it is processed/sequenced, when it is sent to each validator, and when each validator receives it.
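As a rough illustration of the kind of per-message record this asks for, here is a minimal std-only sketch. Everything in it is hypothetical and not taken from the Sui or Narwhal codebases: the `Stage` variants simply mirror the four steps listed above, `MessageId` stands in for whatever digest identifies a message, and `LifecycleLog` is an in-memory stand-in for real log output.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical message identifier (for example, a transaction digest).
type MessageId = u64;

/// Hypothetical lifecycle stages, mirroring the steps listed in the issue.
/// The `usize` is an assumed validator index.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    SentToNarwhal,
    Sequenced,
    SentToValidator(usize),
    ReceivedByValidator(usize),
}

/// In-memory stand-in for a log sink: every event is keyed by message ID,
/// so all stages of one message can be looked up together.
#[derive(Default)]
struct LifecycleLog {
    events: HashMap<MessageId, Vec<(Stage, Instant)>>,
}

impl LifecycleLog {
    /// Record that message `id` reached `stage` now.
    fn record(&mut self, id: MessageId, stage: Stage) {
        self.events.entry(id).or_default().push((stage, Instant::now()));
    }

    /// Stages seen for `id`, in the order they were recorded.
    fn stages(&self, id: MessageId) -> Vec<Stage> {
        self.events
            .get(&id)
            .map(|evts| evts.iter().map(|(stage, _)| *stage).collect())
            .unwrap_or_default()
    }

    /// Time of the first occurrence of `stage` for `id`, if any.
    fn first_seen(&self, id: MessageId, stage: Stage) -> Option<Instant> {
        self.events
            .get(&id)?
            .iter()
            .find(|(s, _)| *s == stage)
            .map(|(_, t)| *t)
    }

    /// Elapsed time between two stages of the same message, if both were recorded.
    fn elapsed_between(&self, id: MessageId, from: Stage, to: Stage) -> Option<Duration> {
        let start = self.first_seen(id, from)?;
        let end = self.first_seen(id, to)?;
        end.checked_duration_since(start)
    }
}
```

In a real deployment the events would be emitted by different processes, so the shared key (the message ID) is the part that matters; the timestamps would have to come from each node's own log lines.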

lxfind avatar Aug 16 '22 15:08 lxfind

This sounds a lot like a tracing span, FWIW.

huitseeker avatar Aug 16 '22 22:08 huitseeker

I don't know if that works across different stack traces (e.g. NW processes messages from a pool instead of through a direct function call). How do you keep that same trace alive in the message?

lxfind avatar Aug 16 '22 23:08 lxfind

The use case is indeed what "tracing" solutions address, but with Rust tracing I didn't find a way to keep a span on the heap without a guard, or to connect spans via IDs. I have seen these tracing features used to trace across work pools or batching logic. I'm interested in hearing what strategy we choose here in the short term.

mwtian avatar Aug 17 '22 16:08 mwtian

This is what I ended up doing: https://github.com/MystenLabs/sui/pull/4427 https://github.com/MystenLabs/sui/pull/4442 https://github.com/MystenLabs/narwhal/pull/882

I couldn't find a way to use a tracing span for this. In particular, when a Sui transaction is sent to Narwhal, the call returns immediately and only adds the transaction to the batch. Stack-trace-based tracing doesn't seem workable in this case. Does tracing support some kind of dynamic tracing?
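To make the batching problem concrete, here is a small std-only sketch of the general idea of carrying an identifier inside the message instead of relying on the call stack. It is not the code from the linked PRs; `Tx`, `Event`, `run_pipeline`, and `flush` are hypothetical names, and a channel plus a worker thread stands in for the batching layer.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical log event: the transaction ID ties the stages together.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Event {
    tx_id: u64,
    stage: &'static str,
}

/// Hypothetical transaction; the ID travels with it across the queue.
struct Tx {
    id: u64,
}

/// Log every transaction in the batch as sequenced, then empty the batch.
fn flush(batch: &mut Vec<Tx>, log: &mpsc::Sender<Event>) {
    for tx in batch.drain(..) {
        log.send(Event { tx_id: tx.id, stage: "sequenced" }).unwrap();
    }
}

/// Submit each ID to a batching worker and return all log events.
fn run_pipeline(ids: &[u64], batch_size: usize) -> Vec<Event> {
    let (tx_sender, tx_receiver) = mpsc::channel::<Tx>();
    let (log_sender, log_receiver) = mpsc::channel::<Event>();

    // The worker has no call-stack link to the submitter; it only sees
    // what arrives on the queue, including the ID carried by each Tx.
    let worker_log = log_sender.clone();
    let worker = thread::spawn(move || {
        let mut batch: Vec<Tx> = Vec::new();
        for tx in tx_receiver {
            batch.push(tx);
            if batch.len() == batch_size {
                flush(&mut batch, &worker_log);
            }
        }
        // Flush whatever is left once the submitter hangs up.
        flush(&mut batch, &worker_log);
    });

    for &id in ids {
        // Submission returns immediately; the event is logged before the hand-off.
        log_sender.send(Event { tx_id: id, stage: "submitted" }).unwrap();
        tx_sender.send(Tx { id }).unwrap();
    }
    drop(tx_sender); // closes the queue so the worker loop ends
    worker.join().unwrap();
    drop(log_sender); // last sender gone, so the iterator below terminates
    log_receiver.iter().collect()
}
```

Filtering the returned events by `tx_id` then reconstructs one transaction's path even though the submitter and the batching worker never share a stack frame.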

lxfind avatar Sep 06 '22 16:09 lxfind