feat(dfir_rs): track and expose DFIR runtime metrics
Tokio runtime metrics: https://docs.rs/tokio/latest/tokio/runtime/struct.RuntimeMetrics.html
Per-subgraph metrics:
total_run_count - how many times the subgraph has been run
total_poll_duration - amount of time the subgraph is running
total_poll_count - how many times the subgraph is polled
total_idle_duration - amount of time the subgraph is "idle" (not running and not complete)
total_idle_count - how many times the subgraph is idle
Per-handoff metrics:
total_items_count - Total items inserted into this handoff. - ?
current_items_count - Number of items currently in the handoff
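For illustration only, the two handoff counters could be maintained along these lines. This is a minimal sketch, not the code in this PR: `CountedHandoff`, `give`, and `take_all` are hypothetical names, and the `VecDeque` buffer is an assumption about how a handoff stores its items.

```rust
use std::collections::VecDeque;

/// Hypothetical handoff buffer that counts items as they pass through.
/// Not the actual DFIR handoff type; names are illustrative only.
pub struct CountedHandoff<T> {
    items: VecDeque<T>,
    /// Monotonic count of every item ever inserted.
    total_items_count: u64,
}

impl<T> CountedHandoff<T> {
    pub fn new() -> Self {
        Self {
            items: VecDeque::new(),
            total_items_count: 0,
        }
    }

    /// Insert one item and bump the lifetime counter.
    pub fn give(&mut self, item: T) {
        self.items.push_back(item);
        self.total_items_count += 1;
    }

    /// Drain everything currently buffered; the lifetime counter is unchanged.
    pub fn take_all(&mut self) -> Vec<T> {
        self.items.drain(..).collect()
    }

    /// Total items inserted into this handoff since creation.
    pub fn total_items_count(&self) -> u64 {
        self.total_items_count
    }

    /// Number of items currently in the handoff.
    pub fn current_items_count(&self) -> usize {
        self.items.len()
    }
}
```

In this sketch total_items_count only ever grows, while current_items_count is derived from the buffer length, so it needs no separate bookkeeping.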
My comments from tracking issue #2178
Looking into how best to instrument execution of subgraphs.
tokio-metrics provides comprehensive general instrumentation of async tasks, but testing using it in DFIR shows 15% longer runtime on some microbenchmarks. tokio-metrics records more properties than we care about [right now], and uses atomics and Arc to support multiple threads, while we could track things directly in SubgraphData. Based on this, I think it is best to implement custom instrumentation of subgraphs in DFIR, with tokio-metrics as inspiration.
I think we can start with a few important metrics per subgraph execution:
total_poll_duration - amount of time the subgraph is running
total_poll_count - how many times the subgraph is polled
total_idle_duration - amount of time the subgraph is "idle" (not running and not complete)
total_idle_count - how many times the subgraph is idle
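A rough sketch of how these four counters might be tracked with plain (non-atomic) fields, assuming the scheduler polls each subgraph from a single thread. `SubgraphMetrics`, `SubgraphTracker`, `poll_started`, and `poll_finished` are hypothetical names for illustration, not the types added by this PR; timestamps are passed in explicitly so the bookkeeping is easy to test.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-subgraph counters; field names follow the list above.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct SubgraphMetrics {
    pub total_poll_duration: Duration,
    pub total_poll_count: u64,
    pub total_idle_duration: Duration,
    pub total_idle_count: u64,
}

/// Hypothetical tracker owned by the scheduler. Plain fields are enough
/// under the single-thread assumption, so no atomics or `Arc` are used.
#[derive(Debug, Default)]
pub struct SubgraphTracker {
    metrics: SubgraphMetrics,
    poll_started_at: Option<Instant>,
    idle_since: Option<Instant>,
}

impl SubgraphTracker {
    pub fn new() -> Self {
        Self::default()
    }

    /// Call right before polling: closes any open idle interval.
    pub fn poll_started(&mut self, now: Instant) {
        if let Some(idle_start) = self.idle_since.take() {
            self.metrics.total_idle_duration += now.duration_since(idle_start);
        }
        self.poll_started_at = Some(now);
    }

    /// Call right after polling. `completed` is true once the subgraph
    /// has finished; otherwise a new idle interval begins.
    pub fn poll_finished(&mut self, now: Instant, completed: bool) {
        if let Some(start) = self.poll_started_at.take() {
            self.metrics.total_poll_duration += now.duration_since(start);
            self.metrics.total_poll_count += 1;
        }
        if !completed {
            self.idle_since = Some(now);
            self.metrics.total_idle_count += 1;
        }
    }

    pub fn metrics(&self) -> &SubgraphMetrics {
        &self.metrics
    }
}
```

In real use the caller would pass `Instant::now()` around each poll. Time spent idle is attributed only when the next poll begins, so an interval that is still open is not reflected in total_idle_duration.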
A DFIR runtime-level metric could be the number of external events received.