Tracking: Visualize stream graph bottleneck
Tracking for: https://github.com/risingwavelabs/risingwave/issues/13481
- [x] Fix the output blocking ratio metrics. https://github.com/risingwavelabs/risingwave/pull/18219
- https://github.com/risingwavelabs/risingwave/blob/4d0a2010145417f1a2e0766b39e98dbe55c98bbe/src/meta/src/dashboard/prometheus.rs#L150
- https://github.com/risingwavelabs/risingwave/blob/4d0a2010145417f1a2e0766b39e98dbe55c98bbe/src/stream/src/executor/dispatch.rs#L126-L130
- Perhaps we can treat the "no metrics" case as though blocking is 100%.
- Caveat: metrics may also be missing simply because not much data is going through the graph, so we must be careful not to report that case as fully blocked.
- [ ] https://github.com/risingwavelabs/risingwave/issues/17510 https://github.com/risingwavelabs/risingwave/pull/18272
- [ ] Support a DDL-level graph with output blocking ratio. This is basically the same as the fragment graph: we can calculate the output blocking ratio by looking only at the dispatcher-side metrics between MVs.
- [ ] Support a DDL-level graph per schema, with output blocking ratio. Again basically the same as the fragment graph, using the dispatcher-side metrics between MVs.
- [ ] Support throughput in the stream graph. We can track the max throughput seen so far and use it as the upper bound of the scale, then color-code every edge in the graph relative to that max.
- [ ] Support moving from an MV into its fragment graph.
- [ ] Verify it works for join amplification.
- [ ] Verify it works for the dirty agg group scenario.
- [ ] Migrate the cloud dashboard.
- [ ] Prometheus data source.
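
A rough sketch of two of the calculations mentioned in the list above: the "no metrics" fallback for the output blocking ratio, and the edge coloring scaled by the max throughput seen so far. This is only an illustration under stated assumptions, not the dashboard's actual code. The names `effective_blocking_ratio`, `ThroughputScale`, and the `upstream_active` flag are hypothetical; in particular, using an "upstream had data to send" signal to tell a blocked edge apart from an idle one is an assumption of this sketch, since the issue does not say how the two cases should be distinguished. The green-to-red color ramp is likewise an arbitrary choice.

```python
from typing import Optional


def effective_blocking_ratio(
    blocked_seconds: Optional[float],
    window_seconds: float,
    upstream_active: bool,
) -> float:
    """Return the output blocking ratio of one edge, clamped to [0, 1].

    blocked_seconds is None when no metric was reported for the window.
    upstream_active is a hypothetical signal (an assumption of this sketch)
    saying whether the upstream side had data to send during the window.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    if blocked_seconds is None:
        # No metric: count as 100% blocked only if upstream had data to send;
        # otherwise the edge is simply idle and should not look like a bottleneck.
        return 1.0 if upstream_active else 0.0
    return max(0.0, min(blocked_seconds / window_seconds, 1.0))


class ThroughputScale:
    """Color edges relative to the max throughput seen so far."""

    def __init__(self) -> None:
        self.max_seen = 0.0

    def observe(self, throughput: float) -> None:
        # The running max becomes the upper bound of the color scale.
        self.max_seen = max(self.max_seen, throughput)

    def intensity(self, throughput: float) -> float:
        # Fraction of the max, clamped to [0, 1]; 0.0 before any data is seen.
        if self.max_seen <= 0:
            return 0.0
        return max(0.0, min(throughput / self.max_seen, 1.0))

    def color(self, throughput: float) -> str:
        # Linear ramp from green (no throughput) to red (max throughput).
        t = self.intensity(throughput)
        red = int(255 * t)
        green = int(255 * (1.0 - t))
        return f"#{red:02x}{green:02x}00"
```

For example, after observing throughputs of 100, 400, and 200 rows/s, an edge carrying 400 rows/s is colored `#ff0000` and an idle edge `#00ff00`. Because the max only ever grows, colors stay comparable across the whole graph, at the cost of one early spike compressing the rest of the scale.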