
Tracking: Visualize stream graph bottleneck

Open kwannoel opened this issue 6 months ago • 0 comments

Tracking for: https://github.com/risingwavelabs/risingwave/issues/13481

  • [x] Fix the output blocking ratio metrics. https://github.com/risingwavelabs/risingwave/pull/18219
    • https://github.com/risingwavelabs/risingwave/blob/4d0a2010145417f1a2e0766b39e98dbe55c98bbe/src/meta/src/dashboard/prometheus.rs#L150
    • https://github.com/risingwavelabs/risingwave/blob/4d0a2010145417f1a2e0766b39e98dbe55c98bbe/src/stream/src/executor/dispatch.rs#L126-L130
    • Perhaps we can consider the "no metrics" case as though blocking is 100%.
    • We must be careful to consider the case where there's just not much data going through the graph.
  • [ ] https://github.com/risingwavelabs/risingwave/issues/17510 https://github.com/risingwavelabs/risingwave/pull/18272
  • [ ] Support a DDL-level graph with output blocking ratio. This is essentially the same as the fragment graph: the output blocking ratio can be computed from the dispatcher-side metrics between MVs.
  • [ ] Support a DDL-level graph per schema with output blocking ratio. As with the fragment graph, the output blocking ratio can be computed from the dispatcher-side metrics between MVs.
  • [ ] Support throughput in the stream graph. We can track the maximum throughput seen so far and color-code every edge in the graph on a scale relative to that maximum.
  • [ ] Support moving from an MV into its fragment graph.
  • [ ] Verify it works for join amplification.
  • [ ] Verify it works for dirty agg group scenario.
  • [ ] Migrate the cloud dashboard.
  • [ ] Prometheus data source.
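
A rough sketch of two ideas from the list above: treating the "no metrics" case as 100% blocking, and scaling edge colors against the max throughput seen so far. This is not the dashboard's actual code. All names here (`output_blocking_ratio`, `ThroughputScale`, `intensity_to_hex`), the nanosecond inputs, and the green-to-red color ramp are assumptions made for illustration.

```python
from typing import Dict, Optional


def output_blocking_ratio(blocked_ns: Optional[float], window_ns: float) -> float:
    """Fraction of the sampling window that an output spent blocked.

    blocked_ns is None when no metric was reported for the edge.
    Both parameters are hypothetical; the real metric shape may differ.
    """
    if window_ns <= 0:
        raise ValueError("window_ns must be positive")
    if blocked_ns is None:
        # Idea floated in the issue: no metrics counts as 100% blocked.
        return 1.0
    # Clamp so that sampling jitter cannot push the ratio outside [0, 1].
    return min(max(blocked_ns / window_ns, 0.0), 1.0)


class ThroughputScale:
    """Remembers the max throughput seen so far and scales edges against it."""

    def __init__(self) -> None:
        self.max_seen = 0.0

    def intensities(self, edge_throughput: Dict[str, float]) -> Dict[str, float]:
        """Map each edge to a value in [0, 1] relative to the running max."""
        if edge_throughput:
            self.max_seen = max(self.max_seen, max(edge_throughput.values()))
        if self.max_seen == 0.0:
            # Nothing has flowed yet, so every edge gets the lowest intensity.
            return {edge: 0.0 for edge in edge_throughput}
        return {edge: t / self.max_seen for edge, t in edge_throughput.items()}


def intensity_to_hex(intensity: float) -> str:
    """Linear green-to-red ramp (an assumed palette, not the dashboard's)."""
    level = round(255 * min(max(intensity, 0.0), 1.0))
    return f"#{level:02x}{255 - level:02x}00"


if __name__ == "__main__":
    scale = ThroughputScale()
    for edge, value in scale.intensities({"a->b": 50.0, "b->c": 100.0}).items():
        print(edge, intensity_to_hex(value))
```

The sketch does not address the low-traffic caveat noted above: an edge with little data flowing would also report no metrics and be shown as fully blocked, so a real implementation would likely need a separate signal to tell the two cases apart.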

kwannoel avatar Aug 22 '24 06:08 kwannoel