[ASoC 2022] Metrics visualization and health scoring model for job

Open hoaresky opened this issue 3 years ago • 0 comments

Background

Currently, the KubeDL dashboard displays basic information such as jobs, logs, and events, and users can manipulate objects through built-in buttons. However, the dashboard could help users gain more insight by visualizing core metrics such as resource utilization and I/O tracing. System metrics are usually collected and exposed in the Prometheus format, which makes Prometheus a good entry point.
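As a rough illustration of what the Prometheus entry point could look like, the sketch below builds a range query against the standard Prometheus HTTP API (`/api/v1/query_range`) and flattens a `matrix` response into per-pod time series that a chart could consume. The server address, the PromQL expression, the `pod` label, and the helper names are assumptions made for this example; none of this is existing KubeDL code.

```python
from urllib.parse import urlencode


def build_range_query_url(base_url, promql, start, end, step="30s"):
    """Build a URL for the Prometheus HTTP API range-query endpoint.

    `start` and `end` are Unix timestamps; `step` is the query resolution.
    """
    params = urlencode({"query": promql, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


def parse_matrix(payload, label="pod"):
    """Turn a decoded `matrix` response into {label value: [(ts, value), ...]}.

    Assumes each series carries the given label (hypothetical choice: "pod").
    """
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    data = payload["data"]
    if data["resultType"] != "matrix":
        raise ValueError("expected a matrix result from a range query")
    series = {}
    for item in data["result"]:
        name = item["metric"].get(label, "<unlabeled>")
        # Prometheus encodes sample values as strings; convert them to floats.
        series[name] = [(float(ts), float(value)) for ts, value in item["values"]]
    return series


# Hypothetical query: per-pod CPU usage rate for pods of one job.
EXAMPLE_QUERY = (
    'sum(rate(container_cpu_usage_seconds_total{pod=~"my-job-.*"}[5m])) by (pod)'
)

# Hand-written sample payload in the shape the API returns, for illustration only.
SAMPLE_RESPONSE = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {
                "metric": {"pod": "my-job-worker-0"},
                "values": [[1653890700, "0.25"], [1653890730, "0.5"]],
            }
        ],
    },
}

if __name__ == "__main__":
    print(build_range_query_url("http://prometheus:9090", EXAMPLE_QUERY, 1653890700, 1653890760))
    print(parse_matrix(SAMPLE_RESPONSE))
```

In a real dashboard backend the URL would be fetched with an HTTP client and the decoded JSON passed to `parse_matrix`; the sample payload stands in for that call so the sketch runs without a Prometheus server.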

Goals to be achieved

  1. Implement data/metrics visualization on top of Prometheus.
  2. Based on job information and collected metrics, design a job health model that quantifies how healthy a job is at runtime.
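To make the second goal more concrete, here is one possible shape for a health model: a weighted sum of normalized signals mapped to a 0-100 score. The chosen signals (GPU utilization, restart count, failed replicas), the weights, and the `max_restarts` cap are all hypothetical placeholders for discussion, not a proposed final design.

```python
from dataclasses import dataclass


@dataclass
class JobSignals:
    """Hypothetical inputs to a job health model."""
    gpu_utilization: float  # average utilization in [0, 1]
    restarts: int           # total container restarts across replicas
    failed_replicas: int    # replicas currently in a failed state
    total_replicas: int     # replicas requested by the job spec


def health_score(signals, weights=(0.4, 0.3, 0.3), max_restarts=5):
    """Return a health score in [0, 100]; higher means healthier.

    Each signal is normalized to [0, 1] and combined with the given weights
    (utilization, restarts, replicas). The defaults are illustrative only.
    """
    utilization = min(max(signals.gpu_utilization, 0.0), 1.0)
    # Restarts at or above the cap contribute a score of zero.
    restart_score = 1.0 - min(signals.restarts, max_restarts) / max_restarts
    if signals.total_replicas > 0:
        replica_score = 1.0 - signals.failed_replicas / signals.total_replicas
    else:
        replica_score = 0.0
    w_util, w_restart, w_replica = weights
    weighted = (
        w_util * utilization
        + w_restart * restart_score
        + w_replica * replica_score
    )
    return round(100.0 * weighted / sum(weights), 1)


if __name__ == "__main__":
    healthy = JobSignals(gpu_utilization=0.9, restarts=0, failed_replicas=0, total_replicas=4)
    degraded = JobSignals(gpu_utilization=0.5, restarts=5, failed_replicas=2, total_replicas=4)
    print(health_score(healthy), health_score(degraded))
```

A linear weighted sum is only one option; it is easy to explain on a dashboard, but a real model would likely need per-framework signals and thresholds tuned against actual job data.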

Additional context

This issue is part of the work tracked in https://github.com/kubedl-io/kubedl/issues/249.

Difficulty: Normal

Mentor: Xuelin Hong (@hoaresky)

hoaresky, May 30 '22 06:05