Golden Signals for Numaflow Pipeline
How do we quantify the reliability, availability, and performance of a Numaflow pipeline? Pipelines differ in their business objectives, so which core metrics are common to all of them and must be measured, monitored, and alerted on by Numaflow as a platform?
We need to capture these metrics for 2 personas:
- [ ] Numaflow health for platform components
- [ ] Controller
- [ ] ISB
- [ ] numa container
- [x] #1822
- [ ] backpressure
- [ ] UDF container health