
sampling chunk fetch time (FOR TEST, DO NOT MERGE)

Open nugaon opened this issue 5 months ago • 0 comments

This PR adds Prometheus metrics to monitor worker wait times during chunk sampling in the ReserveSample function. Each worker goroutine tracks the time between processing chunks and computes wait-time statistics, which are reported via a new SamplingWorkerStats gauge. To avoid Prometheus cardinality explosion, the implementation reports only summary statistics when a worker terminates, rather than emitting per-observation metrics. These metrics help identify bottlenecks in the sampling pipeline.

Checklist

  • [ ] I have read the coding guide.
  • [ ] My change requires a documentation update, and I have done it.
  • [ ] I have added tests to cover my changes.
  • [ ] I have filled out the description and linked the related issues.

Description

Open API Spec Version Changes (if applicable)

Motivation and Context (Optional)

Related Issue (Optional)

Screenshots (if appropriate):

nugaon, Jul 27 '25 10:07