
Add metrics to track time compaction jobs are queued

Open cshannon opened this issue 4 months ago • 2 comments

This adds two new groups of stats to track information about queued compaction jobs. The first is a timer that records when jobs are polled and gives information on how often and how quickly jobs exit the queue. The second is a min/max/avg group that tracks how long jobs have been waiting on the queue.
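To make the two groups concrete, here is a minimal, self-contained sketch of the idea. It is not the code in this PR: `TimedJobQueue`, `queuedAges()`, and `pollStats()` are hypothetical names, plain `LongSummaryStatistics` stands in for the real metrics registry, and the clock is injected so that values can be checked deterministically in a test.

```java
import java.util.ArrayDeque;
import java.util.LongSummaryStatistics;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a queue that tracks poll timing and queued-job age. */
class TimedJobQueue<T> {

  /** Pairs a job with the clock reading taken when it was enqueued. */
  private static final class Entry<T> {
    final T job;
    final long enqueuedNanos;

    Entry(T job, long enqueuedNanos) {
      this.job = job;
      this.enqueuedNanos = enqueuedNanos;
    }
  }

  private final ArrayDeque<Entry<T>> queue = new ArrayDeque<>();
  private final LongSupplier clock; // e.g. System::nanoTime in production
  private final LongSummaryStatistics pollWaitNanos = new LongSummaryStatistics();

  TimedJobQueue(LongSupplier clock) {
    this.clock = clock;
  }

  synchronized void add(T job) {
    queue.add(new Entry<>(job, clock.getAsLong()));
  }

  /** Removes the oldest job and records how long it waited (the "timer" group). */
  synchronized T poll() {
    Entry<T> e = queue.poll();
    if (e == null) {
      return null; // nothing queued, nothing recorded
    }
    pollWaitNanos.accept(clock.getAsLong() - e.enqueuedNanos);
    return e.job;
  }

  /** Min/max/avg age of the jobs still waiting (the "age" group). */
  synchronized LongSummaryStatistics queuedAges() {
    long now = clock.getAsLong();
    LongSummaryStatistics ages = new LongSummaryStatistics();
    for (Entry<T> e : queue) {
      ages.accept(now - e.enqueuedNanos);
    }
    return ages;
  }

  /** Snapshot of the wait times recorded by poll(). */
  synchronized LongSummaryStatistics pollStats() {
    LongSummaryStatistics copy = new LongSummaryStatistics();
    copy.combine(pollWaitNanos);
    return copy;
  }
}
```

Passing `System::nanoTime` as the clock gives real timings. In this sketch a job that is polled immediately after being added is recorded with a wait close to 0.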

This closes #4945

This is a draft for now because there are a couple of outstanding things to do and it probably still needs a bit of polishing, but I wanted to post what I had to get feedback. I think the tests could also be improved. I was trying to think of a good way to check that the stats are correct, but the values are non-deterministic because everything is timing based, so it would be hard to check exact values.

Todo/questions:

  1. Should we record a time of 0 if there's a job immediately available in both poll()?
  2. How should we handle async? I'm not sure the metrics even apply to async, since we are trying to measure latency by tracking how long jobs wait on a queue before they are polled. With async, the latency is in theory near 0: compactors that are ready don't need to poll and would just be waiting for the future to complete. Maybe we just record a time of 0 for async when a future is completed?
  3. What do we do with the timer on clear()? There is a TODO in the code.

cshannon avatar Oct 13 '24 12:10 cshannon