50th, 75th, 95th and 99th percentile timing issue

Open jie-qin opened this issue 8 years ago • 1 comments

The reported 50th, 75th, 95th and 99th percentile timings across ~1k job executions are exactly the same, which can't be right: the actual run times vary a lot.

I searched around and couldn't find any relevant information. Most likely I am doing something wrong. Can someone point me in the right direction or share some insight?

Much appreciated.

jie-qin, Apr 19 '16 16:04

Old issue, but in case anyone looks at this: it's probably due to Dropwizard's MetricRegistry.histogram() method, which is used in the JobMetrics class. By default, a histogram created this way is backed by an exponentially decaying reservoir with a "factor of 0.015, which heavily biases the reservoir to the past 5 minutes of measurements" (see ExponentiallyDecayingReservoir.java). Old measurements therefore carry almost no weight in the reported percentiles. We have a fork of Chronos and we're changing the relevant line to:

registry.register(MetricRegistry.name("jobs", "run", name, jobName), new Histogram(new SlidingWindowReservoir(100)))
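To illustrate why time-decayed weighting could collapse the percentiles, here is a rough, self-contained Java sketch. It is not Dropwizard's implementation. It assumes each sample is weighted by exp(-alpha * age in seconds) with alpha = 0.015, and compares that with equal weights, which is roughly what a window over the last N measurements gives you. The class name `ReservoirSketch`, the helper methods, and the run data (five runs, one hour apart) are all made up for the example.

```java
import java.util.Arrays;

// Illustrative sketch only; this is NOT Dropwizard's code.
class ReservoirSketch {

    // Assumed model: weight = exp(-alpha * age), age = seconds behind the newest sample.
    static double[] decayWeights(double[] timestampsSeconds, double alpha) {
        double newest = Arrays.stream(timestampsSeconds).max().orElse(0.0);
        double[] weights = new double[timestampsSeconds.length];
        for (int i = 0; i < weights.length; i++) {
            weights[i] = Math.exp(-alpha * (newest - timestampsSeconds[i]));
        }
        return weights;
    }

    // Equal weights, as in a plain window over the last N samples.
    static double[] uniformWeights(int n) {
        double[] weights = new double[n];
        Arrays.fill(weights, 1.0);
        return weights;
    }

    // Returns the smallest value whose cumulative normalized weight reaches q.
    // Expects a non-empty values array with one weight per value.
    static double weightedQuantile(double[] values, double[] weights, double q) {
        Integer[] order = new Integer[values.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort sample indices by sample value, ascending.
        Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));
        double total = Arrays.stream(weights).sum();
        double cumulative = 0.0;
        for (int idx : order) {
            cumulative += weights[idx] / total;
            if (cumulative >= q) {
                return values[idx];
            }
        }
        return values[order[order.length - 1]]; // guard against rounding error
    }

    public static void main(String[] args) {
        // Hypothetical data: five runs, one hour apart, durations in seconds.
        double[] timestamps = {0, 3600, 7200, 10800, 14400};
        double[] durations = {120, 300, 45, 800, 210};
        double[] decayed = decayWeights(timestamps, 0.015);
        double[] uniform = uniformWeights(durations.length);
        for (double q : new double[] {0.50, 0.75, 0.95, 0.99}) {
            System.out.printf("q=%.2f decayed=%.0f uniform=%.0f%n",
                    q,
                    weightedQuantile(durations, decayed, q),
                    weightedQuantile(durations, uniform, q));
        }
    }
}
```

With this made-up data, the decayed column reports the most recent duration (210) for every quantile, because a sample one hour older is down-weighted by exp(-54). The equal-weight column spreads out to 210, 300, 800 and 800. If your jobs run far less often than every few minutes, something like this may be what you are seeing.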

mwilbz, Mar 14 '18 22:03