chronos
50th, 75th, 95th and 99th percentile timing issue
There's no way the 50th, 75th, 95th and 99th percentile timings of ~1k job executions are exactly the same. The actual run times vary a lot.
I searched around and couldn't find any relevant information. Most likely I am doing something wrong. Can someone point me in the right direction or share some insights?
Much appreciated.
Old issue, but if anyone looks at this: it's probably due to Dropwizard's MetricRegistry.histogram() method used in the JobMetrics class. By default, a Dropwizard Metrics histogram uses an exponentially decaying reservoir with a "factor of 0.015, which heavily biases the reservoir to the past 5 minutes of measurements." (See ExponentiallyDecayingReservoir.java.) We have a fork of Chronos and we're changing the relevant line to:
registry.register(MetricRegistry.name("jobs", "run", name, jobName), new Histogram(new SlidingWindowReservoir(100)))
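For anyone wondering what the sliding window changes, here is a rough standard-library sketch of the idea: keep only the last N measurements and compute nearest-rank percentiles over them, with every retained sample weighted equally no matter when it was recorded. This is not Dropwizard's implementation. The class name SlidingWindowSketch and its methods are made up for illustration, and the nearest-rank rule is an assumption (Dropwizard's snapshot may interpolate differently).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical illustration only; not part of Chronos or Dropwizard Metrics.
public class SlidingWindowSketch {
    private final int capacity;
    private final Deque<Long> window = new ArrayDeque<>();

    public SlidingWindowSketch(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    // Record one measurement, evicting the oldest once the window is full.
    public void update(long value) {
        if (window.size() == capacity) {
            window.removeFirst();
        }
        window.addLast(value);
    }

    // Nearest-rank percentile over the retained samples; pct is 1..100.
    public long percentile(int pct) {
        if (pct < 1 || pct > 100) {
            throw new IllegalArgumentException("pct must be in 1..100");
        }
        if (window.isEmpty()) {
            throw new IllegalStateException("no samples recorded");
        }
        long[] sorted = window.stream().mapToLong(Long::longValue).toArray();
        Arrays.sort(sorted);
        // Integer ceiling of pct * n / 100, so no floating-point rounding.
        int rank = (pct * sorted.length + 99) / 100;
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        SlidingWindowSketch runTimes = new SlidingWindowSketch(100);
        // Simulate ~1k job runs with durations 1..1000 ms.
        for (long ms = 1; ms <= 1000; ms++) {
            runTimes.update(ms);
        }
        for (int pct : new int[] {50, 75, 95, 99}) {
            System.out.println("p" + pct + " = " + runTimes.percentile(pct) + " ms");
        }
    }
}
```

With a window of 100, only the 100 most recent runs feed the percentiles, so the numbers describe recent runs by count rather than by wall-clock age. The trade-off is that anything older than the window is dropped entirely.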