
Max summary statistics for timers have a longer memory than the other statistics

Open sambishop opened this issue 4 years ago • 3 comments

I believe that I have found a bug when using the DynatraceMeterRegistry class, though it looks like it would apply to every subclass of StepMeterRegistry.

Steps to reproduce the bug:

  1. Create a DynatraceMeterRegistry instance, with the default one-minute reporting rate.
  2. Use the registry to create a timer.
  3. Use the timer to record a value.
  4. Let four minutes pass by without recording any other values. (Or fake it using a mock Clock instance.)

The initial statistics that are published will be consistent with what was recorded and with each other ("count" == 1, and "avg" and "max" both equal to the recorded value). After the first minute, "count" and "avg" will drop to zero, but "max" will still report the recorded value for two more minutes.

To work around this I am doing the following after creating the registry:

registry.config().meterFilter(new MeterFilter() {
    @Override
    public DistributionStatisticConfig configure(Meter.Id id, DistributionStatisticConfig config) {
        return id.getType() != Meter.Type.TIMER
                ? config
                : DistributionStatisticConfig.builder().bufferLength(1).build().merge(config);
    }
});

What this does is override the size of the ring buffer used by TimeWindowMax instances to be one instead of the default, which is three. (As determined by defaults set in the DistributionStatisticConfig class.)
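
To illustrate why the buffer length matters, here is a minimal sketch of a ring-buffered max next to a step-style count. It is a simplified model written for this issue, not Micrometer's actual TimeWindowMax code: the names MaxWindowSketch, RingBufferMax, and simulate are hypothetical, and it assumes one rotation per publish, with each recorded value written into every bucket.

```java
import java.util.ArrayList;
import java.util.List;

class MaxWindowSketch {

    /** Simplified stand-in for a time-windowed max (assumption: not the library's code). */
    static final class RingBufferMax {
        private final long[] buckets;
        private int current = 0;

        RingBufferMax(int bufferLength) {
            this.buckets = new long[bufferLength];
        }

        /** A recorded value is folded into every bucket, not only the current one. */
        void record(long value) {
            for (int i = 0; i < buckets.length; i++) {
                buckets[i] = Math.max(buckets[i], value);
            }
        }

        /** The reported max is whatever the current bucket holds. */
        long poll() {
            return buckets[current];
        }

        /** Rotation clears only the current bucket, then moves to the next one. */
        void rotate() {
            buckets[current] = 0;
            current = (current + 1) % buckets.length;
        }
    }

    /** Records one value during the first step, then publishes once per step. */
    public static List<String> simulate(int bufferLength, int steps, long value) {
        RingBufferMax max = new RingBufferMax(bufferLength);
        long count = 0;
        List<String> published = new ArrayList<>();
        for (int step = 1; step <= steps; step++) {
            if (step == 1) {
                max.record(value);
                count++;
            }
            published.add("count=" + count + " max=" + max.poll());
            count = 0;     // step-style statistic: reset after every publish
            max.rotate();  // ring-buffered statistic: only one bucket is cleared
        }
        return published;
    }

    public static void main(String[] args) {
        System.out.println("bufferLength=3: " + simulate(3, 4, 250));
        System.out.println("bufferLength=1: " + simulate(1, 4, 250));
    }
}
```

In this model, a buffer length of three keeps reporting the old max for two publishes after the count has dropped to zero, while a buffer length of one clears the max together with the count.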

My workaround is only lightly tested, but the only issue I have seen so far is that the summary statistics can still occasionally be inconsistent with each other. I don't see that the Micrometer code makes any attempt at updating and resetting the summary statistics atomically, which I think explains what I am seeing.

sambishop • Mar 31 '21 20:03