micrometer icon indicating copy to clipboard operation
micrometer copied to clipboard

Spring Boot Actuator prometheus metrics seems to work partially

Open SimSonic opened this issue 3 years ago • 3 comments

Describe the bug

In a Spring Boot (2.7.4 and several previous versions like 2.6.x) with actuator endpoint we have a problem.

Several classes with @Scheduled methods are also marked with @Timed annotation. Some of them are long jobs and so have longTask = true attribute. But working only half of them — sometimes long tasks, sometimes simple. Randomly, but never together.

Here you can see 2 restarts with switching.

image

Environment Using latest Spring Boot with Cloud 2021.0.4. All dependency versions are from BOM.

  • OS: Primary Linux (alpine), but manually seeing /actuator/prometheus endpoint on a local dev Win11 machine.
  • Java version: 17+.

Expected behavior Seeing all metrics together all the time.

SimSonic avatar Oct 05 '22 11:10 SimSonic

LongTaskTimer is a special Meter. It measures the running task only when it is active (see the docs). Once the task finishes, the LongTaskTimer resets.

To say it in a different way, a Timer looks into the past (what happened) because the recording happens after the task finishes. The LongTaskTimer on the other hand looks in the present (what is happening right now) since its recording is continuous and it disappears after the task finishes.

I'm not 100% sure I understand the problem and I don't really know what should I look for on the screenshot but if:

  • You want to look both into the past and the present, use both Timer and LongTaskTimer
  • If your measurements seem to be off, please make sure that your time series are unique and you don't record using the same timer (same name and set of tags) from two different tasks (i.e.: it's usually a good idea to name your LongTaskTimers the same way as Timer but with an ".active" postfix)

jonatan-ivanov avatar Oct 05 '22 18:10 jonatan-ivanov

Thank you.

I'm not 100% sure I understand the problem and I don't really know what should I look for on the screenshot

Once you have in you code such schedulers:

@Scheduled(cron = "${statistics.collect.cron:0 0 */1 * * *}") // Once in an hour
@Timed
public void someMetricOne() {
    // Short living code
}

@Scheduled(initialDelay = 1800_000L, fixedDelay = 1800_000L) // Every 30 minutes
@Timed(longTask = true)
public void someMetricTwo() {
    // Long time running job
}

you will have metric only for one of this two timers, depending on a first job which runs. Restarting application can switch timers.

If your measurements seem to be off, please make sure that your time series are unique and you don't record using the same timer.

I think this is a root cause of my problem. I have a bunch of @Timed annotations in code but with the same value attribute for name (default method.timed). I'll try to change @Timed(value = "method.timed.longtask", longTask = true) for every appearance.

If this is expected behavior, I'm ok,

SimSonic avatar Oct 06 '22 02:10 SimSonic

If this is expected behavior, I'm ok,

Also in this situation I think we have to emphasize this in documentation (like note near long task timer: if you have both annotation with and without longTask = true be careful about giving them diffirent names).

SimSonic avatar Oct 07 '22 07:10 SimSonic

@SimSonic are you using the TimedAspect to instrument those methods with the @Timed annotation? I don't think Spring has out-of-the-box support for @Timed on @Scheduled methods yet.

shakuzen avatar Feb 15 '23 11:02 shakuzen

@shakuzen The problem already solved. Yes, I'm using TimedAspect and @Timed over scheduled methods. The core of the problem was that it is not possible to have some schedulers with @Timed and some of them with @Timed(longTask = true). Only one half of them are working after restart — based on what type of scheduled job runs first, long or not.

I'm keeping this issue open just to mention that documentation leaks notes about that. In case of having both types of timers long task timers should have separate name specified manually.

For me I was able to remove all @Timed annotations and replace them with my own aspect registering simple and long timers for every scheduled method.

SimSonic avatar Feb 15 '23 16:02 SimSonic

I was able to reproduce the issue you mentioned. The underlying cause is because Prometheus does not allow different meters with the same name but a different set of tags. In the case of trying to use the same name with a Timer and LongTaskTimer created via the TimedAspect, by default the Timer will have tag keys class, method, and exception, while LongTaskTimer will have only class and method (we only know about an exception at the end of the execution of the method). We could make these tag keys consistent by default if we have a static exception=UNKNOWN tag on LTTs created via TimedAspect. We'll have to consider what we should document and/or what behavior changes to make.

Maybe related: we have this very old issue but I'm not sure about the full context at that time or if it is applicable to currently supported versions of the code: https://github.com/micrometer-metrics/micrometer/issues/24

For me I was able to remove all @Timed annotations and replace them with my own aspect registering simple and long timers for every scheduled method.

Once Spring Framework adds instrumentation for @Scheduled methods, I think Spring Boot can be updated to instrument those methods without having to use any aspects at all.

shakuzen avatar Feb 16 '23 08:02 shakuzen

Can we try if the problem is still there with the latest versions as @shakuzen mentioned?

marcingrzejszczak avatar Dec 20 '23 16:12 marcingrzejszczak

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

github-actions[bot] avatar Jan 02 '24 10:01 github-actions[bot]

Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open.

github-actions[bot] avatar Jan 10 '24 01:01 github-actions[bot]