spring-batch icon indicating copy to clipboard operation
spring-batch copied to clipboard

Activation of Actuator & Prometheus rise a WARN in the log due to bad configuration

Open pkernevez opened this issue 10 months ago • 4 comments

Bug description When adding dependencies to Actuator & Prometheus the first time a job is run (meaning each time the JVM is restarted) there is a Warn log that is logged. The raison is: the same metric is initialed in 2 locations with different tags. The logs is:

19:07:27.786  WARN 62730 [        batch-1] i.m.p.PrometheusMeterRegistry  : The meter (MeterId{name='spring.batch.job.active', tags=[tag(spring.batch.job.name=cobJob),tag(spring.batch.job.status=UNKNOWN)]}) registration has failed: Prometheus requires that all meters with the same name have the same set of tag keys. There is already an existing meter named 'spring.batch.job.active' containing tag keys [spring.batch.job.active.name]. The meter you are attempting to register has keys [spring.batch.job.name, spring.batch.job.status]. Note that subsequent logs will be logged at debug level.

Environment Spring Boot: 3.4.2 SpringBatch: 5.2.1 Java: 21+

Steps to reproduce Add the 2 following dependencies in your POM:

        <!--  New dependencies -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>
        <dependency>
            <groupId>io.micrometer</groupId>
            <artifactId>micrometer-registry-prometheus</artifactId>
        </dependency>
    </dependencies>

Run a batch and the log will be outputted in your logs.

The issues is in AbstractJob.execute:

	@Override
	public final void execute(JobExecution execution) {

		Assert.notNull(execution, "jobExecution must not be null");
		execution.getExecutionContext().put(SpringBatchVersion.BATCH_VERSION_KEY, SpringBatchVersion.getVersion());

		if (logger.isDebugEnabled()) {
			logger.debug("Job execution starting: " + execution);
		}

		JobSynchronizationManager.register(execution);
		String activeJobMeterName = "job.active";
		LongTaskTimer longTaskTimer = BatchMetrics.createLongTaskTimer(this.meterRegistry, activeJobMeterName,
				"Active jobs", Tag.of(BatchMetrics.METRICS_PREFIX + activeJobMeterName + ".name",
						execution.getJobInstance().getJobName()));
		LongTaskTimer.Sample longTaskTimerSample = longTaskTimer.start();
		Observation observation = BatchMetrics
			.createObservation(BatchJobObservation.BATCH_JOB_OBSERVATION.getName(), new BatchJobContext(execution),
					this.observationRegistry)
			.contextualName(execution.getJobInstance().getJobName())
			.observationConvention(this.observationConvention)
			.start();
		try (Observation.Scope scope = observation.openScope()) {

  1. The line L285 LongTaskTimer longTaskTimer = BatchMetrics.createLongTaskTimer creates the explicit metrics with full name spring.batch.job.activeand tag spring.batch.job.active.name.

  2. And the line .createObservation(BatchJobObservation.BATCH_JOB_OBSERVATION.getName(), new BatchJobContext(execution), is creating an observation with the same name spring.batch.job.active and the execution context. This execution context is used later to create the tags: in io.micrometer.core.instrument.observation.DefaultMeterObservationHandler#75:

            LongTaskTimer.Sample longTaskSample = LongTaskTimer.builder(context.getName() + ".active")
                .tags(createTags(context))
                .register(meterRegistry)
                .start();

But the tags are now dynamicaly created fron the execution context:tags=[tag(spring.batch.job.name=importUserJob),tag(spring.batch.job.status=UNKNOWN)]}) 3. The tags are different and this is not supported by the PrometheusMeterRegistry framework https://github.com/micrometer-metrics/micrometer/blob/074e351e03b39ff330f7890402dd44a906ee31ee/implementations/micrometer-registry-prometheus/src/main/java/io/micrometer/prometheusmetrics/PrometheusMeterRegistry.java#L587 Because on the line L582 the tags were not the same.

Expected behavior Not log a warn.

Minimal Complete Reproducible example I created a fork of https://github.com/spring-guides/gs-batch-processing with a single commit. Just run the app (not the test, because the test is not starting the full stack): https://github.com/pkernevez/gs-batch-processing/tree/warn-issue

/opt/homebrew/Cellar/openjdk/21.0.1/libexec/openjdk.jdk/Contents/Home/bin/java -XX:TieredStopAtLevel=1 -Dspring.output.ansi.enabled=always -Dcom.sun.management.jmxremote -Dspring.jmx.enabled=true -Dspring.liveBeansView.mbeanDomain -Dspring.application.admin.enabled=true -Dmanagement.endpoints.jmx.exposure.include=* -javaagent:/Applications/IntelliJ IDEA_2024_2.4.app/Contents/lib/idea_rt.jar=63083:/Applications/IntelliJ IDE..../.m2/repository/io/prometheus/prometheus-metrics-exposition-textformats/1.3.5/prometheus-metrics-exposition-textformats-1.3.5.jar com.example.batchprocessing.BatchProcessingApplication

  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/

 :: Spring Boot ::                (v3.4.2)

2025-01-28T21:27:56.755+01:00  INFO 70368 --- [           main] c.e.b.BatchProcessingApplication         : Starting BatchProcessingApplication using Java 21.0.1 with PID 70368 (/Users/pke/SCM/Git/Shift/gs-batch-processing/complete/target/classes started by pke in /Users/pke/SCM/Git/Shift/gs-batch-processing/complete)
2025-01-28T21:27:56.756+01:00  INFO 70368 --- [           main] c.e.b.BatchProcessingApplication         : No active profile set, falling back to 1 default profile: "default"
2025-01-28T21:27:57.126+01:00  INFO 70368 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Starting...
2025-01-28T21:27:57.313+01:00  INFO 70368 --- [           main] com.zaxxer.hikari.pool.PoolBase          : HikariPool-1 - Driver does not support get/set network timeout for connections. (feature not supported)
2025-01-28T21:27:57.314+01:00  INFO 70368 --- [           main] com.zaxxer.hikari.pool.HikariPool        : HikariPool-1 - Added connection org.hsqldb.jdbc.JDBCConnection@6cbd0674
2025-01-28T21:27:57.315+01:00  INFO 70368 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Start completed.
2025-01-28T21:27:57.630+01:00  INFO 70368 --- [           main] c.e.b.BatchProcessingApplication         : Started BatchProcessingApplication in 1.018 seconds (process running for 1.269)
2025-01-28T21:27:57.631+01:00  INFO 70368 --- [           main] o.s.b.a.b.JobLauncherApplicationRunner   : Running default command line with: []
2025-01-28T21:27:57.646+01:00  INFO 70368 --- [           main] o.s.b.c.l.s.TaskExecutorJobLauncher      : Job: [SimpleJob: [name=importUserJob]] launched with the following parameters: [{}]

2025-01-28T21:27:57.651+01:00  WARN 70368 --- [           main] i.m.p.PrometheusMeterRegistry            : The meter (MeterId{name='spring.batch.job.active', tags=[tag(spring.batch.job.name=importUserJob),tag(spring.batch.job.status=UNKNOWN)]}) registration has failed: Prometheus requires that all meters with the same name have the same set of tag keys. There is already an existing meter named 'spring.batch.job.active' containing tag keys [spring.batch.job.active.name]. The meter you are attempting to register has keys [spring.batch.job.name, spring.batch.job.status]. Note that subsequent logs will be logged at debug level.

2025-01-28T21:27:57.657+01:00  INFO 70368 --- [           main] o.s.batch.core.job.SimpleStepHandler     : Executing step: [step1]
2025-01-28T21:27:57.662+01:00  INFO 70368 --- [           main] c.e.batchprocessing.PersonItemProcessor  : Converting (Person[firstName=Jill, lastName=Doe]) into (Person[firstName=JILL, lastName=DOE])
2025-01-28T21:27:57.663+01:00  INFO 70368 --- [           main] c.e.batchprocessing.PersonItemProcessor  : Converting (Person[firstName=Joe, lastName=Doe]) into (Person[firstName=JOE, lastName=DOE])
2025-01-28T21:27:57.664+01:00  INFO 70368 --- [           main] c.e.batchprocessing.PersonItemProcessor  : Converting (Person[firstName=Justin, lastName=Doe]) into (Person[firstName=JUSTIN, lastName=DOE])
2025-01-28T21:27:57.667+01:00  INFO 70368 --- [           main] c.e.batchprocessing.PersonItemProcessor  : Converting (Person[firstName=Jane, lastName=Doe]) into (Person[firstName=JANE, lastName=DOE])
2025-01-28T21:27:57.667+01:00  INFO 70368 --- [           main] c.e.batchprocessing.PersonItemProcessor  : Converting (Person[firstName=John, lastName=Doe]) into (Person[firstName=JOHN, lastName=DOE])
2025-01-28T21:27:57.668+01:00  INFO 70368 --- [           main] o.s.batch.core.step.AbstractStep         : Step: [step1] executed in 11ms
2025-01-28T21:27:57.670+01:00  INFO 70368 --- [           main] c.e.b.JobCompletionNotificationListener  : !!! JOB FINISHED! Time to verify the results
2025-01-28T21:27:57.672+01:00  INFO 70368 --- [           main] c.e.b.JobCompletionNotificationListener  : Found <Person[firstName=JILL, lastName=DOE]> in the database.
2025-01-28T21:27:57.672+01:00  INFO 70368 --- [           main] c.e.b.JobCompletionNotificationListener  : Found <Person[firstName=JOE, lastName=DOE]> in the database.
2025-01-28T21:27:57.672+01:00  INFO 70368 --- [           main] c.e.b.JobCompletionNotificationListener  : Found <Person[firstName=JUSTIN, lastName=DOE]> in the database.
2025-01-28T21:27:57.672+01:00  INFO 70368 --- [           main] c.e.b.JobCompletionNotificationListener  : Found <Person[firstName=JANE, lastName=DOE]> in the database.
2025-01-28T21:27:57.672+01:00  INFO 70368 --- [           main] c.e.b.JobCompletionNotificationListener  : Found <Person[firstName=JOHN, lastName=DOE]> in the database.
2025-01-28T21:27:57.673+01:00  INFO 70368 --- [           main] o.s.b.c.l.s.TaskExecutorJobLauncher      : Job: [SimpleJob: [name=importUserJob]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 18ms
2025-01-28T21:27:57.676+01:00  INFO 70368 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Shutdown initiated...
2025-01-28T21:27:57.691+01:00  INFO 70368 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Shutdown completed.

Process finished with exit code 0

pkernevez avatar Jan 28 '25 20:01 pkernevez

The documentation also specifically states that that the only tag for spring.batch.job.active is name.

https://docs.spring.io/spring-batch/reference/monitoring-and-metrics.html

madorb avatar Feb 05 '25 16:02 madorb

Thx for posting this bug, we are having the same issue. Any workarounds to get rid of this warning?

geoludbit avatar Feb 19 '25 22:02 geoludbit

You can work around the issue with a snippet like

Metrics.globalRegistry.config().meterFilter(MeterFilter.denyNameStartsWith("spring.batch.job.active"));

The snippet only excludes the metric spring.batch.job.active, and except for the changed prefix the snippet comes from the reference documentation: https://docs.spring.io/spring-batch/reference/monitoring-and-metrics.html#disabling-metrics

hpoettker avatar Feb 24 '25 10:02 hpoettker

Thanks for the workaround. Any news about the bug itself ?

pkernevez avatar Apr 10 '25 15:04 pkernevez

I am also encountering this bug, and while harmless, it is creating distracting log warnings, so it would be nice to see it fixed.

mikepictor avatar Aug 21 '25 13:08 mikepictor

Thank you all for reporting this issue and for your feedback! I will address this as part of #4689, which I have already planned for v6. If the fix is backward compatible, I will backport it to 5.2.x.

fmbenhassine avatar Aug 21 '25 13:08 fmbenhassine

This was resolved as part of #4968. Spring Batch now only uses the unified Observation API from Micrometer (so the metric in question is not created twice with the same tag). I tried the getting started guide with the latest Boot v4 and Batch v6 snapshots and the warning is not logged anymore.

Thank you for reporting this.

fmbenhassine avatar Sep 23 '25 19:09 fmbenhassine