spring-boot-admin

Bug: UI broken from version 3.5.3

Open cdprete opened this issue 4 months ago • 20 comments

Spring Boot Admin Server information

  • Version: 3.5.3

  • Spring Boot version: 3.5.5

  • Configured Security: None

  • Webflux or Servlet application: WebFlux

Client information

  • Spring Boot versions: 3.5.5

  • Used discovery mechanism: Eureka

  • Webflux or Servlet application: Both

Description

Hello.

I've updated my SBA instance to 3.5.3 and now no details are shown anymore. In particular:

  • The menu on the left is empty
  • Instances show only their health and their metadata. No info nor anything else.

The attached picture in yellow is the "old" environment with version 3.5.2:

Image

while the green one is the "new" environment with version 3.5.3:

Image

What I noticed is that restarting an instance X fixes the issue for that instance, which suggests that the new SBA somehow doesn't handle the state properly. Moreover, the tags on the homepage are now always gone, even for instances that have been restarted. That is far from ideal, because we can no longer tell which instance we're looking at without opening it.

Again, in yellow the old version:

Image

and in green the new version:

Image

PS: the instance which has the version shown is the one that was restarted, which demonstrates the issue above about the state management and about the tags being now never shown in the homepage.

cdprete avatar Sep 16 '25 06:09 cdprete

/cc @SteKoe @erikpetzold since we were discussing issues in https://github.com/codecentric/spring-boot-admin/issues/4646

cdprete avatar Sep 16 '25 06:09 cdprete

@erikpetzold @SteKoe I can confirm this occurring even with 3.5.4.

cdprete avatar Sep 17 '25 05:09 cdprete

Also the version of the application is not aligned anymore like before. Now, when the group is closed, it's rendered on the right:

Image

while when it's opened it's aligned to the left:

Image

Of course, to get this info rendered, I had to restart the instances. :-/

cdprete avatar Sep 17 '25 06:09 cdprete

Ok, also in the old version they were not really aligned:

Image

but the alignment there was a bit better (center-right jump vs left-right) since the jump in space was less.

cdprete avatar Sep 17 '25 07:09 cdprete

Can you please share a screenshot of the browser's network activity, including the response of e.g. the /info actuator endpoint? In my local test setup it looks like this:

Image

SteKoe avatar Sep 17 '25 08:09 SteKoe

There is no call to info, metrics and so on, only to health, when I navigate from the homepage into an instance that was not restarted:

Image

cdprete avatar Sep 17 '25 08:09 cdprete

So the registration fails to expose the endpoints? See, when I check the response of /applications, the instance inside the application contains endpoints. In your case this seems to be empty. So it is not a UI issue but a backend issue. Can you confirm that the data is empty in your case, too?

Image

SteKoe avatar Sep 17 '25 10:09 SteKoe

Hi @SteKoe.

Indeed, under endpoints, only the health one is listed for the application instances which have not been restarted yet since the update to SBA 3.5.3/4.

Image

cdprete avatar Sep 17 '25 11:09 cdprete

Okay, thanks. Whenever possible, restart the instance and report whether the endpoints are registered correctly. I will discuss this internally.

SteKoe avatar Sep 17 '25 11:09 SteKoe

> Okay, thanks. Whenever possible, restart the instance and report whether the endpoints are registered correctly. I will discuss this internally.

I did restart some, like HIKU KFF CALCULATOR shown previously, but only because it was the DEV environment.

We have applications that we can't just freely restart, and we can't really escalate this to business just because the new "patch" version of SBA is now completely breaking our monitoring visibility.

cdprete avatar Sep 17 '25 11:09 cdprete

I am totally aware that this is not possible on all environments. We have plenty of instances running ourselves and we run into the same struggle. That's why I wrote "whenever possible". :) Anyway, we will check the endpoint detection mechanism and see what might be the issue here.

Since you mentioned versioning of SBA: we initially started with, and still stick to, the convention of matching the Spring Boot major and minor version in X.Y.z, where z is our patch level. So whenever Spring Boot 3.6 is released, SBA 3.6.0 will be shipped, too. We are also aware that this totally breaks semantic versioning, but still think that this is fine.

SteKoe avatar Sep 17 '25 11:09 SteKoe

> So whenever Spring Boot 3.6 is released, SBA 3.6.0 will be shipped, too. We are also aware that this totally breaks semantic versioning, but still think that this is fine.

That's totally fine for me as long as SemVer is properly followed. Here I was updating from 3.5.2 → 3.5.3 → 3.5.4, so I wasn't expecting breaking changes, to be honest. :D

> I am totally aware that this is not possible on all environments. We have plenty of instances running ourselves and we run into the same struggle. That's why I wrote "whenever possible". :)

That's why I usually update the DEV environment first. ;)

> Anyway, we will check the endpoint detection mechanism and see what might be the issue here.

Waiting for the fix. :)

cdprete avatar Sep 17 '25 12:09 cdprete

One further question: Which events are logged in the journal for the affected Instance? Can you name them, please? TIA!

SteKoe avatar Sep 17 '25 12:09 SteKoe

> One further question: Which events are logged in the journal for the affected Instance? Can you name them, please? TIA!

Image

cdprete avatar Sep 17 '25 12:09 cdprete

> tags in the homepage are now always gone

fixed in 2cf26d2bbc09df721137918b973fc88317a2c6cf

> version of the application is not aligned anymore like before.

fixed in #4655

SteKoe avatar Sep 17 '25 19:09 SteKoe

Image

SteKoe avatar Sep 17 '25 19:09 SteKoe

@SteKoe the overall issue is caused by the Hazelcast cache, including https://github.com/codecentric/spring-boot-admin/issues/4340#issuecomment-3305991100.

It seems to me that what was stored in the cache is no longer compatible as of version 3.5.3, which causes all the issues we found. If I deploy SBA cleanly (so, no rolling update, meaning Hazelcast doesn't clone the cache to the new instance that's starting), everything works fine again. Unfortunately, that's not an option in real life, since it would cause downtime in production.

So, yeah, I agree that's probably an issue on the back-end as you were suggesting.

cdprete avatar Sep 18 '25 08:09 cdprete

I've some more info which may be useful:

com.hazelcast.nio.serialization.HazelcastSerializationException: java.io.InvalidClassException: de.codecentric.boot.admin.server.domain.values.Registration; local class incompatible: stream classdesc serialVersionUID = -4021660222452471078, local class serialVersionUID = -266391990383794588
	at com.hazelcast.internal.serialization.impl.SerializationUtil.handleException(SerializationUtil.java:114) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.serialization.impl.AbstractSerializationService.readObject(AbstractSerializationService.java:362) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.serialization.InternalSerializationService.readObject(InternalSerializationService.java:81) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.serialization.impl.ByteArrayObjectDataInput.readObject(ByteArrayObjectDataInput.java:604) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.serialization.impl.defaultserializers.AbstractCollectionStreamSerializer.deserializeEntriesInto(AbstractCollectionStreamSerializer.java:50) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.serialization.impl.defaultserializers.AbstractCollectionStreamSerializer.deserializeEntries(AbstractCollectionStreamSerializer.java:43) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.serialization.impl.defaultserializers.ArrayListStreamSerializer.read(ArrayListStreamSerializer.java:51) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.serialization.impl.defaultserializers.ArrayListStreamSerializer.read(ArrayListStreamSerializer.java:29) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:44) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.serialization.impl.AbstractSerializationService.toObject(AbstractSerializationService.java:271) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.map.impl.record.ObjectRecordFactory.newRecord(ObjectRecordFactory.java:46) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.map.impl.recordstore.AbstractRecordStore.createRecord(AbstractRecordStore.java:162) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.map.impl.recordstore.AbstractEvictableRecordStore.createRecord(AbstractEvictableRecordStore.java:51) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.map.impl.recordstore.DefaultRecordStore.putOrUpdateReplicatedRecord(DefaultRecordStore.java:248) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.map.impl.operation.MapChunk.putOrUpdateReplicatedData(MapChunk.java:228) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.map.impl.operation.MapChunk.putInto(MapChunk.java:174) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.map.impl.operation.MapChunk.run(MapChunk.java:137) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.spi.impl.operationservice.Operation.call(Operation.java:192) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.spi.impl.operationexecutor.OperationRunner.runDirect(OperationRunner.java:176) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.partition.operation.MigrationOperation.runMigrationOperation(MigrationOperation.java:139) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.partition.operation.MigrationOperation.doRun(MigrationOperation.java:115) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.partition.operation.MigrationOperation.run(MigrationOperation.java:95) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.spi.impl.operationservice.Operation.call(Operation.java:192) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.call(OperationRunnerImpl.java:291) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:262) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:493) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:186) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:141) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.loop(OperationThread.java:134) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.executeRun(OperationThread.java:115) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:111) ~[hazelcast-5.5.0.jar:5.5.0]
Caused by: java.io.InvalidClassException: de.codecentric.boot.admin.server.domain.values.Registration; local class incompatible: stream classdesc serialVersionUID = -4021660222452471078, local class serialVersionUID = -266391990383794588
	at java.base/java.io.ObjectStreamClass.initNonProxy(Unknown Source) ~[na:na]
	at java.base/java.io.ObjectInputStream.readNonProxyDesc(Unknown Source) ~[na:na]
	at java.base/java.io.ObjectInputStream.readClassDesc(Unknown Source) ~[na:na]
	at java.base/java.io.ObjectInputStream.readOrdinaryObject(Unknown Source) ~[na:na]
	at java.base/java.io.ObjectInputStream.readObject0(Unknown Source) ~[na:na]
	at java.base/java.io.ObjectInputStream$FieldValues.<init>(Unknown Source) ~[na:na]
	at java.base/java.io.ObjectInputStream.readSerialData(Unknown Source) ~[na:na]
	at java.base/java.io.ObjectInputStream.readOrdinaryObject(Unknown Source) ~[na:na]
	at java.base/java.io.ObjectInputStream.readObject0(Unknown Source) ~[na:na]
	at java.base/java.io.ObjectInputStream.readObject(Unknown Source) ~[na:na]
	at java.base/java.io.ObjectInputStream.readObject(Unknown Source) ~[na:na]
	at com.hazelcast.internal.serialization.impl.defaultserializers.JavaDefaultSerializers$JavaSerializer.read(JavaDefaultSerializers.java:95) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.serialization.impl.defaultserializers.JavaDefaultSerializers$JavaSerializer.read(JavaDefaultSerializers.java:88) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:44) ~[hazelcast-5.5.0.jar:5.5.0]
	at com.hazelcast.internal.serialization.impl.AbstractSerializationService.readObject(AbstractSerializationService.java:356) ~[hazelcast-5.5.0.jar:5.5.0]
	... 29 common frames omitted

@SteKoe I think this may explain why the registration was empty and, therefore, no URLs were available.

cdprete avatar Sep 22 '25 08:09 cdprete
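
The `InvalidClassException` in the stack trace above can be reproduced in isolation. The following is a minimal sketch, not SBA or Hazelcast code: `StaleStreamDemo` and `CachedValue` are hypothetical names, and the "old" cache entry is only simulated, by flipping one bit of the `serialVersionUID` stored in the serialized class descriptor so that it no longer matches the local class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class StaleStreamDemo {

    // Hypothetical stand-in for a cached value class; this is NOT SBA's Registration.
    public static class CachedValue implements Serializable {
        String name = "demo";
    }

    // Serializes a value with plain Java serialization.
    public static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Returns a copy of the stream whose class descriptor carries a different serialVersionUID.
    // Layout used here: 4-byte stream header, TC_OBJECT, TC_CLASSDESC,
    // 2-byte class-name length, class name, then the 8-byte serialVersionUID.
    public static byte[] withForeignUid(byte[] stream) {
        byte[] copy = stream.clone();
        int nameLength = ((copy[6] & 0xFF) << 8) | (copy[7] & 0xFF);
        int uidOffset = 8 + nameLength;
        copy[uidOffset + 7] ^= 0x01; // flip the lowest bit of the UID
        return copy;
    }

    // Tries to deserialize; returns "ok" or the InvalidClassException message.
    public static String tryRead(byte[] stream) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stream))) {
            in.readObject();
            return "ok";
        } catch (InvalidClassException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] current = serialize(new CachedValue());
        System.out.println(tryRead(current));                 // stream written by the same class version
        System.out.println(tryRead(withForeignUid(current))); // stream that claims another class version
    }
}
```

Under these assumptions the first read succeeds, while the second is rejected with a "local class incompatible" message of the same form as in the trace. Declaring an explicit `serialVersionUID` on a class keeps that value stable across recompilations, but it only helps if the fields themselves stay compatible between versions.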

Wouldn't that just happen, when running different versions of SBA on the same Hazelcast cluster or joining a cluster that contains "old" serialized classes?

Yes, we have changed how the service URL is determined in Registration.java, but this only affects one field that already existed.

SteKoe avatar Sep 23 '25 06:09 SteKoe

> Wouldn't that just happen, when running different versions of SBA on the same Hazelcast cluster or joining a cluster that contains "old" serialized classes?

Which is exactly what happens if you do rolling updates in a cloud environment (e.g. K8s/OCP) to have a zero-downtime system. If you have 2 instances deployed for business continuity, to perform the update OCP will:

  • deploy one new instance
  • once the new instance is up (liveness & readiness are ok), stop one old instance
  • repeat the steps until all the instances have been replaced

In such a scenario, there is a time window in which the cluster is composed of both old and new instances, which then need to coexist.

cdprete avatar Sep 23 '25 06:09 cdprete
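
The rolling-update steps listed above can be traced with a small simulation. This is only an illustrative sketch: `RollingUpdateDemo` and its methods are hypothetical names, the cluster is modelled as a list of version labels, and no real orchestrator or Hazelcast API is involved.

```java
import java.util.ArrayList;
import java.util.List;

public class RollingUpdateDemo {

    // Returns the cluster composition after each step of a rolling update
    // that adds one new instance before stopping one old instance.
    public static List<List<String>> rollingUpdate(int replicas) {
        List<String> cluster = new ArrayList<>();
        for (int i = 0; i < replicas; i++) {
            cluster.add("old");
        }
        List<List<String>> snapshots = new ArrayList<>();
        snapshots.add(List.copyOf(cluster));
        for (int i = 0; i < replicas; i++) {
            cluster.add("new");      // deploy one new instance
            snapshots.add(List.copyOf(cluster));
            cluster.remove("old");   // once it is ready, stop one old instance
            snapshots.add(List.copyOf(cluster));
        }
        return snapshots;
    }

    // True while old and new instances share the cluster (and its replicated cache).
    public static boolean isMixed(List<String> cluster) {
        return cluster.contains("old") && cluster.contains("new");
    }

    public static void main(String[] args) {
        for (List<String> snapshot : rollingUpdate(2)) {
            System.out.println(snapshot + " mixed=" + isMixed(snapshot));
        }
    }
}
```

With two replicas, every intermediate step contains both versions. Those are the steps in which entries serialized by an old instance can be migrated to a new one.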