java-operator-sdk

Adding `HasMetadata` as interface causes resources to be incorrectly deserialized as `GenericKubernetesResource`

Open adriansuarez opened this issue 1 year ago • 9 comments

Bug Report

Adding HasMetadata as an interface to my class that extends CustomResource<SPEC, STATUS> causes the operator framework to incorrectly deserialize it as GenericKubernetesResource instead of its actual class.

What did you do?

I added an interface to my CustomResource that extends HasMetadata.

For example,

@Group(...)
@Version(...)
public class MyCRD extends CustomResource<MySpec, MyStatus> implements Namespaced, HasUsefulFunctionality {
}

The HasUsefulFunctionality interface extends HasMetadata and is used to factor out common code that would otherwise be duplicated on all CRDs.

It is useful for it to extend HasMetadata, because it allows default methods to invoke this.getMetadata() to do things like inject the observedGeneration for conditions that get attached to the resource status.
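To make the pattern concrete, here is a minimal sketch of such an interface. It is not the code from my operator: ObjectMeta and HasMetadata below are reduced stand-ins for the fabric8 types so that the example compiles without any dependencies, and Condition, getConditions() and addCondition() are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Small driver showing the default method in use.
class ConditionDemo {
    public static void main(String[] args) {
        DemoResource resource = new DemoResource();
        resource.getMetadata().setGeneration(7L);
        resource.addCondition("Ready", "True");
        System.out.println(resource.getConditions().get(0).observedGeneration);
    }
}

// Stand-in for io.fabric8.kubernetes.api.model.ObjectMeta (only the field used here).
class ObjectMeta {
    private Long generation;
    Long getGeneration() { return generation; }
    void setGeneration(Long generation) { this.generation = generation; }
}

// Stand-in for io.fabric8.kubernetes.api.model.HasMetadata.
interface HasMetadata {
    ObjectMeta getMetadata();
}

// Hypothetical condition type that records the generation it was computed for.
class Condition {
    final String type;
    final String status;
    final Long observedGeneration;

    Condition(String type, String status, Long observedGeneration) {
        this.type = type;
        this.status = status;
        this.observedGeneration = observedGeneration;
    }
}

// Hypothetical shared interface: because it extends HasMetadata, the default
// method can read the generation through this.getMetadata().
interface HasUsefulFunctionality extends HasMetadata {
    List<Condition> getConditions();

    default void addCondition(String type, String status) {
        ObjectMeta metadata = getMetadata();
        Long generation = metadata == null ? null : metadata.getGeneration();
        getConditions().add(new Condition(type, status, generation));
    }
}

// Hypothetical resource that picks up the shared behaviour.
class DemoResource implements HasUsefulFunctionality {
    private final ObjectMeta metadata = new ObjectMeta();
    private final List<Condition> conditions = new ArrayList<>();

    @Override
    public ObjectMeta getMetadata() { return metadata; }

    @Override
    public List<Condition> getConditions() { return conditions; }
}
```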

What did you expect to see?

I expected things to work without any problems. The CustomResource superclass already implements HasMetadata, so declaring my CRD as an implementer of HasMetadata again should have no effect.

What did you see instead? Under which circumstances?

The Operator framework seems to get confused by the presence of that interface again, and incorrectly deserializes the resource as GenericKubernetesResource.

Many different stacktraces similar to the following appear all over the place and the operator is completely unusable:

20:52:42.301 [-1370171034-pool-1-thread-2] ERROR io.fabric8.kubernetes.client.informers.impl.cache.SharedProcessor -- ... failed invoking InformerEventSource{resourceClass: MyCRD} event handler: io.fabric8.kubernetes.api.model.GenericKubernetesResource incompatible with MyCRD
java.lang.ClassCastException: io.fabric8.kubernetes.api.model.GenericKubernetesResource incompatible with MyCRD
	at io.javaoperatorsdk.operator.processing.event.source.informer.DefaultPrimaryToSecondaryIndex.onAddOrUpdate(DefaultPrimaryToSecondaryIndex.java:21)
	at io.javaoperatorsdk.operator.processing.event.source.informer.InformerEventSource.onAdd(InformerEventSource.java:120)
	at io.javaoperatorsdk.operator.processing.event.source.informer.InformerEventSource.onAdd(InformerEventSource.java:66)
	at io.fabric8.kubernetes.client.informers.impl.cache.ProcessorListener$AddNotification.handle(ProcessorListener.java:103)
	at io.fabric8.kubernetes.client.informers.impl.cache.ProcessorListener.add(ProcessorListener.java:50)
	at io.fabric8.kubernetes.client.informers.impl.cache.SharedProcessor.lambda$distribute$0(SharedProcessor.java:91)
	at io.fabric8.kubernetes.client.informers.impl.cache.SharedProcessor.lambda$distribute$1(SharedProcessor.java:114)
	at io.fabric8.kubernetes.client.utils.internal.SerialExecutor.lambda$execute$0(SerialExecutor.java:57)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:857)

Environment

Kubernetes cluster type:

Docker Desktop / Kubernetes 1.29.2

Java Operator SDK 4.9.2

$ java -version
openjdk version "17.0.10" 2024-01-16
IBM Semeru Runtime Open Edition 17.0.10.0 (build 17.0.10+7)
Eclipse OpenJ9 VM 17.0.10.0 (build openj9-0.43.0, JRE 17 Mac OS X amd64-64-Bit Compressed References 20240116_636 (JIT enabled, AOT enabled)
OpenJ9   - 2c3d78b48
OMR      - ea8124dbc
JCL      - 2aad089841f based on jdk-17.0.10+7)
$ kubectl version
Client Version: v1.29.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2

Possible Solution

Removing HasMetadata as an interface solves the problem.

adriansuarez avatar Aug 04 '24 01:08 adriansuarez

This is more an issue for fabric8 client. cc @manusa @shawkins

csviri avatar Aug 04 '24 07:08 csviri

HasMetadata

The deserialization to generic happens when the class type is not registered for the group/version/kind.

However as long as the watch call is typed, the watch logic will automatically convert to the expected class type when needed: https://github.com/fabric8io/kubernetes-client/blob/806a918669aa328cbc3dc0be8d1c7339d06bdacb/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/dsl/internal/AbstractWatchManager.java#L286

That seems to imply the watch was started generically.
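As a toy model of the fallback described above (this is not the fabric8 implementation; KindRegistry, GenericResource and DatabaseClaimStandIn are hypothetical stand-ins), a lookup keyed by apiVersion and kind that returns a generic type when nothing is registered could look like this:

```java
import java.util.HashMap;
import java.util.Map;

// Small driver: one registered kind, one unknown kind.
class RegistryDemo {
    public static void main(String[] args) {
        KindRegistry registry = new KindRegistry();
        registry.register("cp.nuodb.com/v1beta1", "DatabaseClaim", DatabaseClaimStandIn.class);
        System.out.println(registry.resolve("cp.nuodb.com/v1beta1", "DatabaseClaim").getSimpleName());
        System.out.println(registry.resolve("example.com/v1", "Unknown").getSimpleName());
    }
}

// Stand-in for GenericKubernetesResource: the type used when nothing more specific is known.
class GenericResource {}

// Stand-in for a typed custom resource class.
class DatabaseClaimStandIn {}

// Hypothetical registry mapping apiVersion + kind to a Java class.
class KindRegistry {
    private final Map<String, Class<?>> typesByKey = new HashMap<>();

    void register(String apiVersion, String kind, Class<?> type) {
        typesByKey.put(key(apiVersion, kind), type);
    }

    // Falls back to the generic type when the apiVersion/kind pair was never registered.
    Class<?> resolve(String apiVersion, String kind) {
        return typesByKey.getOrDefault(key(apiVersion, kind), GenericResource.class);
    }

    private static String key(String apiVersion, String kind) {
        return apiVersion + "#" + kind;
    }
}
```

In this model, casting the result of an unregistered lookup to the typed class would fail in the same way as the ClassCastException in the report.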

shawkins avatar Aug 05 '24 11:08 shawkins

@adriansuarez do you have a way to reproduce the issue, perhaps?

metacosm avatar Aug 05 '24 16:08 metacosm

@metacosm I don't have a complete but trivial reproduction scenario to share, but in my case, the following change causes this to happen on my operator which is currently working:


+import io.fabric8.kubernetes.api.model.HasMetadata;
 import io.fabric8.kubernetes.api.model.Namespaced;
 import io.fabric8.kubernetes.client.CustomResource;
 import io.fabric8.kubernetes.model.annotation.Group;
@@ -17,7 +18,7 @@ import jakarta.annotation.Nullable;
 @Group("cp.nuodb.com")
 @Version("v1beta1")
 public class DatabaseClaim extends CustomResource<DatabaseClaimSpec, DatabaseClaimStatus>
-        implements Namespaced, HasDbaasResource<DatabaseModel> {
+        implements Namespaced, HasDbaasResource<DatabaseModel>, HasMetadata {
 
     private static final long serialVersionUID = -31979304345542506L;
 

While debugging the issue, I stripped away a lot of stuff to reduce my operator to almost nothing. I would still see some version of the error message above, in the reconciliation loop when a resource of the CRD kind was created.

I suspect if you started with a trivial sample and added the HasMetadata interface to the CRD class, this would happen, because my extremely stripped down operator was almost equivalent to that before I found a workaround and moved on.

adriansuarez avatar Aug 05 '24 18:08 adriansuarez

OK, can indeed replicate the issue. Looking into it.

metacosm avatar Aug 07 '24 15:08 metacosm

This does indeed look like a Fabric8 client issue since I haven't seen anything on the JOSDK side that would explain this behavior. Looking at the client code to try to determine what's causing this issue.

metacosm avatar Aug 07 '24 17:08 metacosm

Looks like something funky is going on with the deserialization of the CR but I'm not sure what yet.

metacosm avatar Aug 07 '24 20:08 metacosm

If you annotate HasUsefulFunctionality with:

@JsonDeserialize(using = JsonDeserializer.None.class)

the issue should probably get fixed.
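For reference, a sketch of where the annotation would go, assuming HasUsefulFunctionality is declared as in the report. The fragment needs jackson-databind and the fabric8 model classes on the classpath, and the interface body is left as a placeholder:

```java
import com.fasterxml.jackson.databind.JsonDeserializer;
import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
import io.fabric8.kubernetes.api.model.HasMetadata;

// The annotation is placed on the shared interface itself, not on each CRD class.
@JsonDeserialize(using = JsonDeserializer.None.class)
public interface HasUsefulFunctionality extends HasMetadata {
    // shared default methods go here
}
```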

This seems like a very specific use case where you probably need multiple inheritance, or where you have different CRDs, some that implement the interface and others that don't. Otherwise, you could probably extend the CustomResource class instead of using the interface and have all your CRDs extend your specific implementation.
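A rough sketch of that base-class alternative follows. CustomResource here is a reduced stand-in so the example compiles on its own (the real class exposes the generation through getMetadata()), and BaseResource, CommonStatus and markObserved() are hypothetical names:

```java
// Small driver showing the shared behaviour inherited from the base class.
class BaseClassDemo {
    public static void main(String[] args) {
        MyCRD resource = new MyCRD();
        resource.setGeneration(3L);
        resource.setStatus(new MyStatus());
        resource.markObserved();
        System.out.println(resource.getStatus().getObservedGeneration());
    }
}

// Reduced stand-in for io.fabric8.kubernetes.client.CustomResource.
abstract class CustomResource<S, T> {
    private Long generation;
    private S spec;
    private T status;

    Long getGeneration() { return generation; }
    void setGeneration(Long generation) { this.generation = generation; }
    S getSpec() { return spec; }
    void setSpec(S spec) { this.spec = spec; }
    T getStatus() { return status; }
    void setStatus(T status) { this.status = status; }
}

// Hypothetical status contract shared by all CRDs.
abstract class CommonStatus {
    private Long observedGeneration;
    Long getObservedGeneration() { return observedGeneration; }
    void setObservedGeneration(Long observedGeneration) { this.observedGeneration = observedGeneration; }
}

// Hypothetical shared base class: common code lives here instead of in an
// interface that extends HasMetadata.
abstract class BaseResource<S, T extends CommonStatus> extends CustomResource<S, T> {
    void markObserved() {
        if (getStatus() != null) {
            getStatus().setObservedGeneration(getGeneration());
        }
    }
}

class MySpec {}

class MyStatus extends CommonStatus {}

// Every CRD extends the shared base class rather than CustomResource directly.
class MyCRD extends BaseResource<MySpec, MyStatus> {}
```

The trade-off is the one mentioned above: this only works if every CRD can share a single superclass.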

manusa avatar Aug 08 '24 08:08 manusa

I've verified that adding @JsonDeserialize(using = JsonDeserializer.None.class) addresses the issue in my reproducer code. The issue comes from a peculiarity of how the Fabric8 client handles deserialization: without that Jackson annotation on your code, the client's KubernetesDeserializer gets triggered, which is not what we want, since it is meant to deserialize "known" Kubernetes resources, not custom ones.

metacosm avatar Aug 08 '24 08:08 metacosm

Will close this as we have a fix. Please re-open if required.

metacosm avatar Sep 18 '24 13:09 metacosm