
StatefulSet or Deployment cannot be restarted. Failure executing PATCH

Open b0m123 opened this issue 6 months ago • 11 comments

Describe the bug

Hi there. After upgrading from 7.1.0 to 7.2.0 (7.3.0 has the same problem), restarting a StatefulSet or Deployment stopped working for a freshly deployed application (deployed via Helm). The following error occurs: Failure executing: PATCH at: https://127.0.0.1:6443/apis/apps/v1/namespaces/default/statefulsets/keycloak. Message: the server rejected our request due to an error in our request. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[], group=null, kind=null, name=null, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=the server rejected our request due to an error in our request, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).

Interestingly, if I restart the StatefulSet with the kubectl rollout restart statefulset keycloak command, the client is then able to redeploy it.

Could this be caused by the fix for this issue? https://github.com/fabric8io/kubernetes-client/issues/6892

Here is the StatefulSet pod template metadata:

kubectl get statefulset keycloak -o=jsonpath='{.spec.template.metadata}'

{
  "creationTimestamp": null,
  "labels": {
    "app": "keycloak",
    "app.kubernetes.io/component": "keycloak",
    "app.kubernetes.io/managed-by": "Helm",
    "app.kubernetes.io/name": "keycloak",
    "app.kubernetes.io/part-of": "IAM",
    "app.kubernetes.io/version": "22.112.1",
    "chart": "keycloak-22.112.1",
    "heritage": "Helm",
    "release": "keycloak"
  }
}

My guess is that the PATCH is not able to add the kubectl.kubernetes.io/restartedAt annotation.

Fabric8 Kubernetes Client version

7.3.0

Steps to reproduce

  1. Deploy a StatefulSet or Deployment using Helm
  2. Check that the kubectl.kubernetes.io/restartedAt annotation is absent
  3. Try to redeploy using apiClient.apps().statefulSets().inNamespace("default").withName("keycloak").rolling().restart() (a minimal sketch follows below)
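
For reference, a minimal, self-contained version of the call in step 3 might look like the sketch below (the class name and default-kubeconfig client construction are illustrative; the namespace and resource name are taken from this report):

import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

public class RestartReproducer {
  public static void main(String[] args) {
    // Uses the default kubeconfig/context; same call as in step 3 above
    try (KubernetesClient apiClient = new KubernetesClientBuilder().build()) {
      apiClient.apps().statefulSets()
          .inNamespace("default")
          .withName("keycloak")
          .rolling()
          .restart();
    }
  }
}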

Expected behavior

The StatefulSet or Deployment is redeployed without error.

Runtime

Kubernetes (vanilla)

Kubernetes API Server version

other (please specify in additional context)

Environment

Linux

Fabric8 Kubernetes Client Logs

Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://127.0.0.1:6443/apis/apps/v1/namespaces/default/statefulsets/keycloak. Message: the server rejected our request due to an error in our request. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[], group=null, kind=null, name=null, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=the server rejected our request due to an error in our request, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
	at app//io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:642)
	at app//io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:622)
	at app//io.fabric8.kubernetes.client.dsl.internal.OperationSupport.assertResponseCode(OperationSupport.java:582)
	at app//io.fabric8.kubernetes.client.dsl.internal.OperationSupport.lambda$handleResponse$0(OperationSupport.java:549)
	at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:646)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
	at app//io.fabric8.kubernetes.client.http.StandardHttpClient.lambda$completeOrCancel$10(StandardHttpClient.java:141)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
	at app//io.fabric8.kubernetes.client.http.ByteArrayBodyHandler.onBodyDone(ByteArrayBodyHandler.java:51)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
	at app//io.fabric8.kubernetes.client.vertx.VertxHttpRequest.lambda$consumeBytes$1(VertxHttpRequest.java:84)
	at app//io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:270)
	at app//io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:252)
	at app//io.vertx.core.http.impl.HttpEventHandler.handleEnd(HttpEventHandler.java:76)
	at app//io.vertx.core.http.impl.HttpClientResponseImpl.handleEnd(HttpClientResponseImpl.java:250)
	at app//io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.lambda$new$0(Http1xClientConnection.java:424)
	at app//io.vertx.core.streams.impl.InboundBuffer.handleEvent(InboundBuffer.java:279)
	at app//io.vertx.core.streams.impl.InboundBuffer.write(InboundBuffer.java:157)
	at app//io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.handleEnd(Http1xClientConnection.java:734)
	at app//io.vertx.core.impl.ContextImpl.lambda$execute$7(ContextImpl.java:329)
	at app//io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
	at app//io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
	at app//io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
	at app//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at app//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998)
	at app//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:1583)

Additional context

Kubernetes version v1.31.7

b0m123 avatar May 16 '25 11:05 b0m123

Could this be caused by the fix for this issue? #6892

It might be related; however, unless I'm missing something, the PATCH operation implemented in #6893 should be fine.

manusa avatar May 16 '25 13:05 manusa

I can't reproduce this using https://artifacthub.io/packages/helm/bitnami/keycloak in version 24.6.7.

Steps

  1. Installing the chart:
    helm install keycloak oci://registry-1.docker.io/bitnamicharts/keycloak
    
  2. Waiting for everything to be ready
  3. Ensuring the StatefulSet doesn't have the kubectl.kubernetes.io/restartedAt annotation in its template metadata.
  4. Rolling-restart
    client.apps().statefulSets().inNamespace("default").withName("keycloak").rolling().restart();
    

The statefulset gets restarted successfully.

manusa avatar May 20 '25 07:05 manusa

I've run some more tests and I agree: the kubectl.kubernetes.io/restartedAt annotation might not be relevant to that case, but it's still not working as expected. I'm not sure what causes this, but I have several applications deployed by Helm, and some of those I'm not able to redeploy using the Kubernetes Client. kubectl rollout restart works just fine, though. Could you point out what I can check to debug this, please?

b0m123 avatar May 20 '25 09:05 b0m123

kubectl rollout restart

AFAIR kubectl rollout restart does exactly the same thing we do (it adds an annotation to the pod template). So I'm not sure why it doesn't fail with kubectl but fails with ours.

The 422 status code is also confusing. The PATCH payload should be fine. The only thing I can think of is the API server having trouble parsing the restartedAt value. Have you tried retrying the restart with the Fabric8 Client after the first failure? If it's a non-deterministic parsing problem caused by the timestamp formatting, retrying multiple times should eventually work, which would help confirm this theory.
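
For example, a small retry loop like the following could be used to test that (a sketch only; the class name and retry count are arbitrary):

import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import io.fabric8.kubernetes.client.KubernetesClientException;

public class RestartWithRetry {
  public static void main(String[] args) throws InterruptedException {
    try (KubernetesClient client = new KubernetesClientBuilder().build()) {
      for (int attempt = 1; attempt <= 5; attempt++) {
        try {
          client.apps().statefulSets()
              .inNamespace("default")
              .withName("keycloak")
              .rolling()
              .restart();
          System.out.println("Restart succeeded on attempt " + attempt);
          return;
        } catch (KubernetesClientException e) {
          // A 422 here would mean the API server rejected the PATCH again
          System.out.println("Attempt " + attempt + " failed with code " + e.getCode());
          Thread.sleep(1000);
        }
      }
    }
  }
}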

manusa avatar May 20 '25 09:05 manusa

So I'm not sure why it doesn't fail with kubectl but fails with ours.

The problem is likely with using JSON Patch: depending on the existing state of the annotations, an explicit add op can make the patch application fail.

The original code, prior to https://github.com/fabric8io/kubernetes-client/pull/6723, was correct. To support mock server usage, rather than switching to JSON Patch, the fix probably should have been to use JSON Merge instead of Strategic Merge; we aren't dealing with an array here, so I think that would be fine.
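
For illustration, a caller-side sketch of sending the same annotation as a JSON Merge patch could look like this (this is not the library's internal implementation; the class name, timestamp formatting, and use of PatchContext/PatchType here are assumptions about how one might do it with the public API):

import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import io.fabric8.kubernetes.client.dsl.base.PatchContext;
import io.fabric8.kubernetes.client.dsl.base.PatchType;

import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class MergePatchRestart {
  public static void main(String[] args) {
    String restartedAt = ZonedDateTime.now().format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    // A JSON Merge patch creates the annotations map when it is missing,
    // instead of requiring the parent object to already exist
    String patch = "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":"
        + "{\"kubectl.kubernetes.io/restartedAt\":\"" + restartedAt + "\"}}}}}";
    try (KubernetesClient client = new KubernetesClientBuilder().build()) {
      client.apps().statefulSets()
          .inNamespace("default")
          .withName("keycloak")
          .patch(PatchContext.of(PatchType.JSON_MERGE), patch);
    }
  }
}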

@manusa can you try reverting to a JSON Merge patch instead?

shawkins avatar May 20 '25 12:05 shawkins

I was able to reproduce the error using the Kubernetes API directly. The PATCH request failed with a 422 response code:

curl --location --request PATCH 'https://testing.hostname.org:6443/apis/apps/v1/namespaces/default/deployments/ui' \
--header 'Content-Type: application/json-patch+json' \
--header 'Authorization: ***' \
--data '[{"op":"add","value":"2025-05-20T12:49:54.766","path":"/spec/template/metadata/annotations/kubectl.kubernetes.io~1restartedAt"}]'

But if the annotation already exists, the request works as expected.

Btw, strategic-merge-patch works in both cases

curl --location --request PATCH 'https://testing.hostname.org:6443/apis/apps/v1/namespaces/default/deployments/ui' \
--header 'Content-Type: application/strategic-merge-patch+json' \
--header 'Authorization: ***' \
--data '{
  "spec": {
    "template": {
      "metadata": {
        "annotations": {
          "kubectl.kubernetes.io/restartedAt": "2025-05-20T12:49:54.766+02:00"
        }
      }
    }
  }
}'

b0m123 avatar May 20 '25 12:05 b0m123

Btw, strategic-merge-patch works in both cases

Can you test json merge (merge-patch+json) as well?

shawkins avatar May 20 '25 13:05 shawkins

Btw, strategic-merge-patch works in both cases

Can you test json merge (merge-patch+json) as well?

It works in all cases

curl --location --request PATCH 'https://testing.hostname.org:6443/apis/apps/v1/namespaces/default/deployments/ui' \
--header 'Content-Type: application/merge-patch+json' \
--header 'Authorization: ***' \
--data '{
  "spec": {
    "template": {
      "metadata": {
        "annotations": {
          "kubectl.kubernetes.io/restartedAt": "2025-05-20T12:49:54.766+02:00"
        }
      }
    }
  }
}'

b0m123 avatar May 20 '25 13:05 b0m123

I guess we can follow this approach for the fix, but I'm worried about why this is now failing in the first place:

If we check the JSON Patch spec, add should perform an add-or-replace operation: https://datatracker.ietf.org/doc/html/rfc6902#section-4.1

Maybe it's because the annotations object doesn't exist in the first place? But I'm still not sure about this.
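
To illustrate that hypothesis with the public patch API (a sketch only; the deployment name and timestamp are taken from the curl tests above, the rest is assumed):

import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import io.fabric8.kubernetes.client.dsl.base.PatchContext;
import io.fabric8.kubernetes.client.dsl.base.PatchType;

public class JsonPatchParentCheck {
  public static void main(String[] args) {
    // This shape fails with 422 when /spec/template/metadata/annotations does not exist yet
    String addKeyOnly = "[{\"op\":\"add\","
        + "\"path\":\"/spec/template/metadata/annotations/kubectl.kubernetes.io~1restartedAt\","
        + "\"value\":\"2025-05-20T12:49:54.766\"}]";
    System.out.println("Patch that fails when annotations are absent: " + addKeyOnly);

    // This shape succeeds even without pre-existing annotations, because it adds the parent
    // object itself (caveat: it replaces any annotations map that is already there)
    String addParent = "[{\"op\":\"add\","
        + "\"path\":\"/spec/template/metadata/annotations\","
        + "\"value\":{\"kubectl.kubernetes.io/restartedAt\":\"2025-05-20T12:49:54.766\"}}]";
    try (KubernetesClient client = new KubernetesClientBuilder().build()) {
      client.apps().deployments()
          .inNamespace("default")
          .withName("ui")
          .patch(PatchContext.of(PatchType.JSON), addParent);
    }
  }
}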

manusa avatar May 20 '25 13:05 manusa

I just verified json-patch+json one more time. It fails only if there is no annotation present in the pod template metadata. If I add any annotation, the request works.

b0m123 avatar May 20 '25 13:05 b0m123

I think we've determined that the problem occurs when there are no annotations present, in which case a JSON Patch add can't automatically create the parent object.

shawkins avatar May 20 '25 13:05 shawkins