After upgrading a cluster using Calico to kOps 1.23.1, kube-controller-manager shows a lot of error logs

Open DingGGu opened this issue 2 years ago • 5 comments

/kind bug

1. What kops version are you running? The command kops version will display this information.

Version 1.23.1 (git-83ccae81a636b8e870e430b6faaeeb5d10d9b832)

2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.

v1.22.8

3. What cloud provider are you using?

aws

4. What commands did you run? What is the simplest way to reproduce this issue?

kubectl -n kube-system logs kube-controller-manager

5. What happened after the commands executed?

kube-controller-manager repeatedly logs the following errors:

E0425 07:14:36.573615       1 plugins.go:752] "Error dynamically probing plugins" err="error creating Flexvolume plugin from directory nodeagent~uds, skipping. Error: unexpected end of JSON input"
E0425 07:14:36.674744       1 driver-call.go:262] Failed to unmarshal output for command: init, output: "", error: unexpected end of JSON input
W0425 07:14:36.674764       1 driver-call.go:149] FlexVolume: driver call failed: executable: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds/uds, args: [init], error: fork/exec /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds/uds: no such file or directory, output: ""
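
The three messages describe one condition: a plugin directory named nodeagent~uds exists, the executable uds expected inside it does not, so the init call returns empty output that cannot be parsed as JSON. A minimal sketch for spotting such directories, assuming the usual FlexVolume layout of <plugin dir>/<vendor>~<driver>/<driver>; the function name find_broken_flexvolume_dirs is hypothetical and not part of kOps, Calico, or Kubernetes:

```shell
#!/bin/sh
# Hypothetical helper: print FlexVolume plugin directories whose driver
# executable is missing or not executable.
# Assumed layout: <plugin_dir>/<vendor>~<driver>/<driver>
find_broken_flexvolume_dirs() {
    plugin_dir="$1"
    for dir in "$plugin_dir"/*/; do
        [ -d "$dir" ] || continue        # glob matched nothing
        name=$(basename "$dir")          # e.g. "nodeagent~uds"
        driver="${name##*"~"}"           # text after the last "~", e.g. "uds"
        # $dir already ends with "/", so "$dir$driver" is the driver path
        if [ ! -f "$dir$driver" ] || [ ! -x "$dir$driver" ]; then
            echo "$name"
        fi
    done
}
```

On an affected host this could be run as find_broken_flexvolume_dirs /usr/libexec/kubernetes/kubelet-plugins/volume/exec, using the path from the log above. Whether the leftover directory is safe to delete depends on what created it, so the sketch only reports and changes nothing.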

6. What did you expect to happen?

7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.

debian 10, calico

8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else do we need to know?

Maybe related to https://github.com/projectcalico/calico/issues/5356

DingGGu avatar May 05 '22 11:05 DingGGu

Also seeing this message on the same version.

jhohertz avatar Jun 05 '22 04:06 jhohertz

Also seeing this on kops 1.23.2

wskulley avatar Jun 24 '22 16:06 wskulley

Wondering if this is related to the channels apply bug.

You can try downloading the channels binary from the 1.24.0-beta.3 release and running:

channels apply channel $KOPS_STATE_STORE/$CLUSTER_NAME/addons/bootstrap-channel.yaml -v2

If it outputs failures, add the --yes flag.

olemarkus avatar Jun 27 '22 17:06 olemarkus

I was also seeing this exact same issue.

Running the channels binary had no impact, but as of kOps release 1.24.1, these errors have stopped.

I suspect this issue was in fact related to the upstream Calico bug and has now been resolved.

stewbernetes avatar Aug 03 '22 18:08 stewbernetes

Yes, I think the issue was resolved in https://github.com/projectcalico/calico/pull/6605

ScOut3R avatar Oct 13 '22 11:10 ScOut3R

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 11 '23 12:01 k8s-triage-robot

The errors reappear with kOps 1.26.5 and Kubernetes 1.24.16, this time in the kubelet logs:

Aug 10 09:56:27 ip-x-x-x-x kubelet[2687]: W0810 09:56:27.645876    2687 driver-call.go:149] FlexVolume: driver call failed: executable: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds/uds, args: [init], error: fork/exec /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds/uds: no such file or directory, output: ""
Aug 10 09:56:28 ip-x-x-x-x kubelet[2687]: E0810 09:56:28.653319    2687 plugins.go:740] "Error dynamically probing plugins" err="error creating Flexvolume plugin from directory nodeagent~uds, skipping. Error: unexpected end of JSON input"
Aug 10 09:56:28 ip-x-x-x-x kubelet[2687]: E0810 09:56:28.655167    2687 driver-call.go:262] Failed to unmarshal output for command: init, output: "", error: unexpected end of JSON input 

DingGGu avatar Aug 11 '23 07:08 DingGGu

/lifecycle rotten

k8s-triage-robot avatar Jan 20 '24 03:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Feb 19 '24 04:02 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Feb 19 '24 04:02 k8s-ci-robot