
Liveness probe of container "node-driver-registrar" fails

Open E2M-ITOPS opened this issue 3 years ago • 4 comments

What happened: We are running a TKGI-managed Kubernetes cluster in privileged mode. I tried to deploy several versions of csi-driver-smb, but it always ends up in a CrashLoopBackOff because a liveness probe fails.

I checked the kubelet locations and they seem to be fine; everything is located under /var/lib/kubelet as expected.
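For reference, these are the paths such a check looks at on a worker node (a sketch; it assumes the default kubelet root /var/lib/kubelet and the default plugin and registration directories used by node-driver-registrar):

    # directory where the smb plugin container creates its CSI socket, and where the
    # "registration" marker file from the error below is expected to appear
    ls -l /var/lib/kubelet/plugins/smb.csi.k8s.io/
    # directory where node-driver-registrar exposes the registration socket to the kubelet
    ls -l /var/lib/kubelet/plugins_registry/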

I get the following error from csi-smb-node:

Liveness probe failed: F0323 08:32:37.215762 38 main.go:157] Kubelet plugin registration hasn't succeeded yet, file=/var/lib/kubelet/plugins/smb.csi.k8s.io/registration doesn't exist.
goroutine 1 [running]:
k8s.io/klog/v2.stacks(0xc0000b0001, 0xc0000f8600, 0xa6, 0xfa)
    /mnt/vss/_work/1/go/src/github.com/kubernetes-csi/node-driver-registrar/vendor/k8s.io/klog/v2/klog.go:1026 +0xb9
k8s.io/klog/v2.(*loggingT).output(0xf9d840, 0xc000000003, 0x0, 0x0, 0xc000223c00, 0x0, 0xcd3ca3, 0x7, 0x9d, 0x0)
    /mnt/vss/_work/1/go/src/github.com/kubernetes-csi/node-driver-registrar/vendor/k8s.io/klog/v2/klog.go:975 +0x1e5
k8s.io/klog/v2.(*loggingT).printf(0xf9d840, 0xc000000003, 0x0, 0x0, 0x0, 0x0, 0xb32465, 0x48, 0xc0002a68b0, 0x1, ...)
    /mnt/vss/_work/1/go/src/github.com/kubernetes-csi/node-driver-registrar/vendor/k8s.io/klog/v2/klog.go:753 +0x19a
k8s.io/klog/v2.Fatalf(...)
    /mnt/vss/_work/1/go/src/github.com/kubernetes-csi/node-driver-registrar/vendor/k8s.io/klog/v2/klog.go:1514
main.main()
    /mnt/vss/_work/1/go/src/github.com/kubernetes-csi/node-driver-registrar/cmd/csi-node-driver-registrar/main.go:157 +0xfc6
goroutine 20 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0xf9d840)
    /mnt/vss/_work/1/go/src/github.com/kubernetes-csi/node-driver-registrar/vendor/k8s.io/klog/v2/klog.go:1169 +0x8b
created by k8s.io/klog/v2.init.0
    /mnt/vss/_work/1/go/src/github.com/kubernetes-csi/node-driver-registrar/vendor/k8s.io/klog/v2/klog.go:420 +0xdf

What you expected to happen:

A successful liveness probe of the csi-smb-node pods.

How to reproduce it:

See above.

Anything else we need to know?:

I don't think so.

Environment:

  • CSI Driver version: v1.5.0
  • Kubernetes version (use kubectl version): 1.22.2
  • OS (e.g. from /etc/os-release): Ubuntu 16.04.7 LTS (Xenial Xerus)
  • Kernel (e.g. uname -a):
  • Install tools: kubectl
  • Others:

E2M-ITOPS avatar Mar 23 '22 08:03 E2M-ITOPS

The error log shows it has an issue talking to the kubelet, similar to this issue: https://github.com/kubernetes-csi/csi-driver-smb/issues/335#issuecomment-896759440
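For context, the "bind" and "make shared" steps mentioned in the next comment amount to roughly the following on each worker node (a sketch, assuming the default kubelet root /var/lib/kubelet):

    # turn the kubelet directory into its own mount point
    mount --bind /var/lib/kubelet /var/lib/kubelet
    # mark it shared so mount events propagate into the csi-smb-node pods
    mount --make-shared /var/lib/kubelet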

andyzhangx avatar Mar 23 '22 10:03 andyzhangx

Thanks @andyzhangx, I applied the "bind" and "make shared" steps manually on the worker nodes.
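A quick way to verify that the change took effect on a node (assuming a reasonably recent util-linux) is to check the propagation flag of the kubelet mount:

    # PROPAGATION should now report "shared" for the kubelet root
    findmnt -o TARGET,PROPAGATION /var/lib/kubelet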

I still receive an error but this one is a bit different:

Liveness probe failed: F0324 13:45:49.693049 14 main.go:159] Kubelet plugin registration hasn't succeeded yet, file=/var/lib/kubelet/plugins/smb.csi.k8s.io/registration doesn't exist.
goroutine 1 [running]:
k8s.io/klog/v2.stacks(0xc0000b2001, 0xc000112600, 0xa6, 0xfa)
    /workspace/vendor/k8s.io/klog/v2/klog.go:1026 +0xb9
k8s.io/klog/v2.(*loggingT).output(0xffb200, 0xc000000003, 0x0, 0x0, 0xc0001b5960, 0x0, 0xd1fa25, 0x7, 0x9f, 0x0)
    /workspace/vendor/k8s.io/klog/v2/klog.go:975 +0x1e5
k8s.io/klog/v2.(*loggingT).printf(0xffb200, 0xc000000003, 0x0, 0x0, 0x0, 0x0, 0xb785b1, 0x48, 0xc00023cb40, 0x1, ...)
    /workspace/vendor/k8s.io/klog/v2/klog.go:753 +0x19a
k8s.io/klog/v2.Fatalf(...)
    /workspace/vendor/k8s.io/klog/v2/klog.go:1514
main.main()
    /workspace/cmd/csi-node-driver-registrar/main.go:159 +0xfc6
goroutine 20 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0xffb200)
    /workspace/vendor/k8s.io/klog/v2/klog.go:1169 +0x8b
created by k8s.io/klog/v2.init.0
    /workspace/vendor/k8s.io/klog/v2/klog.go:420 +0xdf

ghost avatar Mar 24 '22 13:03 ghost

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jun 22 '22 14:06 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Jul 22 '22 15:07 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Aug 21 '22 16:08 k8s-triage-robot

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Aug 21 '22 16:08 k8s-ci-robot