[DNS Error Attack Issue] Helper Pod unable to execute Sudo command when attack is initiated
LitmusChaos version: 2.6.0
Kubernetes version: v1.18.15
Context
We have set up Litmus in our Kubernetes environment, and its pods are up and running. When we try to run a DNS Error Attack (the pod-dns-error experiment), the following happens:
- Chaosengine successfully starts
- Chaosengine brings up a pod-dns-error pod
- Pod-dns-error pod will spin up a helper pod
The issue is that once this helper pod is spun up, it errors out with the following:
time="2022-02-08T17:53:54Z" level=error msg="[docker]: Failed to run docker inspect: sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?\n"
time="2022-02-08T17:53:54Z" level=fatal msg="helper pod failed, err: exit status 1"
Would someone be able to provide any pointers as to how we can resolve this?
Logs from Pods
DNS Error Pod
time="2022-02-17T17:32:53Z" level=info msg="Experiment Name: pod-dns-error"
time="2022-02-17T17:32:53Z" level=info msg="[PreReq]: Getting the ENV for the pod-dns-error experiment"
time="2022-02-17T17:32:55Z" level=info msg="[PreReq]: Updating the chaos result of pod-dns-error experiment (SOT)"
time="2022-02-17T17:33:08Z" level=info msg="[Status]: Checking the status of the helper pods"
time="2022-02-17T17:33:12Z" level=info msg="pod-dns-error-helper-usgokc helper pod is in Failed state"
time="2022-02-17T17:33:14Z" level=info msg="[Wait]: waiting till the completion of the helper pod"
time="2022-02-17T17:33:15Z" level=info msg="helper pod status: Failed"
time="2022-02-17T17:33:15Z" level=info msg="[Status]: The running status of Pods are as follows" Pod=pod-dns-error-helper-usgokc Status=Failed
time="2022-02-17T17:33:16Z" level=error msg="Chaos injection failed, err: helper pod failed"
DNS Error Helper Pod
time="2022-02-17T18:00:27Z" level=info msg="Helper Name: dns-chaos"
time="2022-02-17T18:00:27Z" level=info msg="[PreReq]: Getting the ENV variables"
time="2022-02-17T18:00:30Z" level=info msg="Container ID: sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?"
time="2022-02-17T18:00:30Z" level=error msg="[docker]: Failed to run docker inspect: sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?\n"
time="2022-02-17T18:00:30Z" level=fatal msg="helper pod failed, err: exit status 1"
cc: @uditgaurav @gdsoumya
It looks like your cluster has a privilege-escalation restriction in place that prevents sudo commands from running. Is that the case?
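One way to narrow this down: sudo prints this message whenever it starts without an effective UID of 0. Two common causes in containers are the `no_new_privs` flag (which Kubernetes sets when `allowPrivilegeEscalation` is false) and a `nosuid` mount. The following is only a minimal sketch of how to check for both. The function names are made up for illustration, and it assumes you can capture the contents of `/proc/self/status` and `/proc/mounts` from a container running with the same image and security context as the helper.

```python
def no_new_privs_set(status_text):
    """Return True if a /proc/<pid>/status dump reports NoNewPrivs: 1."""
    for line in status_text.splitlines():
        if line.startswith("NoNewPrivs:"):
            return line.split(":", 1)[1].strip() == "1"
    # Field not present (older kernels): cannot tell, treat as unset.
    return False


def nosuid_mounts(mounts_text):
    """Return the mount points in a /proc/mounts dump that carry nosuid."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and "nosuid" in fields[3].split(","):
            found.append(fields[1])
    return found
```

If `no_new_privs_set` returns True, or the mount that holds `/usr/bin/sudo` shows up in `nosuid_mounts`, the setuid bit on sudo is ignored, which would match the error in the helper log.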
We have defined a PSP with the least restrictive privileges:
privileged: true
allowPrivilegeEscalation: true
allowedCapabilities:
- '*'
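A permissive PSP only helps if it is the policy that actually admitted the helper pod, so it may be worth confirming what the helper pod ended up with. Below is a rough sketch, not an official Litmus tool: `summarize_pod_security` is a hypothetical helper that takes the pod object as a dict (for example, the parsed output of `kubectl get pod <helper-pod> -o json`) and reports the `kubernetes.io/psp` annotation together with the per-container `securityContext` fields that affect sudo. The sample pod at the bottom is invented purely for illustration.

```python
def summarize_pod_security(pod):
    """Collect the pod fields that decide whether sudo can escalate."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    summary = {
        # Annotation written by the PodSecurityPolicy admission controller.
        "psp": annotations.get("kubernetes.io/psp"),
        "containers": {},
    }
    for container in pod.get("spec", {}).get("containers", []):
        sc = container.get("securityContext") or {}
        summary["containers"][container["name"]] = {
            "privileged": sc.get("privileged"),
            "allowPrivilegeEscalation": sc.get("allowPrivilegeEscalation"),
            "runAsUser": sc.get("runAsUser"),
        }
    return summary


# Hypothetical pod object, for illustration only (not taken from this cluster).
sample_pod = {
    "metadata": {"annotations": {"kubernetes.io/psp": "some-other-psp"}},
    "spec": {
        "containers": [
            {"name": "helper", "securityContext": {"allowPrivilegeEscalation": False}}
        ]
    },
}
print(summarize_pod_security(sample_pod))
```

If the reported PSP name is not the permissive one, or `allowPrivilegeEscalation` comes back as false, that would point to the policy binding rather than to the experiment itself.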
Hi @uditgaurav @gdsoumya, please let us know if there is a fix for this issue, or an ETA for one. Thanks!
We have a document on PSP best practices here: https://litmuschaos.github.io/litmus/experiments/concepts/security/psp/
If you still have questions, please feel free to reach out at the Litmus Slack channel in k8s Slack (https://k8s.slack.com).