
Failed to load Seccomp Profile

brness opened this issue 2 years ago • 3 comments

What happened:

Following the installation-usage guide, I set up the SeccompProfile on each of my nodes, but when I create the Pod it gets stuck in a CreateContainerError state. Here is the description:

Error: failed to generate security options for container "test-container": failed to generate seccomp security options for container: cannot load seccomp profile "/var/lib/kubelet/seccomp/operator/profiles/audit.json": open /var/lib/kubelet/seccomp/operator/profiles/audit.json: no such file or directory

I'm not sure what is wrong, since I followed the instructions strictly. Even when I try the profile-binding method instead, it still fails.
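A quick node-side check (a sketch; both paths are taken from the error above) is to verify what the kubelet actually sees:

# on the affected worker node: does the exact path from the error exist?
ls -l /var/lib/kubelet/seccomp/operator/profiles/audit.json
# if "operator" is a symlink, confirm where it points and that the target exists
ls -ld /var/lib/kubelet/seccomp/operator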

What you expected to happen:

The pod runs successfully with the custom profile.

How to reproduce it (as minimally and precisely as possible):

Just follow the steps in installation-usage; a rough sketch of those steps follows.
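For reference, the steps boil down to roughly the following (a sketch only: my-namespace is a placeholder, and the CRD version and the profile's on-node path can differ between operator releases):

apiVersion: security-profiles-operator.x-k8s.io/v1beta1
kind: SeccompProfile
metadata:
  name: audit
  namespace: my-namespace
spec:
  defaultAction: SCMP_ACT_LOG
---
apiVersion: v1
kind: Pod
metadata:
  name: audit-pod
  namespace: my-namespace
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      # the operator writes the profile under its root on each node;
      # the exact relative path depends on the operator release
      localhostProfile: operator/my-namespace/audit.json
  containers:
  - name: test-container
    image: nginx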

Anything else we need to know?:

I tried the default profile provided in the moby repository by making a profiles/ dir under /var/lib/kubelet/seccomp; here is the pod configuration:

apiVersion: v1
kind: Pod
metadata:
  name: audit-pod
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: profiles/audit.json
  containers:
  - name: test-container
    image: nginx

It works well, so I'm not sure whether the failure is related to the operator's symlink. For reference, the node-side setup for this workaround amounts to the sketch below.
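This is only a sketch; the URL is the default seccomp profile in the moby repository, saved under the name the Pod spec references:

# place moby's default profile where localhostProfile: profiles/audit.json resolves
mkdir -p /var/lib/kubelet/seccomp/profiles
curl -fsSL -o /var/lib/kubelet/seccomp/profiles/audit.json \
  https://raw.githubusercontent.com/moby/moby/master/profiles/seccomp/default.json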

Environment:

k8s:

ai-sz-k8s-master-node-1   Ready      controlplane,etcd   186d   v1.20.10
ai-sz-k8s-master-node-2   Ready      controlplane,etcd   186d   v1.20.10
ai-sz-k8s-master-node-3   Ready      controlplane,etcd   186d   v1.20.10
ai-sz-k8s-worker-gpu-1    NotReady   worker              77d    v1.20.10
ai-sz-k8s-worker-gpu-2    NotReady   worker              77d    v1.20.10
ai-sz-k8s-worker-node-1   Ready      worker              186d   v1.20.10
ai-sz-k8s-worker-node-2   Ready      worker              186d   v1.20.10
ai-sz-k8s-worker-node-3   Ready      worker              106d   v1.20.10

docker:

Client: Docker Engine - Community
 Version:           20.10.8
 API version:       1.41
 Go version:        go1.16.6
 Git commit:        3967b7d
 Built:             Fri Jul 30 19:55:49 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.8
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.6
  Git commit:       75249d8
  Built:            Fri Jul 30 19:54:13 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.9
  GitCommit:        e25210fe30a0a703442421b0f60afac609f950a3
 runc:
  Version:          1.0.1
  GitCommit:        v1.0.1-0-g4144b63
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
  • Cloud provider or hardware configuration: local cluster
  • OS (e.g: cat /etc/os-release):
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
  • Kernel (e.g. uname -a):
Linux ai-sz-k8s-worker-node-1 3.10.0-1160.el7.x86_64 #1 SMP Mon Oct 19 16:18:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • Others:

brness avatar Apr 01 '22 08:04 brness

I also encountered the same issue on GKE. The pod's error complains that the profile is not at the given path; however, when I SSH into the node, the file is at the correct path. For some reason, the pod does not pick it up.

Events:
  Type     Reason          Age   From                                   Message
  ----     ------          ----  ----                                   -------
  Normal   Scheduled       5s    gke.io/optimize-utilization-scheduler  Successfully assigned default/ubuntu to gke-production-terraform-202203270619-198ef01d-14fv
  Normal   Pulling         3s    kubelet                                Pulling image "ubuntu"
  Normal   Pulled          2s    kubelet                                Successfully pulled image "ubuntu" in 292.343927ms
  Warning  Failed          2s    kubelet                                Error: failed to generate security options for container "ubuntu": failed to generate seccomp security options for container: cannot load seccomp profile "/var/lib/kubelet/seccomp/operator/seccomp/profile-allow-unsafe": open /var/lib/kubelet/seccomp/operator/seccomp/profile-allow-unsafe: no such file or directory
  Normal   SandboxChanged  2s    kubelet                                Pod sandbox changed, it will be killed and re-created.

Checking on my GKE node:

gke-production-nap-e2-highcpu-2-1ua8w-43e7ef2e-n6xv /var/lib/kubelet/seccomp/operator/seccomp # ls
profile-allow-unsafe.json  profile-complain-block-high-risk.json
profile-block-all.json     profile-complain-unsafe.json
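One way to cross-check (a sketch using the paths above) is to stat both the exact path from the event and the file as listed on disk:

# the event references the profile without a .json extension
stat /var/lib/kubelet/seccomp/operator/seccomp/profile-allow-unsafe
# while the files on disk carry one
stat /var/lib/kubelet/seccomp/operator/seccomp/profile-allow-unsafe.json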

shawnho1018 avatar Apr 04 '22 15:04 shawnho1018

We recently ran into something similar, and it turned out that the problem was the 63-character limit imposed on finalizer name parts. The tell-tale sign was:

controller.go:317] controller/profile "msg"="Reconciler error" "error"="cannot ensure node status: cannot create finalizer for nginx-1.19.1: wait on retry: retry function: SeccompProfile.security-profiles-operator.x-k8s.io \"nginx-1.19.1\" is invalid: metadata.finalizers: Invalid value: \"$REDACTED.internal-delete\": name part must be no more than 63 characters" "name"="nginx-1.19.1"

where $REDACTED was the node name, which was itself over 50 characters long, plus the -delete suffix.
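A quick way to check for this (a sketch; the -delete suffix matches the log above) is to measure each node name with the suffix appended:

# finalizer name parts must be at most 63 characters;
# the operator appends "-delete" to the node name
for n in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do
  suffixed="${n}-delete"
  echo "${suffixed}: ${#suffixed} chars"
done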

jhrozek avatar May 05 '22 10:05 jhrozek

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Aug 03 '22 10:08 k8s-triage-robot

Changing the file path of the seccompProfile helped. I set the OperatorRoot path from /var/lib/security-profile-operator to /var/lib/kubelet/security-profiles-operator (you need to change the volume as well), and now I can set the seccompProfile on the pod!
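For anyone applying the same change, the node-side volume that has to follow the new OperatorRoot looks roughly like this hypothetical DaemonSet fragment (the volume name is an assumption; only the hostPath values come from the comment above):

volumes:
- name: host-operator-volume      # assumed name; match your deployment
  hostPath:
    # must agree with the new OperatorRoot so the kubelet can resolve profiles
    path: /var/lib/kubelet/security-profiles-operator   # was /var/lib/security-profile-operator
    type: DirectoryOrCreate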

brness avatar Aug 24 '22 02:08 brness