security-profiles-operator
bpf-recorder is not valid for spod pods
What happened:
After installing SPO, verifying that the bpf-recorder is up and running returns an error: container bpf-recorder is not valid for the spod pod. When I try to enable it by patching the spod configuration, all spod pods crash and are never able to restart successfully.
What you expected to happen:
The bpf-recorder is up and running.
How to reproduce it (as minimally and precisely as possible):
root@k8s-master:~# kubectl get pods -n security-profiles-operator
NAME READY STATUS RESTARTS AGE
security-profiles-operator-8588b78997-4p2zv 1/1 Running 0 59s
security-profiles-operator-8588b78997-nlvxf 1/1 Running 0 59s
security-profiles-operator-8588b78997-wctnn 1/1 Running 0 59s
security-profiles-operator-webhook-8476cd6f8c-d7qqb 1/1 Running 0 56s
security-profiles-operator-webhook-8476cd6f8c-f2zqs 1/1 Running 0 56s
security-profiles-operator-webhook-8476cd6f8c-vq6z2 1/1 Running 0 56s
spod-2vkz6 2/2 Running 0 56s
spod-g9d2m 2/2 Running 0 56s
spod-kd4b6 2/2 Running 0 56s
root@k8s-master:~# kubectl -n security-profiles-operator logs --selector name=spod -c bpf-recorder
error: container bpf-recorder is not valid for pod spod-2vkz6
root@k8s-master:~# kubectl -n security-profiles-operator patch spod spod --type=merge -p '{"spec":{"enableBpfRecorder":true}}'
securityprofilesoperatordaemon.security-profiles-operator.x-k8s.io/spod patched
root@k8s-master:~# kubectl get pods -n security-profiles-operator
NAME READY STATUS RESTARTS AGE
security-profiles-operator-8588b78997-4p2zv 1/1 Running 0 22m
security-profiles-operator-8588b78997-nlvxf 1/1 Running 0 22m
security-profiles-operator-8588b78997-wctnn 1/1 Running 0 22m
security-profiles-operator-webhook-8476cd6f8c-d7qqb 1/1 Running 0 21m
security-profiles-operator-webhook-8476cd6f8c-f2zqs 1/1 Running 0 21m
security-profiles-operator-webhook-8476cd6f8c-vq6z2 1/1 Running 0 21m
spod-28qd6 2/3 CrashLoopBackOff 7 (5m3s ago) 16m
spod-2msmj 2/3 Error 8 (5m6s ago) 16m
spod-rp2vz 2/3 CrashLoopBackOff 7 (4m50s ago) 16m
Anything else we need to know?:
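Reverting the patch presumably brings the spod pods back to their previous 2/2 Running state (a sketch of the inverse of the patch above; not separately verified, I only did a full uninstall/re-install):
# hypothetical rollback: disable the BPF recorder again via the same merge patch
kubectl -n security-profiles-operator patch spod spod --type=merge -p '{"spec":{"enableBpfRecorder":false}}'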
Environment:
- Cloud provider or hardware configuration: VM nodes
- OS (e.g. cat /etc/os-release): NAME="Ubuntu", VERSION="20.04.6 LTS"
- Kernel (e.g. uname -a): 5.4.0-156-generic
- Others:
Hey @shaojini, thank you for the report. Can you extract the crash logs of the spod instances, like spod-28qd6?
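Something like the following should work (a sketch; the pod name is taken from the listing above, and --previous pulls the logs of the last terminated attempt):
# fetch the logs of the last crashed bpf-recorder attempt in one spod pod
kubectl -n security-profiles-operator logs spod-28qd6 -c bpf-recorder --previous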
root@k8s-master:~# kubectl -n security-profiles-operator logs spod-6579n
Defaulted container "security-profiles-operator" out of: security-profiles-operator, bpf-recorder, metrics, non-root-enabler (init)
I0817 12:45:07.665328 1346435 main.go:260] "msg"="Set logging verbosity to 0"
I0817 12:45:07.666835 1346435 main.go:266] "msg"="Profiling support enabled: false"
I0817 12:45:07.667151 1346435 main.go:286] setup "msg"="starting component: spod" "buildDate"="1980-01-01T00:00:00Z" "buildTags"="netgo,osusergo,seccomp,apparmor" "cgoldFlags"="unknown" "compiler"="gc" "dependencies"="cloud.google.com/go/compute/metadata v0.2.3 ,cuelang.org/go v0.5.0 ,filippo.io/edwards25519 v1.0.0 ,github.com/AliyunContainerService/ack-ram-tool/pkg/credentials/alibabacloudsdkgo/helper v0.2.0 ,github.com/Azure/azure-sdk-for-go v68.0.0+incompatible ,github.com/Azure/go-autorest/autorest v0.11.29 ,github.com/Azure/go-autorest/autorest/adal v0.9.22 ,github.com/Azure/go-autorest/autorest/azure/auth v0.5.12 ,github.com/Azure/go-autorest/autorest/azure/cli v0.4.6 ,github.com/Azure/go-autorest/autorest/date v0.3.0 ,github.com/Azure/go-autorest/logger v0.2.1 ,github.com/Azure/go-autorest/tracing v0.6.0 ,github.com/OneOfOne/xxhash v1.2.8 ,github.com/ProtonMail/go-crypto v0.0.0-20230518184743-7afd39499903 ,github.com/acobaugh/osrelease v0.1.0 ,github.com/agnivade/levenshtein v1.1.1 ,github.com/alibabacloud-go/alibabacloud-gateway-spi v0.0.4 ,github.com/alibabacloud-go/cr-20160607 v1.0.1 ,github.com/alibabacloud-go/cr-20181201 v1.0.10 ,github.com/alibabacloud-go/darabonba-openapi v0.1.18 ,github.com/alibabacloud-go/debug v0.0.0-20190504072949-9472017b5c68 ,github.com/alibabacloud-go/endpoint-util v1.1.1 ,github.com/alibabacloud-go/openapi-util v0.0.11 ,github.com/alibabacloud-go/tea v1.1.18 ,github.com/alibabacloud-go/tea-utils v1.4.4 ,github.com/alibabacloud-go/tea-xml v1.1.2 ,github.com/aliyun/credentials-go v1.2.3 ,github.com/aquasecurity/libbpfgo v0.4.9-libbpf-1.2.0 ,github.com/asaskevich/govalidator v0.0.0-20230301143203-a9d515a09cc2 ,github.com/aws/aws-sdk-go-v2 v1.18.1 ,github.com/aws/aws-sdk-go-v2/config v1.18.27 ,github.com/aws/aws-sdk-go-v2/credentials v1.13.26 ,github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.13.4 ,github.com/aws/aws-sdk-go-v2/internal/configsources v1.1.34 ,github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.4.28 ,github.com/aws/aws-sdk-go-v2/internal/ini v1.3.35 ,github.com/aws/aws-sdk-go-v2/service/ecr v1.15.0 ,github.com/aws/aws-sdk-go-v2/service/ecrpublic v1.12.0 ,github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.9.28 ,github.com/aws/aws-sdk-go-v2/service/sso v1.12.12 ,github.com/aws/aws-sdk-go-v2/service/ssooidc v1.14.12 ,github.com/aws/aws-sdk-go-v2/service/sts v1.19.2 ,github.com/aws/smithy-go v1.13.5 ,github.com/awslabs/amazon-ecr-credential-helper/ecr-login v0.0.0-20220228164355-396b2034c795 ,github.com/beorn7/perks v1.0.1 ,github.com/blang/semver v3.5.1+incompatible ,github.com/blang/semver/v4 v4.0.0 ,github.com/buildkite/agent/v3 v3.49.0 ,github.com/cert-manager/cert-manager v1.12.3 ,github.com/cespare/xxhash/v2 v2.2.0 ,github.com/chrismellard/docker-credential-acr-env v0.0.0-20220119192733-fe33c00cee21 ,github.com/clbanning/mxj/v2 v2.5.6 ,github.com/cloudflare/circl v1.3.3 ,github.com/cockroachdb/apd/v2 v2.0.2 ,github.com/common-nighthawk/go-figure v0.0.0-20210622060536-734e95fb86be ,github.com/containerd/stargz-snapshotter/estargz v0.14.3 ,github.com/containers/common v0.55.3 ,github.com/coreos/go-oidc/v3 v3.6.0 ,github.com/cpuguy83/go-md2man/v2 v2.0.2 ,github.com/cyberphone/json-canonicalization v0.0.0-20230514072755-504adb8a8af1 ,github.com/davecgh/go-spew v1.1.1 ,github.com/digitorus/pkcs7 v0.0.0-20221212123742-001c36b64ec3 ,github.com/digitorus/timestamp v0.0.0-20221019182153-ef3b63b79b31 ,github.com/dimchansky/utfbom v1.1.1 ,github.com/docker/cli v24.0.0+incompatible ,github.com/docker/distribution v2.8.2+incompatible 
,github.com/docker/docker v24.0.2+incompatible ,github.com/docker/docker-credential-helpers v0.7.0 ,github.com/emicklei/go-restful/v3 v3.9.0 ,github.com/emicklei/proto v1.10.0 ,github.com/evanphx/json-patch/v5 v5.6.0 ,github.com/fsnotify/fsnotify v1.6.0 ,github.com/gabriel-vasile/mimetype v1.4.2 ,github.com/ghodss/yaml v1.0.0 ,github.com/go-chi/chi v4.1.2+incompatible ,github.com/go-jose/go-jose/v3 v3.0.0 ,github.com/go-logr/logr v1.2.4 ,github.com/go-logr/stdr v1.2.2 ,github.com/go-openapi/analysis v0.21.4 ,github.com/go-openapi/errors v0.20.3 ,github.com/go-openapi/jsonpointer v0.19.6 ,github.com/go-openapi/jsonreference v0.20.2 ,github.com/go-openapi/loads v0.21.2 ,github.com/go-openapi/runtime v0.26.0 ,github.com/go-openapi/spec v0.20.9 ,github.com/go-openapi/strfmt v0.21.7 ,github.com/go-openapi/swag v0.22.4 ,github.com/go-openapi/validate v0.22.1 ,github.com/go-playground/locales v0.14.1 ,github.com/go-playground/universal-translator v0.18.1 ,github.com/go-playground/validator/v10 v10.14.0 ,github.com/gobwas/glob v0.2.3 ,github.com/gogo/protobuf v1.3.2 ,github.com/golang-jwt/jwt/v4 v4.5.0 ,github.com/golang/glog v1.1.0 ,github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da ,github.com/golang/protobuf v1.5.3 ,github.com/golang/snappy v0.0.4 ,github.com/google/certificate-transparency-go v1.1.6 ,github.com/google/gnostic-models v0.6.8 ,github.com/google/go-cmp v0.5.9 ,github.com/google/go-containerregistry v0.16.1 ,github.com/google/go-github/v50 v50.2.0 ,github.com/google/go-querystring v1.1.0 ,github.com/google/gofuzz v1.2.0 ,github.com/google/s2a-go v0.1.4 ,github.com/google/uuid v1.3.0 ,github.com/googleapis/enterprise-certificate-proxy v0.2.4 ,github.com/hashicorp/go-cleanhttp v0.5.2 ,github.com/hashicorp/go-retryablehttp v0.7.2 ,github.com/hashicorp/hcl v1.0.0 ,github.com/imdario/mergo v0.3.16 ,github.com/in-toto/in-toto-golang v0.9.0 ,github.com/jedisct1/go-minisign v0.0.0-20211028175153-1c139d1cc84b ,github.com/jellydator/ttlcache/v3 v3.0.1 ,github.com/jmespath/go-jmespath v0.4.0 ,github.com/josharian/intern v1.0.0 ,github.com/json-iterator/go v1.1.12 ,github.com/klauspost/compress v1.16.6 ,github.com/leodido/go-urn v1.2.4 ,github.com/letsencrypt/boulder v0.0.0-20230213213521-fdfea0d469b6 ,github.com/magiconair/properties v1.8.7 ,github.com/mailru/easyjson v0.7.7 ,github.com/matttproud/golang_protobuf_extensions v1.0.4 ,github.com/mitchellh/go-homedir v1.1.0 ,github.com/mitchellh/go-wordwrap v1.0.1 ,github.com/mitchellh/mapstructure v1.5.0 ,github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd ,github.com/modern-go/reflect2 v1.0.2 ,github.com/mozillazg/docker-credential-acr-helper v0.3.0 ,github.com/mpvl/unique v0.0.0-20150818121801-cbe035fff7de ,github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 ,github.com/nozzle/throttler v0.0.0-20180817012639-2ea982251481 ,github.com/nxadm/tail v1.4.8 ,github.com/oklog/ulid v1.3.1 ,github.com/open-policy-agent/opa v0.52.0 ,github.com/opencontainers/go-digest v1.0.0 ,github.com/opencontainers/image-spec v1.1.0-rc4 ,github.com/opencontainers/runtime-spec v1.1.0 ,github.com/openshift/api v0.0.0-20221205111557-f2fbb1d1cd5e ,github.com/opentracing/opentracing-go v1.2.0 ,github.com/pborman/uuid v1.2.1 ,github.com/pelletier/go-toml/v2 v2.0.8 ,github.com/pjbgf/go-apparmor v0.1.2 ,github.com/pkg/errors v0.9.1 ,github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring v0.67.1 ,github.com/prometheus/client_golang v1.16.0 ,github.com/prometheus/client_model v0.4.0 ,github.com/prometheus/common v0.42.0 
,github.com/prometheus/procfs v0.10.1 ,github.com/protocolbuffers/txtpbfmt v0.0.0-20220428173112-74888fd59c2b ,github.com/rcrowley/go-metrics v0.0.0-20201227073835-cf1acfcdf475 ,github.com/russross/blackfriday/v2 v2.1.0 ,github.com/sassoftware/relic v7.2.1+incompatible ,github.com/seccomp/libseccomp-golang v0.10.0 ,github.com/secure-systems-lab/go-securesystemslib v0.6.0 ,github.com/segmentio/ksuid v1.0.4 ,github.com/shibumi/go-pathspec v1.3.0 ,github.com/sigstore/cosign/v2 v2.1.1 ,github.com/sigstore/fulcio v1.3.1 ,github.com/sigstore/rekor v1.2.2-0.20230601122533-4c81ff246d12 ,github.com/sigstore/sigstore v1.7.1 ,github.com/sigstore/timestamp-authority v1.1.1 ,github.com/sirupsen/logrus v1.9.3 ,github.com/skratchdot/open-golang v0.0.0-20200116055534-eef842397966 ,github.com/spf13/afero v1.9.5 ,github.com/spf13/cast v1.5.1 ,github.com/spf13/cobra v1.7.0 ,github.com/spf13/jwalterweatherman v1.1.0 ,github.com/spf13/pflag v1.0.5 ,github.com/spf13/viper v1.16.0 ,github.com/spiffe/go-spiffe/v2 v2.1.6 ,github.com/subosito/gotenv v1.4.2 ,github.com/syndtr/goleveldb v1.0.1-0.20220721030215-126854af5e6d ,github.com/tchap/go-patricia/v2 v2.3.1 ,github.com/theupdateframework/go-tuf v0.5.2 ,github.com/titanous/rocacheck v0.0.0-20171023193734-afe73141d399 ,github.com/tjfoc/gmsm v1.3.2 ,github.com/transparency-dev/merkle v0.0.2 ,github.com/urfave/cli/v2 v2.25.7 ,github.com/vbatts/tar-split v0.11.3 ,github.com/xanzy/go-gitlab v0.86.0 ,github.com/xeipuuv/gojsonpointer v0.0.0-20190905194746-02993c407bfb ,github.com/xeipuuv/gojsonreference v0.0.0-20180127040603-bd5ef7bd5415 ,github.com/xrash/smetrics v0.0.0-20201216005158-039620a65673 ,github.com/yashtewari/glob-intersection v0.1.0 ,github.com/zeebo/errs v1.3.0 ,go.mongodb.org/mongo-driver v1.11.3 ,go.opencensus.io v0.24.0 ,go.opentelemetry.io/otel v1.16.0 ,go.opentelemetry.io/otel/metric v1.16.0 ,go.opentelemetry.io/otel/trace v1.16.0 ,go.step.sm/crypto v0.32.1 ,go.uber.org/atomic v1.10.0 ,go.uber.org/multierr v1.11.0 ,go.uber.org/zap v1.24.0 ,golang.org/x/crypto v0.12.0 ,golang.org/x/exp v0.0.0-20230522175609-2e198f4a06a1 ,golang.org/x/mod v0.12.0 ,golang.org/x/net v0.14.0 ,golang.org/x/oauth2 v0.9.0 ,golang.org/x/sync v0.3.0 ,golang.org/x/sys v0.11.0 ,golang.org/x/term v0.11.0 ,golang.org/x/text v0.12.0 ,golang.org/x/time v0.3.0 ,gomodules.xyz/jsonpatch/v2 v2.3.0 ,google.golang.org/api v0.128.0 ,google.golang.org/appengine v1.6.7 ,google.golang.org/genproto/googleapis/rpc v0.0.0-20230530153820-e85fd2cbaebc ,google.golang.org/grpc v1.57.0 ,google.golang.org/protobuf v1.31.0 ,gopkg.in/go-jose/go-jose.v2 v2.6.1 ,gopkg.in/inf.v0 v0.9.1 ,gopkg.in/ini.v1 v1.67.0 ,gopkg.in/square/go-jose.v2 v2.6.0 ,gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7 ,gopkg.in/yaml.v2 v2.4.0 ,gopkg.in/yaml.v3 v3.0.1 ,k8s.io/api v0.28.0 ,k8s.io/apiextensions-apiserver v0.27.2 ,k8s.io/apimachinery v0.28.0 ,k8s.io/client-go v0.28.0 ,k8s.io/component-base v0.27.2 ,k8s.io/klog/v2 v2.100.1 ,k8s.io/kube-openapi v0.0.0-20230717233707-2695361300d9 ,k8s.io/utils v0.0.0-20230505201702-9f6742963106 ,oras.land/oras-go/v2 v2.2.1 ,sigs.k8s.io/controller-runtime v0.15.1 ,sigs.k8s.io/gateway-api v0.7.0 ,sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd ,sigs.k8s.io/release-utils v0.7.4 ,sigs.k8s.io/structured-merge-diff/v4 v4.2.3 ,sigs.k8s.io/yaml v1.3.0 " "gitCommit"="6d51dc8d1bdae339b47facd5c9b8a0e884c30ff8" "gitCommitDate"="2023-08-17T07:36:21Z" "gitTreeState"="clean" "goVersion"="go1.20.4" "ldFlags"="unknown" "libbpf"="v1.2" "libseccomp"="2.5.4" "platform"="linux/amd64" 
"version"="0.8.1-dev"
I0817 12:45:07.667702 1346435 main.go:365] setup "msg"="watching all namespaces"
I0817 12:45:07.668061 1346435 listener.go:44] controller-runtime/metrics "msg"="Metrics server is starting to listen" "addr"=":8080"
I0817 12:45:07.668442 1346435 metrics.go:217] metrics "msg"="Registering metric: seccomp_profile_error_total"
I0817 12:45:07.668482 1346435 metrics.go:217] metrics "msg"="Registering metric: selinux_profile_audit_total"
I0817 12:45:07.668497 1346435 metrics.go:217] metrics "msg"="Registering metric: apparmor_profile_total"
I0817 12:45:07.668504 1346435 metrics.go:217] metrics "msg"="Registering metric: apparmor_profile_audit_total"
I0817 12:45:07.668512 1346435 metrics.go:217] metrics "msg"="Registering metric: seccomp_profile_total"
I0817 12:45:07.668524 1346435 metrics.go:217] metrics "msg"="Registering metric: seccomp_profile_bpf_total"
I0817 12:45:07.668531 1346435 metrics.go:217] metrics "msg"="Registering metric: selinux_profile_error_total"
I0817 12:45:07.668539 1346435 metrics.go:217] metrics "msg"="Registering metric: apparmor_profile_error_total"
I0817 12:45:07.668546 1346435 metrics.go:217] metrics "msg"="Registering metric: seccomp_profile_audit_total"
I0817 12:45:07.668553 1346435 metrics.go:217] metrics "msg"="Registering metric: selinux_profile_total"
I0817 12:45:07.669531 1346435 grpc.go:60] metrics "msg"="Starting GRPC server API"
I0817 12:45:07.707643 1346435 profilerecorder.go:144] recorder-spod "msg"="Setting up profile recorder" "Node"="192.168.0.11"
I0817 12:45:07.707706 1346435 main.go:486] setup "msg"="starting daemon"
I0817 12:45:07.707891 1346435 server.go:50] "msg"="starting server" "addr"={"IP":"::","Port":8080,"Zone":""} "kind"="metrics" "path"="/metrics"
I0817 12:45:07.707968 1346435 internal.go:360] "msg"="Starting server" "addr"={"IP":"::","Port":8085,"Zone":""} "kind"="health probe"
I0817 12:45:07.708018 1346435 controller.go:177] "msg"="Starting EventSource" "controller"="profile" "controllerGroup"="security-profiles-operator.x-k8s.io" "controllerKind"="SeccompProfile" "source"="kind source: *v1beta1.SeccompProfile"
I0817 12:45:07.708044 1346435 controller.go:177] "msg"="Starting EventSource" "controller"="profile" "controllerGroup"="security-profiles-operator.x-k8s.io" "controllerKind"="SeccompProfile" "source"="kind source: *v1alpha1.SecurityProfilesOperatorDaemon"
I0817 12:45:07.708060 1346435 controller.go:185] "msg"="Starting Controller" "controller"="profile" "controllerGroup"="security-profiles-operator.x-k8s.io" "controllerKind"="SeccompProfile"
I0817 12:45:07.708061 1346435 controller.go:177] "msg"="Starting EventSource" "controller"="profilerecorder" "controllerGroup"="" "controllerKind"="Pod" "source"="kind source: *v1.Pod"
I0817 12:45:07.708076 1346435 controller.go:185] "msg"="Starting Controller" "controller"="profilerecorder" "controllerGroup"="" "controllerKind"="Pod"
I0817 12:45:07.899659 1346435 controller.go:219] "msg"="Starting workers" "controller"="profile" "controllerGroup"="security-profiles-operator.x-k8s.io" "controllerKind"="SeccompProfile" "worker count"=1
I0817 12:45:07.941865 1346435 controller.go:219] "msg"="Starting workers" "controller"="profilerecorder" "controllerGroup"="" "controllerKind"="Pod" "worker count"=1
Hi @saschagrunert,
Any comments on this issue? Thanks.
@shaojini we need to find out why the pod crashed; the logs in https://github.com/kubernetes-sigs/security-profiles-operator/issues/1837#issuecomment-1682228753 do not indicate any crash at all. Do you have the logs of the crashing pod available?
Hi @saschagrunert,
I have tried uninstalling and re-installing a few times, but the problem is the same. The logs given previously were taken from one of those tries (before reporting the issue, I tried at least twice to confirm it). The pod restarts themselves may be expected, since the patch to the spod configuration was applied (the spod pod names changed). However, the crashing pods are never able to restart successfully (perhaps the describe output of a spod pod below provides some information):
root@k8s-master:~# kubectl get pods -n security-profiles-operator
NAME READY STATUS RESTARTS AGE
security-profiles-operator-8588b78997-8cm8z 1/1 Running 0 17h
security-profiles-operator-8588b78997-9rg9j 1/1 Running 0 17h
security-profiles-operator-8588b78997-csrhk 1/1 Running 0 17h
security-profiles-operator-webhook-8476cd6f8c-g9m5v 1/1 Running 0 17h
security-profiles-operator-webhook-8476cd6f8c-nh57n 1/1 Running 0 17h
security-profiles-operator-webhook-8476cd6f8c-qzpk5 1/1 Running 0 17h
spod-lbcnc 3/3 Running 0 17h
spod-t5vf6 3/3 Running 0 17h
spod-wg9w7 3/3 Running 0 17h
root@k8s-master:~# kubectl -n security-profiles-operator logs --selector name=spod -c bpf-recorder
error: container bpf-recorder is not valid for pod spod-lbcnc
root@k8s-master:~# kubectl -n security-profiles-operator patch spod spod --type=merge -p '{"spec":{"enableBpfRecorder":true}}'
securityprofilesoperatordaemon.security-profiles-operator.x-k8s.io/spod patched
root@k8s-master:~# kubectl get pods -n security-profiles-operator
NAME READY STATUS RESTARTS AGE
security-profiles-operator-8588b78997-8cm8z 1/1 Running 0 17h
security-profiles-operator-8588b78997-9rg9j 1/1 Running 0 17h
security-profiles-operator-8588b78997-csrhk 1/1 Running 0 17h
security-profiles-operator-webhook-8476cd6f8c-g9m5v 1/1 Running 0 17h
security-profiles-operator-webhook-8476cd6f8c-nh57n 1/1 Running 0 17h
security-profiles-operator-webhook-8476cd6f8c-qzpk5 1/1 Running 0 17h
spod-ppm5q 3/4 CrashLoopBackOff 5 (54s ago) 4m12s
spod-xn6wt 3/4 CrashLoopBackOff 5 (62s ago) 4m11s
spod-xwkft 3/4 CrashLoopBackOff 5 (49s ago) 4m11s
root@k8s-master:~# kubectl describe -n security-profiles-operator pod spod-ppm5q
Name: spod-ppm5q
Namespace: security-profiles-operator
Priority: 2000001000
Priority Class Name: system-node-critical
Service Account: spod
Node: k8s-worker3/192.168.0.11
Start Time: Tue, 22 Aug 2023 11:36:36 +0300
Labels: app=security-profiles-operator
controller-revision-hash=6f798fcbb9
name=spod
pod-template-generation=3
Annotations: openshift.io/scc: privileged
Status: Running
SeccompProfile: RuntimeDefault
IP: 10.0.0.99
IPs:
IP: 10.0.0.99
Controlled By: DaemonSet/spod
Init Containers:
non-root-enabler:
Container ID: cri-o://5f1ec4f1c35b36ee0e940fb6d73e553a15bbc5ea1483f18c246aecc549236ebc
Image: gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest
Image ID: gcr.io/k8s-staging-sp-operator/security-profiles-operator@sha256:40f98b564084d46acac519a515032e5602b6eec480d221771053c96f4057811d
Port: <none>
Host Port: <none>
Args:
non-root-enabler
--runtime=cri-o
State: Terminated
Reason: Completed
Exit Code: 0
Started: Tue, 22 Aug 2023 11:36:44 +0300
Finished: Tue, 22 Aug 2023 11:36:44 +0300
Ready: True
Restart Count: 0
Limits:
ephemeral-storage: 50Mi
memory: 64Mi
Requests:
cpu: 100m
ephemeral-storage: 10Mi
memory: 32Mi
Environment:
NODE_NAME: (v1:spec.nodeName)
KUBELET_DIR: /var/lib/kubelet
SPO_VERBOSITY: 0
Mounts:
/host from host-root-volume (rw)
/opt/spo-profiles from operator-profiles-volume (ro)
/var/lib from host-varlib-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-g8x8b (ro)
/var/run/secrets/metrics from metrics-cert-volume (rw)
Containers:
security-profiles-operator:
Container ID: cri-o://b8f632a1fcfc144f57b6042ba44fa77ecbb5399163d77fe6a1ffb8529bf7fab8
Image: gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest
Image ID: gcr.io/k8s-staging-sp-operator/security-profiles-operator@sha256:40f98b564084d46acac519a515032e5602b6eec480d221771053c96f4057811d
Port: 8085/TCP
Host Port: 0/TCP
SeccompProfile: Localhost
LocalhostProfile: security-profiles-operator.json
Args:
daemon
--with-recording=true
State: Running
Started: Tue, 22 Aug 2023 11:36:47 +0300
Ready: True
Restart Count: 0
Limits:
ephemeral-storage: 200Mi
memory: 128Mi
Requests:
cpu: 100m
ephemeral-storage: 50Mi
memory: 64Mi
Liveness: http-get http://:liveness-port/healthz delay=0s timeout=1s period=10s #success=1 #failure=1
Startup: http-get http://:liveness-port/healthz delay=0s timeout=1s period=3s #success=1 #failure=10
Environment:
NODE_NAME: (v1:spec.nodeName)
OPERATOR_NAMESPACE: security-profiles-operator (v1:metadata.namespace)
SPOD_NAME: spod
KUBELET_DIR: /var/lib/kubelet
HOME: /home
ENABLE_LOG_ENRICHER: false
ENABLE_BPF_RECORDER: false
SPO_VERBOSITY: 0
Mounts:
/etc/selinux.d from selinux-drop-dir (rw)
/home from home-volume (rw)
/tmp from tmp-volume (rw)
/tmp/security-profiles-operator-recordings from profile-recording-output-volume (rw)
/var/lib/kubelet/seccomp/operator from host-operator-volume (rw)
/var/run/grpc from grpc-server-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-g8x8b (ro)
/var/run/selinuxd from selinuxd-private-volume (rw)
log-enricher:
Container ID: cri-o://33b914d4e40dc990d51b7b19e6a7470b3446b90a6786f1a39764d2ec7ca2630e
Image: gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest
Image ID: gcr.io/k8s-staging-sp-operator/security-profiles-operator@sha256:40f98b564084d46acac519a515032e5602b6eec480d221771053c96f4057811d
Port: <none>
Host Port: <none>
Args:
log-enricher
State: Running
Started: Tue, 22 Aug 2023 11:36:48 +0300
Ready: True
Restart Count: 0
Limits:
ephemeral-storage: 128Mi
memory: 256Mi
Requests:
cpu: 50m
ephemeral-storage: 10Mi
memory: 64Mi
Environment:
NODE_NAME: (v1:spec.nodeName)
KUBELET_DIR: /var/lib/kubelet
SPO_VERBOSITY: 0
Mounts:
/var/log from host-syslog-volume (ro)
/var/log/audit from host-auditlog-volume (ro)
/var/run/grpc from grpc-server-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-g8x8b (ro)
bpf-recorder:
Container ID: cri-o://1f26b655d092359cbd22fc47a72a6f065b0faee1cd92f44b542a5bb129241f62
Image: gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest
Image ID: gcr.io/k8s-staging-sp-operator/security-profiles-operator@sha256:40f98b564084d46acac519a515032e5602b6eec480d221771053c96f4057811d
Port: <none>
Host Port: <none>
Args:
bpf-recorder
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 22 Aug 2023 11:42:37 +0300
Finished: Tue, 22 Aug 2023 11:42:37 +0300
Ready: False
Restart Count: 6
Limits:
ephemeral-storage: 20Mi
memory: 128Mi
Requests:
cpu: 50m
ephemeral-storage: 10Mi
memory: 64Mi
Environment:
NODE_NAME: (v1:spec.nodeName)
KUBELET_DIR: /var/lib/kubelet
SPO_VERBOSITY: 0
Mounts:
/etc/os-release from host-etc-osrelease-volume (rw)
/sys/kernel/debug from sys-kernel-debug-volume (ro)
/tmp from tmp-volume (rw)
/var/run/grpc from grpc-server-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-g8x8b (ro)
metrics:
Container ID: cri-o://795ced755723f826f0fbb7cb80584fc4e5e0a777340a418102e7c55c9b6f3519
Image: gcr.io/kubebuilder/kube-rbac-proxy:v0.14.1
Image ID: gcr.io/kubebuilder/kube-rbac-proxy@sha256:928e64203edad8f1bba23593c7be04f0f8410c6e4feb98d9e9c2d00a8ff59048
Port: 9443/TCP
Host Port: 0/TCP
Args:
--secure-listen-address=0.0.0.0:9443
--upstream=http://127.0.0.1:8080
--v=10
--tls-cert-file=/var/run/secrets/metrics/tls.crt
--tls-private-key-file=/var/run/secrets/metrics/tls.key
State: Running
Started: Tue, 22 Aug 2023 11:36:50 +0300
Ready: True
Restart Count: 0
Limits:
ephemeral-storage: 20Mi
memory: 128Mi
Requests:
cpu: 50m
ephemeral-storage: 10Mi
memory: 32Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-g8x8b (ro)
/var/run/secrets/metrics from metrics-cert-volume (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
host-varlib-volume:
Type: HostPath (bare host directory volume)
Path: /var/lib
HostPathType: Directory
host-operator-volume:
Type: HostPath (bare host directory volume)
Path: /var/lib/security-profiles-operator
HostPathType: DirectoryOrCreate
operator-profiles-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: security-profiles-operator-profile
Optional: false
selinux-drop-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
selinuxd-private-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
host-fsselinux-volume:
Type: HostPath (bare host directory volume)
Path: /sys/fs/selinux
HostPathType: Directory
host-etcselinux-volume:
Type: HostPath (bare host directory volume)
Path: /etc/selinux
HostPathType: Directory
host-varlibselinux-volume:
Type: HostPath (bare host directory volume)
Path: /var/lib/selinux
HostPathType: Directory
profile-recording-output-volume:
Type: HostPath (bare host directory volume)
Path: /tmp/security-profiles-operator-recordings
HostPathType: DirectoryOrCreate
host-auditlog-volume:
Type: HostPath (bare host directory volume)
Path: /var/log/audit
HostPathType: DirectoryOrCreate
host-syslog-volume:
Type: HostPath (bare host directory volume)
Path: /var/log
HostPathType: DirectoryOrCreate
metrics-cert-volume:
Type: Secret (a volume populated by a Secret)
SecretName: metrics-server-cert
Optional: false
sys-kernel-debug-volume:
Type: HostPath (bare host directory volume)
Path: /sys/kernel/debug
HostPathType: Directory
host-etc-osrelease-volume:
Type: HostPath (bare host directory volume)
Path: /etc/os-release
HostPathType: File
tmp-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
grpc-server-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
host-root-volume:
Type: HostPath (bare host directory volume)
Path: /
HostPathType: Directory
home-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-g8x8b:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 6m25s default-scheduler Successfully assigned security-profiles-operator/spod-ppm5q to k8s-worker3
Normal Pulling 6m25s kubelet Pulling image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest"
Normal Pulled 6m18s kubelet Successfully pulled image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest" in 6.98679362s (6.986814507s including waiting)
Normal Created 6m18s kubelet Created container non-root-enabler
Normal Started 6m18s kubelet Started container non-root-enabler
Normal Pulling 6m17s kubelet Pulling image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest"
Normal Pulled 6m16s kubelet Successfully pulled image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest" in 987.480618ms (987.493453ms including waiting)
Normal Created 6m15s kubelet Created container security-profiles-operator
Normal Started 6m15s kubelet Started container security-profiles-operator
Normal Pulling 6m15s kubelet Pulling image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest"
Normal Pulled 6m14s kubelet Successfully pulled image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest" in 1.266629881s (1.266641343s including waiting)
Normal Created 6m14s kubelet Created container log-enricher
Normal Started 6m14s kubelet Started container log-enricher
Normal Pulled 6m13s kubelet Successfully pulled image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest" in 827.998943ms (828.022247ms including waiting)
Normal Pulled 6m13s kubelet Container image "gcr.io/kubebuilder/kube-rbac-proxy:v0.14.1" already present on machine
Normal Pulling 6m12s (x2 over 6m14s) kubelet Pulling image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest"
Normal Created 6m12s kubelet Created container metrics
Normal Started 6m12s kubelet Started container metrics
Normal Created 6m11s (x2 over 6m13s) kubelet Created container bpf-recorder
Normal Pulled 6m11s kubelet Successfully pulled image "gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest" in 1.006150487s (1.006230461s including waiting)
Normal Started 6m10s (x2 over 6m13s) kubelet Started container bpf-recorder
Warning BackOff 77s (x25 over 6m10s) kubelet Back-off restarting failed container bpf-recorder in pod spod-ppm5q_security-profiles-operator(8f1d734d-2b89-47e5-b74a-fd0bd996473f)
root@k8s-master:~#
@shaojini you can see from kubectl describe that the bpf-recorder has the container ID 1f26b655d092359cbd22fc47a72a6f065b0faee1cd92f44b542a5bb129241f62. May I ask you to access the node and run something like sudo crictl logs <ID> to get the logs of the crashing container?
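For example (a sketch; since the container keeps restarting, the current ID can be looked up on the node first):
# on the affected node: find the bpf-recorder container, then dump its logs
sudo crictl ps -a --name bpf-recorder
sudo crictl logs <container-id-from-the-first-column>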
Hi @saschagrunert,
I have reproduced the issue again to compare the "describe pod spod-xxxx" output before and after the patch. The difference is that recording is enabled in the daemon container and one extra bpf-recorder container is created in the spod pod. In addition, the container IDs (cri-o) of the security-profiles-operator and metrics containers have changed:
From the logs on the node (after re-installing), the error is "container ID does not exist":
root@k8s-worker3:~# sudo crictl logs 723ec7f3c3c798953fa217bed812ac179352bbe11a27597d6568458ad41efe9e
E0822 14:57:23.045998 1389099 remote_runtime.go:415] "ContainerStatus from runtime service failed" err="rpc error: code = NotFound desc = could not find container \"723ec7f3c3c798953fa217bed812ac179352bbe11a27597d6568458ad41efe9e\": container with ID starting with 723ec7f3c3c798953fa217bed812ac179352bbe11a27597d6568458ad41efe9e not found: ID does not exist" containerID="723ec7f3c3c798953fa217bed812ac179352bbe11a27597d6568458ad41efe9e"
FATA[0000] rpc error: code = NotFound desc = could not find container "723ec7f3c3c798953fa217bed812ac179352bbe11a27597d6568458ad41efe9e": container with ID starting with 723ec7f3c3c798953fa217bed812ac179352bbe11a27597d6568458ad41efe9e not found: ID does not exist
Hi @saschagrunert,
The ID in the describe output is not the actual container ID. I got the ID this way:
root@k8s-worker3:~# crictl ps -a
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD
a8fefdb4a1664 gcr.io/k8s-staging-sp-operator/security-profiles-operator@sha256:40f98b564084d46acac519a515032e5602b6eec480d221771053c96f4057811d 2 minutes ago Exited bpf-recorder 53 70487bf0e53cc spod-2v9z7
Then I got the error from the logs of that container ID:
root@k8s-worker3:~# sudo crictl logs a8fefdb4a1664
..............................................................................
E0822 16:01:38.078963 1543597 main.go:235] setup "msg"="running security-profiles-operator" "error"="connect to metrics server: connect to local GRPC server: wait on retry: timed out waiting for the condition"
Because the container keeps trying to restart, the container ID also changes all the time, so the logs for a given ID can no longer be found after a few minutes.
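A one-liner like this can catch the logs before the ID rotates (a sketch, assuming crictl's --name filter matches the container name and that the newest container is listed first):
# grab the most recent bpf-recorder container ID on the node and print its logs
sudo crictl logs $(sudo crictl ps -a --name bpf-recorder -q | head -n 1)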
@shaojini the previous logs of the container should still be available somehow, see kubectl logs --previous, at least for a few minutes as you mentioned.
Hi @saschagrunert,
The bpf-recorder container log shows the error: "error"="connect to metrics server: connect to local GRPC server: wait on retry: timed out waiting for the condition".
Is that the root cause of the bpf-recorder container failing to start, according to the logs? Do you see the same issue when you configure the bpf-recorder for recording?
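For what it's worth, a way to check whether the daemon container that serves the local GRPC endpoint looks healthy (a sketch; the pod name and the /var/run/grpc mount are taken from the describe output above):
# logs of the daemon container in the same pod that the bpf-recorder tries to reach
kubectl -n security-profiles-operator logs spod-ppm5q -c security-profiles-operator
# list the shared GRPC socket directory mounted into both containers
kubectl -n security-profiles-operator exec spod-ppm5q -c security-profiles-operator -- ls -l /var/run/grpc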
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.