
"Failed to collect all selectors for PID" error="workload attestor \"k8s\" failed: rpc error: code = DeadlineExceeded desc = workloadattestor(k8s): no selectors found after max poll attempts"

Open tanjunchen opened this issue 2 years ago • 26 comments

  • Version: 1.0.1

  • Platform: k8s + istio

  • Subsystem: spire-server and spire-agent

spire-server:

apiVersion: v1
kind: Namespace
metadata:
  name: spire

---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: spire-server
  namespace: spire

---

apiVersion: v1
kind: Secret
metadata:
  name: spire-server
  namespace: spire
type: Opaque
data:
  bootstrap.key: LS0tLS1CRUdJTiBFQyBQUklWQVRFIEtFWS0tLS0tCk1JR2tBZ0VCQkRBZzJMYnVsWHpRWDFORisyRGkwUkt6TVdmRUdpb0JoaC9mRnB4N3lPRXFrYS8vVHBhZVUzTzUKUUpSWlhkV0hLdWFnQndZRks0RUVBQ0toWkFOaUFBUmFNSDZkSVpMRWhpTE9kdnpqRzdsWVlObVB6U2N2dGJWegpmTi9qeGFITFNacnRqdVlJRXJOOUNTdUFPQzRqaVBSbjdUKzBNZit2eUMwNjBzdXNpbTR6QlllaDdpOXRVRVcxCjdXK1BwZTNwWjRUeVZmQndLOHV6K1p5YTgrcFVyMk09Ci0tLS0tRU5EIEVDIFBSSVZBVEUgS0VZLS0tLS0K

---

apiVersion: v1
kind: ConfigMap
metadata:
  name: spire-server
  namespace: spire
data:
  server.conf: |
    server {
      bind_address = "0.0.0.0"
      bind_port = "8081"
      trust_domain = "example.org"
      data_dir = "/run/spire/data"
      log_level = "DEBUG"
      default_svid_ttl = "1h"
      ca_subject = {
        country = ["US"],
        organization = ["SPIFFE"],
        common_name = "",
      }
    }

    plugins {
      DataStore "sql" {
        plugin_data {
          database_type = "sqlite3"
          connection_string = "/run/spire/data/datastore.sqlite3"
        }
      }

      NodeAttestor "k8s_sat" {
        plugin_data {
          clusters = {
            "demo-cluster" = {
              use_token_review_api_validation = true
              service_account_whitelist = ["spire:spire-agent"]
            }
          }
        }
      }

      NodeResolver "noop" {
        plugin_data {}
      }

      KeyManager "disk" {
        plugin_data {
          keys_path = "/run/spire/data/keys.json"
        }
      }

      UpstreamAuthority "disk" {
        plugin_data {
          key_file_path = "/run/spire/secrets/bootstrap.key"
          cert_file_path = "/run/spire/config/bootstrap.crt"
        }
      }
    }

    health_checks {
      listener_enabled = true
      bind_address = "0.0.0.0"
      bind_port = "8080"
      live_path = "/live"
      ready_path = "/ready"
    }
  bootstrap.crt: |
    -----BEGIN CERTIFICATE-----
    MIIBzDCCAVOgAwIBAgIJAJM4DhRH0vmuMAoGCCqGSM49BAMEMB4xCzAJBgNVBAYT
    AlVTMQ8wDQYDVQQKDAZTUElGRkUwHhcNMTgwNTEzMTkzMzQ3WhcNMjMwNTEyMTkz
    MzQ3WjAeMQswCQYDVQQGEwJVUzEPMA0GA1UECgwGU1BJRkZFMHYwEAYHKoZIzj0C
    AQYFK4EEACIDYgAEWjB+nSGSxIYiznb84xu5WGDZj80nL7W1c3zf48Why0ma7Y7m
    CBKzfQkrgDguI4j0Z+0/tDH/r8gtOtLLrIpuMwWHoe4vbVBFte1vj6Xt6WeE8lXw
    cCvLs/mcmvPqVK9jo10wWzAdBgNVHQ4EFgQUh6XzV6LwNazA+GTEVOdu07o5yOgw
    DwYDVR0TAQH/BAUwAwEB/zAOBgNVHQ8BAf8EBAMCAQYwGQYDVR0RBBIwEIYOc3Bp
    ZmZlOi8vbG9jYWwwCgYIKoZIzj0EAwQDZwAwZAIwE4Me13qMC9i6Fkx0h26y09QZ
    IbuRqA9puLg9AeeAAyo5tBzRl1YL0KNEp02VKSYJAjBdeJvqjJ9wW55OGj1JQwDF
    D7kWeEB6oMlwPbI/5hEY3azJi16I0uN1JSYTSWGSqWc=
    -----END CERTIFICATE-----

---

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: spire-server
  namespace: spire
  labels:
    app: spire-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spire-server
  serviceName: spire-server
  template:
    metadata:
      namespace: spire
      labels:
        app: spire-server
    spec:
      serviceAccountName: spire-server
      containers:
        - name: spire-server
          image: xxx/spire-server:1.0.1
          args: ["-config", "/run/spire/config/server.conf"]
          ports:
            - containerPort: 8081
          volumeMounts:
            - name: spire-config
              mountPath: /run/spire/config
              readOnly: true
            - name: spire-secrets
              mountPath: /run/spire/secrets
              readOnly: true
            - name: spire-data
              mountPath: /run/spire/data
              readOnly: false
          livenessProbe:
            httpGet:
              path: /live
              port: 8080
            failureThreshold: 2
            initialDelaySeconds: 15
            periodSeconds: 60
            timeoutSeconds: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: spire-config
          configMap:
            name: spire-server
        - name: spire-secrets
          secret:
            secretName: spire-server
        - name: spire-data
          emptyDir: {}
---

apiVersion: v1
kind: Service
metadata:
  name: spire-server
  namespace: spire
spec:
  type: ClusterIP
  ports:
    - name: grpc
      port: 8081
      targetPort: 8081
      protocol: TCP
  selector:
    app: spire-server

spire-agent:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: spire-agent
  namespace: spire

---

apiVersion: v1
kind: ConfigMap
metadata:
  name: spire-agent
  namespace: spire
data:
  agent.conf: |
    agent {
      data_dir = "/run/spire"
      log_level = "DEBUG"
      server_address = "spire-server"
      server_port = "8081"
      socket_path = "/run/spire/sockets/agent.sock"
      trust_bundle_path = "/run/spire/config/bootstrap.crt"
      trust_domain = "example.org"
    }

    plugins {
      NodeAttestor "k8s_sat" {
        plugin_data {
          cluster = "demo-cluster"
        }
      }

      KeyManager "memory" {
        plugin_data {
        }
      }

      WorkloadAttestor "k8s" {
        plugin_data {
          # Defaults to the secure kubelet port.
          # Minikube does not have a cert in the cluster CA bundle that
          # can authenticate the kubelet cert, so skip validation.
          skip_kubelet_verification = "true"
          #kubelet_read_only_port = "10255"
          node_name_env = "MY_NODE_NAME"
        }
      }

      WorkloadAttestor "unix" {
          plugin_data {
          }
      }
    }

    health_checks {
      listener_enabled = true
      bind_address = "0.0.0.0"
      bind_port = "8080"
      live_path = "/live"
      ready_path = "/ready"
    }
  bootstrap.crt: |
    -----BEGIN CERTIFICATE-----
    MIIBzDCCAVOgAwIBAgIJAJM4DhRH0vmuMAoGCCqGSM49BAMEMB4xCzAJBgNVBAYT
    AlVTMQ8wDQYDVQQKDAZTUElGRkUwHhcNMTgwNTEzMTkzMzQ3WhcNMjMwNTEyMTkz
    MzQ3WjAeMQswCQYDVQQGEwJVUzEPMA0GA1UECgwGU1BJRkZFMHYwEAYHKoZIzj0C
    AQYFK4EEACIDYgAEWjB+nSGSxIYiznb84xu5WGDZj80nL7W1c3zf48Why0ma7Y7m
    CBKzfQkrgDguI4j0Z+0/tDH/r8gtOtLLrIpuMwWHoe4vbVBFte1vj6Xt6WeE8lXw
    cCvLs/mcmvPqVK9jo10wWzAdBgNVHQ4EFgQUh6XzV6LwNazA+GTEVOdu07o5yOgw
    DwYDVR0TAQH/BAUwAwEB/zAOBgNVHQ8BAf8EBAMCAQYwGQYDVR0RBBIwEIYOc3Bp
    ZmZlOi8vbG9jYWwwCgYIKoZIzj0EAwQDZwAwZAIwE4Me13qMC9i6Fkx0h26y09QZ
    IbuRqA9puLg9AeeAAyo5tBzRl1YL0KNEp02VKSYJAjBdeJvqjJ9wW55OGj1JQwDF
    D7kWeEB6oMlwPbI/5hEY3azJi16I0uN1JSYTSWGSqWc=
    -----END CERTIFICATE-----

---

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: spire-agent
  namespace: spire
  labels:
    app: spire-agent
spec:
  selector:
    matchLabels:
      app: spire-agent
  template:
    metadata:
      namespace: spire
      labels:
        app: spire-agent
    spec:
      hostPID: true
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      serviceAccountName: spire-agent
      initContainers:
        - name: init
          # This is a small image with wait-for-it, choose whatever image
          # you prefer that waits for a service to be up. This image is built
          # from https://github.com/lqhl/wait-for-it
          image: xxx/wait-for-it
          args: ["-t", "30", "spire-server:8081"]
          env:
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
      containers:
        - name: spire-agent
          image: xxx/spire-agent:1.0.1
          args: ["-config", "/run/spire/config/agent.conf"]
          env:
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - name: spire-config
              mountPath: /run/spire/config
              readOnly: true
            - name: spire-agent-socket
              mountPath: /run/spire/sockets
              readOnly: false
          livenessProbe:
            httpGet:
              path: /live
              port: 8080
            failureThreshold: 2
            initialDelaySeconds: 15
            periodSeconds: 60
            timeoutSeconds: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: spire-config
          configMap:
            name: spire-agent
        - name: spire-agent-socket
          hostPath:
            path: /run/spire/sockets
            type: DirectoryOrCreate

When I fetch a certificate through spire-agent, I get an error. The logs of spire-agent:

time="2022-05-18T11:48:58Z" level=warning msg="Container id not found" attempt=7 container_id=d278326d91c8d21ce00c451ddd1dc0602a054ca96978b08fa7fae2b56cc9a676 external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-18T11:48:58Z" level=warning msg="Container id not found" attempt=8 container_id=d278326d91c8d21ce00c451ddd1dc0602a054ca96978b08fa7fae2b56cc9a676 external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-18T11:48:59Z" level=warning msg="Container id not found" attempt=9 container_id=d278326d91c8d21ce00c451ddd1dc0602a054ca96978b08fa7fae2b56cc9a676 external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-18T11:48:59Z" level=warning msg="Container id not found" attempt=10 container_id=d278326d91c8d21ce00c451ddd1dc0602a054ca96978b08fa7fae2b56cc9a676 external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-18T11:48:59Z" level=error msg="Received error from stream secrets server" error="<nil>" method=StreamSecrets pid=1193 service=SDS.v3 subsystem_name=endpoints
time="2022-05-18T11:48:59Z" level=error msg="Failed to collect all selectors for PID" error="workload attestor \"k8s\" failed: rpc error: code = Canceled desc = workloadattestor(k8s): context canceled" pid=1193 subsystem_name=workload_attestor
time="2022-05-18T11:48:59Z" level=debug msg="PID attested to have selectors" pid=1193 selectors="[type:\"unix\" value:\"uid:1337\" type:\"unix\" value:\"gid:1337\"]" subsystem_name=workload_attestor
time="2022-05-18T11:48:59Z" level=error msg="Failed to attest the workload" error="rpc error: code = Unauthenticated desc = could not verify existence of the original caller: caller is no longer being watched" method=StreamSecrets pid=1193 service=SDS.v3 subsystem_name=endpoints

The logs of spire-server:

time="2022-05-18T11:07:39Z" level=warning msg="Current umask 0022 is too permissive; setting umask 0027"
time="2022-05-18T11:07:39Z" level=info msg="Data directory: \"/run/spire/data\""
time="2022-05-18T11:07:39Z" level=info msg="Opening SQL database" db_type=sqlite3 subsystem_name=sql
time="2022-05-18T11:07:39Z" level=info msg="Initializing new database" subsystem_name=sql
time="2022-05-18T11:07:39Z" level=info msg="Connected to SQL database" read_only=false subsystem_name=sql type=sqlite3 version=3.34.0
time="2022-05-18T11:07:39Z" level=warning msg="The \"noop\" NodeResolver is not required, is deprecated, and will be removed from a future release" subsystem_name=catalog
time="2022-05-18T11:07:39Z" level=warning msg="The `service_account_whitelist` configurable is deprecated and will be removed in a future release. Please use `service_account_allow_list` instead." external=false plugin_name=k8s_sat plugin_type=NodeAttestor subsystem_name=catalog
time="2022-05-18T11:07:39Z" level=info msg="Plugin loaded" external=false plugin_name=k8s_sat plugin_type=NodeAttestor subsystem_name=catalog
time="2022-05-18T11:07:39Z" level=info msg="Plugin loaded" external=false plugin_name=disk plugin_type=KeyManager subsystem_name=catalog
time="2022-05-18T11:07:39Z" level=info msg="Plugin loaded" external=false plugin_name=disk plugin_type=UpstreamAuthority subsystem_name=catalog
time="2022-05-18T11:07:39Z" level=debug msg="Loading journal" path=/run/spire/data/journal.pem subsystem_name=ca_manager
time="2022-05-18T11:07:39Z" level=info msg="Journal loaded" jwt_keys=0 subsystem_name=ca_manager x509_cas=0
time="2022-05-18T11:07:39Z" level=debug msg="Preparing X509 CA" slot=A subsystem_name=ca_manager
time="2022-05-18T11:07:39Z" level=info msg="X509 CA prepared" expiration="2022-05-19T11:07:39Z" issued_at="2022-05-18T11:07:39Z" self_signed=false slot=A subsystem_name=ca_manager
time="2022-05-18T11:07:39Z" level=info msg="X509 CA activated" expiration="2022-05-19T11:07:39Z" issued_at="2022-05-18T11:07:39Z" slot=A subsystem_name=ca_manager
time="2022-05-18T11:07:39Z" level=debug msg="Successfully rotated X.509 CA" subsystem_name=ca_manager trust_domain_id="spiffe://example.org" ttl=86399.559774583
time="2022-05-18T11:07:39Z" level=debug msg="Preparing JWT key" slot=A subsystem_name=ca_manager
time="2022-05-18T11:07:39Z" level=warning msg="UpstreamAuthority plugin does not support JWT-SVIDs. Workloads managed by this server may have trouble communicating with workloads outside this cluster when using JWT-SVIDs." plugin_name=disk subsystem_name=ca_manager
time="2022-05-18T11:07:39Z" level=info msg="JWT key prepared" expiration="2022-05-19T11:07:39Z" issued_at="2022-05-18T11:07:39Z" slot=A subsystem_name=ca_manager
time="2022-05-18T11:07:39Z" level=info msg="JWT key activated" expiration="2022-05-19T11:07:39Z" issued_at="2022-05-18T11:07:39Z" slot=A subsystem_name=ca_manager
time="2022-05-18T11:07:39Z" level=debug msg="Rotating server SVID" subsystem_name=svid_rotator
time="2022-05-18T11:07:39Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:07:39Z" spiffe_id="spiffe://example.org/spire/server" subsystem_name=ca
time="2022-05-18T11:07:39Z" level=info msg="Building in-memory entry cache" subsystem_name=endpoints
time="2022-05-18T11:07:39Z" level=info msg="Completed building in-memory entry cache" subsystem_name=endpoints
time="2022-05-18T11:07:39Z" level=debug msg="Initializing API endpoints" subsystem_name=endpoints
time="2022-05-18T11:07:39Z" level=info msg="Starting TCP server" address="[::]:8081" subsystem_name=endpoints
time="2022-05-18T11:07:39Z" level=info msg="Starting UDS server" address=/tmp/spire-server/private/api.sock subsystem_name=endpoints
time="2022-05-18T11:07:40Z" level=debug msg="Starting checker" name=catalog.datastore subsystem_name=health
time="2022-05-18T11:07:40Z" level=debug msg="Starting checker" name=server.ca subsystem_name=health
time="2022-05-18T11:07:40Z" level=debug msg="Starting checker" name=server.ca.manager subsystem_name=health
time="2022-05-18T11:07:40Z" level=debug msg="Starting checker" name=server subsystem_name=health
time="2022-05-18T11:07:40Z" level=info msg="Serving health checks" address="0.0.0.0:8080" subsystem_name=health
time="2022-05-18T11:07:47Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:07:47Z" spiffe_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/1be11f43-e9e7-41b6-80a4-4dc816b1e889" subsystem_name=ca
time="2022-05-18T11:07:47Z" level=info msg="Agent attestation request completed" address="192.168.1.7:5204" agent_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/1be11f43-e9e7-41b6-80a4-4dc816b1e889" caller_addr="192.168.1.7:5204" method=AttestAgent node_attestor_type=k8s_sat request_id=7bd55951-34d2-4042-b3ac-18f165a0dc9b service=agent.v1.Agent subsystem_name=api
time="2022-05-18T11:07:48Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:07:48Z" spiffe_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/64823730-d786-4277-a9bc-106b8eb2f4ff" subsystem_name=ca
time="2022-05-18T11:07:48Z" level=info msg="Agent attestation request completed" address="192.168.1.6:5510" agent_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/64823730-d786-4277-a9bc-106b8eb2f4ff" caller_addr="192.168.1.6:5510" method=AttestAgent node_attestor_type=k8s_sat request_id=76052c90-a828-4a94-a16a-5d6be221e603 service=agent.v1.Agent subsystem_name=api
time="2022-05-18T11:11:17Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:11:17Z" spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=ca
time="2022-05-18T11:11:17Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:11:17Z" spiffe_id="spiffe://example.org/ns/istio-system/sa/istio-ingressgateway-service-account" subsystem_name=ca
time="2022-05-18T11:11:17Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:11:17Z" spiffe_id="spiffe://example.org/ns/default/sa/default" subsystem_name=ca
time="2022-05-18T11:11:17Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:11:17Z" spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=ca
time="2022-05-18T11:11:17Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:11:17Z" spiffe_id="spiffe://example.org/ns/istio-system/sa/istio-ingressgateway-service-account" subsystem_name=ca
time="2022-05-18T11:11:17Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:11:17Z" spiffe_id="spiffe://example.org/ns/default/sa/default" subsystem_name=ca
time="2022-05-18T11:11:32Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:11:32Z" spiffe_id="spiffe://example.org/ns/test/sa/httpbin" subsystem_name=ca
time="2022-05-18T11:11:32Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:11:32Z" spiffe_id="spiffe://example.org/ns/test/sa/httpbin" subsystem_name=ca
time="2022-05-18T11:13:55Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:13:55Z" spiffe_id="spiffe://example.org/ns/test/sa/sleep" subsystem_name=ca
time="2022-05-18T11:13:55Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:13:55Z" spiffe_id="spiffe://example.org/ns/test/sa/sleep" subsystem_name=ca
time="2022-05-18T11:28:36Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:28:36Z" spiffe_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/be1300e4-e93d-40f6-b987-6cb42d20e237" subsystem_name=ca
time="2022-05-18T11:28:36Z" level=info msg="Agent attestation request completed" address="192.168.1.7:26058" agent_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/be1300e4-e93d-40f6-b987-6cb42d20e237" caller_addr="192.168.1.7:26058" method=AttestAgent node_attestor_type=k8s_sat request_id=b00ead65-d001-46e3-89f5-5234442697c9 service=agent.v1.Agent subsystem_name=api
time="2022-05-18T11:28:42Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:28:42Z" spiffe_id="spiffe://example.org/ns/test/sa/httpbin" subsystem_name=ca
time="2022-05-18T11:28:42Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:28:42Z" spiffe_id="spiffe://example.org/ns/test/sa/sleep" subsystem_name=ca
time="2022-05-18T11:28:42Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:28:42Z" spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=ca
time="2022-05-18T11:28:42Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:28:42Z" spiffe_id="spiffe://example.org/ns/istio-system/sa/istio-ingressgateway-service-account" subsystem_name=ca
time="2022-05-18T11:28:42Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:28:42Z" spiffe_id="spiffe://example.org/ns/default/sa/default" subsystem_name=ca
time="2022-05-18T11:28:42Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:28:42Z" spiffe_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/bd4ae7d9-4fbf-4ba0-931c-efc4da0b6e08" subsystem_name=ca
time="2022-05-18T11:28:42Z" level=info msg="Agent attestation request completed" address="192.168.1.6:37898" agent_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/bd4ae7d9-4fbf-4ba0-931c-efc4da0b6e08" caller_addr="192.168.1.6:37898" method=AttestAgent node_attestor_type=k8s_sat request_id=7967a14e-ab5c-4bf9-aaef-21e2d43cf413 service=agent.v1.Agent subsystem_name=api
time="2022-05-18T11:28:47Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:28:47Z" spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=ca
time="2022-05-18T11:28:47Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:28:47Z" spiffe_id="spiffe://example.org/ns/istio-system/sa/istio-ingressgateway-service-account" subsystem_name=ca
time="2022-05-18T11:28:47Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:28:47Z" spiffe_id="spiffe://example.org/ns/default/sa/default" subsystem_name=ca
time="2022-05-18T11:28:47Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:28:47Z" spiffe_id="spiffe://example.org/ns/test/sa/httpbin" subsystem_name=ca
time="2022-05-18T11:28:47Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:28:47Z" spiffe_id="spiffe://example.org/ns/test/sa/sleep" subsystem_name=ca
time="2022-05-18T11:30:53Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:30:53Z" spiffe_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/93fe2161-95cc-4454-84d4-a510616a5dcf" subsystem_name=ca
time="2022-05-18T11:30:53Z" level=info msg="Agent attestation request completed" address="192.168.1.6:42279" agent_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/93fe2161-95cc-4454-84d4-a510616a5dcf" caller_addr="192.168.1.6:42279" method=AttestAgent node_attestor_type=k8s_sat request_id=c3bf5aac-0c6b-4ae5-9fb1-86bcd668f8ec service=agent.v1.Agent subsystem_name=api
time="2022-05-18T11:30:57Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:30:57Z" spiffe_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/ca6a5d08-3517-43a1-8ffc-dc0519a2dea7" subsystem_name=ca
time="2022-05-18T11:30:57Z" level=info msg="Agent attestation request completed" address="192.168.1.7:51286" agent_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/ca6a5d08-3517-43a1-8ffc-dc0519a2dea7" caller_addr="192.168.1.7:51286" method=AttestAgent node_attestor_type=k8s_sat request_id=9dd8a63e-5ace-4381-8a5a-80e2a1e5f0d1 service=agent.v1.Agent subsystem_name=api
time="2022-05-18T11:30:58Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:30:58Z" spiffe_id="spiffe://example.org/ns/test/sa/sleep" subsystem_name=ca
time="2022-05-18T11:30:58Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:30:58Z" spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=ca
time="2022-05-18T11:30:58Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:30:58Z" spiffe_id="spiffe://example.org/ns/istio-system/sa/istio-ingressgateway-service-account" subsystem_name=ca
time="2022-05-18T11:30:58Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:30:58Z" spiffe_id="spiffe://example.org/ns/default/sa/default" subsystem_name=ca
time="2022-05-18T11:30:58Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:30:58Z" spiffe_id="spiffe://example.org/ns/test/sa/httpbin" subsystem_name=ca
time="2022-05-18T11:31:02Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:31:02Z" spiffe_id="spiffe://example.org/ns/test/sa/sleep" subsystem_name=ca
time="2022-05-18T11:31:02Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:31:02Z" spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=ca
time="2022-05-18T11:31:02Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:31:02Z" spiffe_id="spiffe://example.org/ns/istio-system/sa/istio-ingressgateway-service-account" subsystem_name=ca
time="2022-05-18T11:31:02Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:31:02Z" spiffe_id="spiffe://example.org/ns/default/sa/default" subsystem_name=ca
time="2022-05-18T11:31:02Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:31:02Z" spiffe_id="spiffe://example.org/ns/test/sa/httpbin" subsystem_name=ca
time="2022-05-18T11:32:02Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:32:02Z" spiffe_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/32c3378f-b8b0-4df6-b6a0-18fb3bf824f7" subsystem_name=ca
time="2022-05-18T11:32:02Z" level=info msg="Agent attestation request completed" address="192.168.1.7:20470" agent_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/32c3378f-b8b0-4df6-b6a0-18fb3bf824f7" caller_addr="192.168.1.7:20470" method=AttestAgent node_attestor_type=k8s_sat request_id=37844d8e-27f5-42c6-a7be-019bfe6f9c82 service=agent.v1.Agent subsystem_name=api
time="2022-05-18T11:32:02Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:32:02Z" spiffe_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/7e068ae9-00f4-4786-9191-2b61cb7f08e9" subsystem_name=ca
time="2022-05-18T11:32:02Z" level=info msg="Agent attestation request completed" address="192.168.1.6:24604" agent_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/7e068ae9-00f4-4786-9191-2b61cb7f08e9" caller_addr="192.168.1.6:24604" method=AttestAgent node_attestor_type=k8s_sat request_id=60c6024e-eaa6-4f21-8d24-7f4dfff0eb24 service=agent.v1.Agent subsystem_name=api
time="2022-05-18T11:32:07Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:32:07Z" spiffe_id="spiffe://example.org/ns/test/sa/httpbin" subsystem_name=ca
time="2022-05-18T11:32:07Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:32:07Z" spiffe_id="spiffe://example.org/ns/test/sa/sleep" subsystem_name=ca
time="2022-05-18T11:32:07Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:32:07Z" spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=ca
time="2022-05-18T11:32:07Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:32:07Z" spiffe_id="spiffe://example.org/ns/istio-system/sa/istio-ingressgateway-service-account" subsystem_name=ca
time="2022-05-18T11:32:07Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:32:07Z" spiffe_id="spiffe://example.org/ns/default/sa/default" subsystem_name=ca
time="2022-05-18T11:32:07Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:32:07Z" spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=ca
time="2022-05-18T11:32:07Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:32:07Z" spiffe_id="spiffe://example.org/ns/istio-system/sa/istio-ingressgateway-service-account" subsystem_name=ca
time="2022-05-18T11:32:07Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:32:07Z" spiffe_id="spiffe://example.org/ns/default/sa/default" subsystem_name=ca
time="2022-05-18T11:32:07Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:32:07Z" spiffe_id="spiffe://example.org/ns/test/sa/httpbin" subsystem_name=ca
time="2022-05-18T11:32:07Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:32:07Z" spiffe_id="spiffe://example.org/ns/test/sa/sleep" subsystem_name=ca
time="2022-05-18T11:37:34Z" level=debug msg="Rotating server SVID" subsystem_name=svid_rotator
time="2022-05-18T11:37:34Z" level=debug msg="Signed X509 SVID" expiration="2022-05-18T12:37:34Z" spiffe_id="spiffe://example.org/spire/server" subsystem_name=ca

The agent list in spire-server:

/opt/spire/bin # /opt/spire/bin/spire-server agent list
Found 2 attested agents:

SPIFFE ID         : spiffe://example.org/spire/agent/k8s_sat/demo-cluster/56d7ed58-14ae-4cf5-9880-ffc123b314c1
Attestation type  : k8s_sat
Expiration time   : 2022-05-18 13:02:00 +0000 UTC
Serial number     : 319040273546637254342952962498965679004

SPIFFE ID         : spiffe://example.org/spire/agent/k8s_sat/demo-cluster/63588790-f382-492b-a324-359999d3a394
Attestation type  : k8s_sat
Expiration time   : 2022-05-18 13:02:00 +0000 UTC
Serial number     : 277908708266485423488447531291378329815

The entries in spire-server:

Found 5 entries
Entry ID         : 7ec60e3e-147d-444b-b026-a244279a03c6
SPIFFE ID        : spiffe://example.org/ns/default/sa/default
Parent ID        : spiffe://example.org/ns/spire/sa/spire-agent
Revision         : 0
TTL              : default
Selector         : k8s:ns:default
Selector         : k8s:sa:default

Entry ID         : cc99cf5e-cfd7-44ca-966d-b6313a540447
SPIFFE ID        : spiffe://example.org/ns/istio-system/sa/istio-ingressgateway-service-account
Parent ID        : spiffe://example.org/ns/spire/sa/spire-agent
Revision         : 0
TTL              : default
Selector         : k8s:ns:istio-system
Selector         : k8s:sa:istio-ingressgateway-service-account

Entry ID         : a689cef7-6972-46b1-8277-b3212f31c230
SPIFFE ID        : spiffe://example.org/ns/spire/sa/spire-agent
Parent ID        : spiffe://example.org/spire/server
Revision         : 0
TTL              : default
Selector         : k8s_sat:agent_ns:spire
Selector         : k8s_sat:agent_sa:spire-agent
Selector         : k8s_sat:cluster:demo-cluster

Entry ID         : be9bd775-85cb-4ffa-b914-5493c651d264
SPIFFE ID        : spiffe://example.org/ns/test/sa/httpbin
Parent ID        : spiffe://example.org/ns/spire/sa/spire-agent
Revision         : 0
TTL              : default
Selector         : k8s:ns:test
Selector         : k8s:sa:httpbin

Entry ID         : 3615702c-293e-4d39-9785-16a546c2462b
SPIFFE ID        : spiffe://example.org/ns/test/sa/sleep
Parent ID        : spiffe://example.org/ns/spire/sa/spire-agent
Revision         : 0
TTL              : default
Selector         : k8s:ns:test
Selector         : k8s:sa:sleep
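
For reference, entries of this shape are registered through the spire-server CLI; a command roughly like the following (illustrative sketch, not copied from my setup) would produce the httpbin entry above:

/opt/spire/bin/spire-server entry create \
    -spiffeID spiffe://example.org/ns/test/sa/httpbin \
    -parentID spiffe://example.org/ns/spire/sa/spire-agent \
    -selector k8s:ns:test \
    -selector k8s:sa:httpbin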

tanjunchen avatar May 18 '22 11:05 tanjunchen

@tanjunchen It looks like the workload being attested exits while the spire agent is collecting its selectors.

loveyana avatar May 19 '22 03:05 loveyana

Yes, but I want to fetch the CA via SDS, and it fails.

2022-05-19T05:41:41.975793Z	warning	envoy config	StreamSecrets gRPC config stream closed: 2, unable to retrieve all requested identities, missing map[default:true]

tanjunchen avatar May 19 '22 05:05 tanjunchen

@tanjunchen I've seen a similar error, but it was related to the fact that I tried to fetch an identity from a process running in an initContainer, which led to failing attestation.

radoslav-tomov avatar May 19 '22 07:05 radoslav-tomov

This error message indicates that the K8s workload attestor was unable to locate the pod containing the workload container after a configurable period of time. The process is basically as follows:

  1. Determine workload pod UID and container ID by inspecting the cgroups for the workload PID
  2. Query the kubelet for the pod listing and attempt to locate the workload pod by matching up the UID and container ID
  3. Repeat step 2 for 60 attempts, once every 500ms (for a total of 30 seconds) until successful. The number of attempts and the interval between attempts are configurable (see the sketch after this list).
  4. If unsuccessful, fail with the error above
  5. If successful, gather selectors for the workload
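
For reference, the polling knobs live in the k8s WorkloadAttestor plugin_data in agent.conf; a minimal sketch, assuming the 1.0.x option names max_poll_attempts and poll_retry_interval (check your release's plugin docs):

WorkloadAttestor "k8s" {
  plugin_data {
    skip_kubelet_verification = "true"
    node_name_env = "MY_NODE_NAME"
    # Defaults: 60 attempts, 500ms apart (30 seconds total).
    max_poll_attempts = 60
    poll_retry_interval = "500ms"
  }
}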

To debug this, it may be necessary to perform these steps yourself to determine where the breakdown lies. From the agent container, you can read /proc/<workload pid>/cgroup to locate the pod UID and container ID. You can then curl the kubelet to get the pod listing, e.g.

$ curl -k -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://127.0.0.1:10250/pods

(caveat: I haven't tested this command)

You can then see if you can identify the workload pod.
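
As a concrete sketch of step 1 (untested; cgroup paths vary by container runtime and cgroup driver):

# From inside the spire-agent container (hostPID: true); substitute the
# workload PID reported in the agent logs, e.g. pid=1193 above.
$ grep kubepods /proc/<workload pid>/cgroup
# The cgroup paths embed the pod UID and container ID, roughly:
#   .../kubepods/burstable/pod<POD_UID>/<CONTAINER_ID>
# A pod with that UID should then appear in the kubelet pod listing with a
# container status carrying that container ID.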

azdagron avatar May 19 '22 14:05 azdagron

@tanjunchen I've seen a similar error, but it was related to the fact that I tried to fetch an identity from a process running in an initContainer, which led to failing attestation.

Hello, could you share the versions of k8s, Envoy, and so on? Have you resolved it?

tanjunchen avatar May 22 '22 13:05 tanjunchen

This error message indicates that the K8s workload attestor was unable to locate the pod containing the workload container after a configurable period of time. The process is basically as follows:

  1. Determine workload pod UID and container ID by inspecting the cgroups for the workload PID
  2. Query the kubelet for the pod listing and attempt to locate the workload pod by matching up the UID and container ID
  3. Repeat step 2 for 60 attempts, once every 500ms (for a total of 30 seconds) until successful. The number of attempts and the interval between attempts are configurable.
  4. If unsuccessful, fail with the error above
  5. If successful, gather selectors for the workload

To debug this, it may be necessary to perform these steps yourself to determine where the breakdown lies. From the agent container, you can read /proc/<workload pid>/cgroup to locate the pod UID and container ID. You can then curl the kubelet to get the pod listing, e.g.

$ curl -k -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://127.0.0.1:10250/pods

(caveat: I haven't tested this command)

You can then see if you can identify the workload pod.

OK, thanks for your explanation, thank you very much.

tanjunchen avatar May 22 '22 13:05 tanjunchen

@loveyana @radoslav-tomov @azdagron
I found the cause of the failure: I had enabled the holdApplicationUntilProxyStarts=true parameter when installing Istio.

apiVersion: operator.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
spec:
  profile: default
  meshConfig:
    trustDomain: example.org
    accessLogFile: /dev/stdout
    defaultConfig:
      holdApplicationUntilProxyStarts: true
  values:
    gateways:
      istio-egressgateway:
        autoscaleEnabled: false
      istio-ingressgateway:
        autoscaleEnabled: false
    global:
      tag: 1.11.4
      proxy:
        privileged: true
        enableCoreDump: false
        resources:
          requests:
            cpu: 10m
            memory: 40Mi

This looks like a bug in SPIRE.

tanjunchen avatar May 23 '22 13:05 tanjunchen

I'm not sure I understand how that flag is impacting workload attestation. Envoy is the workload being attested in this situation, correct? Presumably the envoy container has a good status even if the application hasn't started?

Can you share the pod spec and status for the time period when workload attestation is being attempted?

azdagron avatar May 23 '22 14:05 azdagron

I'm not sure I understand how that flag is impacting workload attestation. Envoy is the workload being attested in this situation, correct? Presumably the envoy container has a good status even if the application hasn't started?

Can you share the pod spec and status for the time period when workload attestation is being attempted?

Please see https://istio.io/latest/docs/reference/config/istio.mesh.v1alpha1/ for holdApplicationUntilProxyStarts. You can install Istio with this parameter enabled and then debug it.
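
If the flag is indeed the trigger, one way to isolate it is to override it per workload rather than mesh-wide; assuming Istio's proxy.istio.io/config pod annotation, something like:

metadata:
  annotations:
    proxy.istio.io/config: |
      holdApplicationUntilProxyStarts: false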

tanjunchen avatar May 23 '22 14:05 tanjunchen

If you have the pod spec and status available, that would really help speed up debugging. If not, that is totally OK; it will, however, probably take some time for someone to reproduce this.

@maxlambrecht have you seen this before?

azdagron avatar May 23 '22 15:05 azdagron

I'm not sure I understand how that flag is impacting workload attestation. Envoy is the workload being attested in this situation, correct? Presumably the envoy container has a good status even if the application hasn't started?

Can you share the pod spec and status for the time period when workload attestation is being attempted?

The logs of istio-proxy in httpbin-8cb97b8f4-cq9xh:

2022-05-24T11:30:08.361597Z	info	FLAG: --concurrency="2"
2022-05-24T11:30:08.361640Z	info	FLAG: --domain="test.svc.cluster.local"
2022-05-24T11:30:08.361647Z	info	FLAG: --help="false"
2022-05-24T11:30:08.361651Z	info	FLAG: --log_as_json="false"
2022-05-24T11:30:08.361655Z	info	FLAG: --log_caller=""
2022-05-24T11:30:08.361658Z	info	FLAG: --log_output_level="default:info"
2022-05-24T11:30:08.361662Z	info	FLAG: --log_rotate=""
2022-05-24T11:30:08.361666Z	info	FLAG: --log_rotate_max_age="30"
2022-05-24T11:30:08.361670Z	info	FLAG: --log_rotate_max_backups="1000"
2022-05-24T11:30:08.361675Z	info	FLAG: --log_rotate_max_size="104857600"
2022-05-24T11:30:08.361679Z	info	FLAG: --log_stacktrace_level="default:none"
2022-05-24T11:30:08.361688Z	info	FLAG: --log_target="[stdout]"
2022-05-24T11:30:08.361694Z	info	FLAG: --meshConfig="./etc/istio/config/mesh"
2022-05-24T11:30:08.361698Z	info	FLAG: --outlierLogPath=""
2022-05-24T11:30:08.361702Z	info	FLAG: --proxyComponentLogLevel="misc:error"
2022-05-24T11:30:08.361705Z	info	FLAG: --proxyLogLevel="warning"
2022-05-24T11:30:08.361710Z	info	FLAG: --serviceCluster="istio-proxy"
2022-05-24T11:30:08.361714Z	info	FLAG: --stsPort="0"
2022-05-24T11:30:08.361718Z	info	FLAG: --templateFile=""
2022-05-24T11:30:08.361722Z	info	FLAG: --tokenManagerPlugin="GoogleTokenExchange"
2022-05-24T11:30:08.361726Z	info	Version 1.11.4-9f6f03276054bb62a1b745630322314ef14969e8-Clean
2022-05-24T11:30:08.361932Z	info	Proxy role	ips=[10.168.1.45] type=sidecar id=httpbin-8cb97b8f4-cq9xh.test domain=test.svc.cluster.local
2022-05-24T11:30:08.362019Z	info	Apply proxy config from env {"holdApplicationUntilProxyStarts":true}

2022-05-24T11:30:08.363488Z	info	Effective config: binaryPath: /usr/local/bin/envoy
concurrency: 2
configPath: ./etc/istio/proxy
controlPlaneAuthPolicy: MUTUAL_TLS
discoveryAddress: istiod.istio-system.svc:15012
drainDuration: 45s
holdApplicationUntilProxyStarts: true
parentShutdownDuration: 60s
proxyAdminPort: 15000
serviceCluster: istio-proxy
statNameLength: 189
statusPort: 15020
terminationDrainDuration: 5s
tracing:
  zipkin:
    address: zipkin.istio-system:9411

2022-05-24T11:30:08.363511Z	info	JWT policy is third-party-jwt
2022-05-24T11:30:08.368407Z	info	CA Endpoint istiod.istio-system.svc:15012, provider Citadel
2022-05-24T11:30:08.368454Z	info	Using CA istiod.istio-system.svc:15012 cert with certs: var/run/secrets/istio/root-cert.pem
2022-05-24T11:30:08.368499Z	info	Opening status port 15020
2022-05-24T11:30:08.368631Z	info	citadelclient	Citadel client using custom root cert: istiod.istio-system.svc:15012
2022-05-24T11:30:08.417031Z	info	ads	All caches have been synced up in 58.487278ms, marking server ready
2022-05-24T11:30:08.417305Z	info	sds	SDS server for workload certificates started, listening on "etc/istio/proxy/SDS"
2022-05-24T11:30:08.417326Z	info	xdsproxy	Initializing with upstream address "istiod.istio-system.svc:15012" and cluster "Kubernetes"
2022-05-24T11:30:08.417655Z	info	Pilot SAN: [istiod.istio-system.svc]
2022-05-24T11:30:08.419914Z	info	Starting proxy agent
2022-05-24T11:30:08.419945Z	info	Epoch 0 starting
2022-05-24T11:30:08.419969Z	info	Envoy command: [-c etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --drain-strategy immediate --parent-shutdown-time-s 60 --local-address-ip-version v4 --bootstrap-version 3 --file-flush-interval-msec 1000 --disable-hot-restart --log-format %Y-%m-%dT%T.%fZ	%l	envoy %n	%v -l warning --component-log-level misc:error --concurrency 2]
2022-05-24T11:30:08.431416Z	info	sds	Starting SDS grpc server
2022-05-24T11:30:08.431503Z	info	starting Http service at 127.0.0.1:15004
2022-05-24T11:30:08.558314Z	info	xdsproxy	connected to upstream XDS server: istiod.istio-system.svc:15012
2022-05-24T11:30:08.694806Z	info	cache	generated new workload certificate	latency=274.064694ms ttl=23h59m59.305209069s
2022-05-24T11:30:08.694838Z	info	cache	Root cert has changed, start rotating root cert
2022-05-24T11:30:08.694857Z	info	ads	XDS: Incremental Pushing:0 ConnectedEndpoints:0 Version:
2022-05-24T11:30:08.694900Z	info	cache	returned workload trust anchor from cache	ttl=23h59m59.305104765s
2022-05-24T11:30:38.707668Z	warning	envoy config	StreamSecrets gRPC config stream closed: 2, unable to retrieve all requested identities, missing map[default:true]
2022-05-24T11:31:08.558782Z	info	Status server has successfully terminated
2022-05-24T11:31:08.558917Z	error	accept tcp [::]:15020: use of closed network connection
2022-05-24T11:31:08.558943Z	info	Agent draining Proxy
2022-05-24T11:31:08.560257Z	info	Graceful termination period is 5s, starting...
2022-05-24T11:31:09.207072Z	warning	envoy config	StreamSecrets gRPC config stream closed: 2, unable to retrieve all requested identities, missing map[default:true]
2022-05-24T11:31:13.560625Z	info	Graceful termination period complete, terminating remaining proxies.
2022-05-24T11:31:13.560658Z	warn	Aborted all epochs
2022-05-24T11:31:13.560671Z	warn	Aborting epoch 0
2022-05-24T11:31:13.560737Z	info	Epoch 0 aborted normally
2022-05-24T11:31:13.560742Z	info	Agent has successfully terminated

The httpbin pod info:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubectl.kubernetes.io/default-container: httpbin
    kubectl.kubernetes.io/default-logs-container: httpbin
    prometheus.io/path: /stats/prometheus
    prometheus.io/port: "15020"
    prometheus.io/scrape: "true"
    sidecar.istio.io/proxyImage: xxx/proxyv2-spire:1.11.4
    sidecar.istio.io/status: '{"initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["spire-agent-socket","istio-envoy","istio-data","istio-podinfo","istio-token","istiod-ca-cert"],"imagePullSecrets":null,"revision":"default"}'
  creationTimestamp: "2022-05-24T11:30:07Z"
  generateName: httpbin-8cb97b8f4-
  labels:
    app: httpbin
    pod-template-hash: 8cb97b8f4
    security.istio.io/tlsMode: istio
    service.istio.io/canonical-name: httpbin
    service.istio.io/canonical-revision: v1
    version: v1
  name: httpbin-8cb97b8f4-cq9xh
  namespace: test
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: httpbin-8cb97b8f4
    uid: d74b55af-9047-445d-8f6b-6f61acb2b299
  resourceVersion: "914430"
  uid: 733c68c1-77d9-4339-9188-57e97ee4d9cf
spec:
  containers:
  - args:
    - proxy
    - sidecar
    - --domain
    - $(POD_NAMESPACE).svc.cluster.local
    - --proxyLogLevel=warning
    - --proxyComponentLogLevel=misc:error
    - --log_output_level=default:info
    - --concurrency
    - "2"
    env:
    - name: JWT_POLICY
      value: third-party-jwt
    - name: PILOT_CERT_PROVIDER
      value: istiod
    - name: CA_ADDR
      value: istiod.istio-system.svc:15012
    - name: POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: INSTANCE_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.podIP
    - name: SERVICE_ACCOUNT
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.serviceAccountName
    - name: HOST_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.hostIP
    - name: PROXY_CONFIG
      value: |
        {"holdApplicationUntilProxyStarts":true}
    - name: ISTIO_META_POD_PORTS
      value: |-
        [
            {"containerPort":80,"protocol":"TCP"}
        ]
    - name: ISTIO_META_APP_CONTAINERS
      value: httpbin
    - name: ISTIO_META_CLUSTER_ID
      value: Kubernetes
    - name: ISTIO_META_INTERCEPTION_MODE
      value: REDIRECT
    - name: ISTIO_META_WORKLOAD_NAME
      value: httpbin
    - name: ISTIO_META_OWNER
      value: kubernetes://apis/apps/v1/namespaces/test/deployments/httpbin
    - name: ISTIO_META_MESH_ID
      value: example.org
    - name: TRUST_DOMAIN
      value: example.org
    image: xxx/proxyv2-spire:1.11.4
    imagePullPolicy: IfNotPresent
    lifecycle:
      postStart:
        exec:
          command:
          - pilot-agent
          - wait
    name: istio-proxy
    ports:
    - containerPort: 15090
      name: http-envoy-prom
      protocol: TCP
    readinessProbe:
      failureThreshold: 30
      httpGet:
        path: /healthz/ready
        port: 15021
        scheme: HTTP
      initialDelaySeconds: 1
      periodSeconds: 2
      successThreshold: 1
      timeoutSeconds: 3
    resources:
      limits:
        cpu: "2"
        memory: 1Gi
      requests:
        cpu: 10m
        memory: 40Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      privileged: false
      readOnlyRootFilesystem: true
      runAsGroup: 1337
      runAsNonRoot: true
      runAsUser: 1337
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /run/spire/sockets
      name: spire-agent-socket
      readOnly: true
    - mountPath: /var/run/secrets/istio
      name: istiod-ca-cert
    - mountPath: /var/lib/istio/data
      name: istio-data
    - mountPath: /etc/istio/proxy
      name: istio-envoy
    - mountPath: /var/run/secrets/tokens
      name: istio-token
    - mountPath: /etc/istio/pod
      name: istio-podinfo
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: httpbin-token-9vh7z
      readOnly: true
  - image: docker.io/kennethreitz/httpbin
    imagePullPolicy: IfNotPresent
    name: httpbin
    ports:
    - containerPort: 80
      protocol: TCP
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: httpbin-token-9vh7z
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - args:
    - istio-iptables
    - -p
    - "15001"
    - -z
    - "15006"
    - -u
    - "1337"
    - -m
    - REDIRECT
    - -i
    - '*'
    - -x
    - ""
    - -b
    - '*'
    - -d
    - 15090,15021,15020
    image: xxx/proxyv2-spire:1.11.4
    imagePullPolicy: IfNotPresent
    name: istio-init
    resources:
      limits:
        cpu: "2"
        memory: 1Gi
      requests:
        cpu: 10m
        memory: 40Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        add:
        - NET_ADMIN
        - NET_RAW
        drop:
        - ALL
      privileged: false
      readOnlyRootFilesystem: false
      runAsGroup: 0
      runAsNonRoot: false
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: httpbin-token-9vh7z
      readOnly: true
  nodeName: 172.16.1.18
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: httpbin
  serviceAccountName: httpbin
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - hostPath:
      path: /run/spire/sockets
      type: ""
    name: spire-agent-socket
  - emptyDir:
      medium: Memory
    name: istio-envoy
  - emptyDir: {}
    name: istio-data
  - downwardAPI:
      defaultMode: 420
      items:
      - fieldRef:
          apiVersion: v1
          fieldPath: metadata.labels
        path: labels
      - fieldRef:
          apiVersion: v1
          fieldPath: metadata.annotations
        path: annotations
    name: istio-podinfo
  - name: istio-token
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          audience: istio-ca
          expirationSeconds: 43200
          path: istio-token
  - configMap:
      defaultMode: 420
      name: istio-ca-root-cert
    name: istiod-ca-cert
  - name: httpbin-token-9vh7z
    secret:
      defaultMode: 420
      secretName: httpbin-token-9vh7z
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-05-24T11:30:08Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-05-24T11:30:07Z"
    message: 'containers with unready status: [istio-proxy]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-05-24T11:30:07Z"
    message: 'containers with unready status: [istio-proxy]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-05-24T11:30:07Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://db85e8f4e6f72c76f9ccbb609e5fcdd99bb98232534a323421dff4835fcc4758
    image: kennethreitz/httpbin:latest
    imageID: docker-pullable://kennethreitz/httpbin@sha256:599fe5e5073102dbb0ee3dbb65f049dab44fa9fc251f6835c9990f8fb196a72b
    lastState: {}
    name: httpbin
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2022-05-24T11:31:13Z"
  - containerID: docker://5611ac1fcd711e80a919c58eeb3141b903d56f61cc673e67e95b4691782de3d8
    image: xxx/proxyv2-spire:1.11.4
    imageID: docker-pullable://xxx/proxyv2-spire@sha256:ea39594c07b3bc7ede611354bfb55b8e88c3bef76b636cc8549cc49dbfd5ef2c
    lastState:
      terminated:
        containerID: docker://5611ac1fcd711e80a919c58eeb3141b903d56f61cc673e67e95b4691782de3d8
        exitCode: 0
        finishedAt: "2022-05-24T11:32:19Z"
        reason: Completed
        startedAt: "2022-05-24T11:31:14Z"
    name: istio-proxy
    ready: false
    restartCount: 1
    started: false
    state:
      waiting:
        message: back-off 10s restarting failed container=istio-proxy pod=httpbin-8cb97b8f4-cq9xh_test(733c68c1-77d9-4339-9188-57e97ee4d9cf)
        reason: CrashLoopBackOff
  hostIP: 172.16.1.18
  initContainerStatuses:
  - containerID: docker://67f5ecdaa3ca90e20c6e14ebb6d9c8d9a9b9f72bb8c1ded932007ae167095b48
    image: xxx/proxyv2-spire:1.11.4
    imageID: docker-pullable://xxx/proxyv2-spire@sha256:ea39594c07b3bc7ede611354bfb55b8e88c3bef76b636cc8549cc49dbfd5ef2c
    lastState: {}
    name: istio-init
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: docker://67f5ecdaa3ca90e20c6e14ebb6d9c8d9a9b9f72bb8c1ded932007ae167095b48
        exitCode: 0
        finishedAt: "2022-05-24T11:30:08Z"
        reason: Completed
        startedAt: "2022-05-24T11:30:07Z"
  phase: Running
  podIP: 10.168.1.45
  podIPs:
  - ip: 10.168.1.45
  qosClass: Burstable
  startTime: "2022-05-24T11:30:07Z"

The spire-agent log:

time="2022-05-24T11:36:44Z" level=warning msg="Current umask 0022 is too permissive; setting umask 0027"
time="2022-05-24T11:36:44Z" level=info msg="Starting agent with data directory: \"/run/spire\""
time="2022-05-24T11:36:44Z" level=info msg="Plugin loaded" external=false plugin_name=k8s plugin_type=WorkloadAttestor subsystem_name=catalog
time="2022-05-24T11:36:44Z" level=info msg="Plugin loaded" external=false plugin_name=unix plugin_type=WorkloadAttestor subsystem_name=catalog
time="2022-05-24T11:36:44Z" level=info msg="Plugin loaded" external=false plugin_name=k8s_sat plugin_type=NodeAttestor subsystem_name=catalog
time="2022-05-24T11:36:44Z" level=info msg="Plugin loaded" external=false plugin_name=memory plugin_type=KeyManager subsystem_name=catalog
time="2022-05-24T11:36:44Z" level=info msg="Bundle loaded" subsystem_name=attestor trust_domain_id="spiffe://example.org"
time="2022-05-24T11:36:44Z" level=debug msg="No pre-existing agent SVID found. Will perform node attestation" path=/run/spire/agent_svid.der subsystem_name=attestor
time="2022-05-24T11:36:44Z" level=info msg="SVID is not found. Starting node attestation" subsystem_name=attestor trust_domain_id="spiffe://example.org"
time="2022-05-24T11:36:44Z" level=info msg="Node attestation was successful" spiffe_id="spiffe://example.org/spire/agent/k8s_sat/demo-cluster/5df69275-2a24-48f1-8f24-84cf85ec8dc2" subsystem_name=attestor trust_domain_id="spiffe://example.org"
time="2022-05-24T11:36:44Z" level=info msg="Starting Workload and SDS APIs" subsystem_name=endpoints
time="2022-05-24T11:36:45Z" level=debug msg="Starting checker" name=agent subsystem_name=health
time="2022-05-24T11:36:45Z" level=info msg="Serving health checks" address="0.0.0.0:8080" subsystem_name=health
time="2022-05-24T11:36:46Z" level=warning msg="Failed to lookup user name by uid" error="user: unknown userid 1337" external=false plugin_name=unix plugin_type=WorkloadAttestor subsystem_name=catalog uid=1337
time="2022-05-24T11:36:46Z" level=warning msg="Failed to lookup group name by gid" error="group: unknown groupid 1337" external=false gid=1337 plugin_name=unix plugin_type=WorkloadAttestor subsystem_name=catalog
time="2022-05-24T11:36:46Z" level=warning msg="Container id not found" attempt=1 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:46Z" level=warning msg="Failed to lookup user name by uid" error="user: unknown userid 1337" external=false plugin_name=unix plugin_type=WorkloadAttestor subsystem_name=catalog uid=1337
time="2022-05-24T11:36:46Z" level=warning msg="Failed to lookup group name by gid" error="group: unknown groupid 1337" external=false gid=1337 plugin_name=unix plugin_type=WorkloadAttestor subsystem_name=catalog
time="2022-05-24T11:36:46Z" level=warning msg="Container id not found" attempt=1 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:46Z" level=warning msg="Container id not found" attempt=2 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:47Z" level=warning msg="Container id not found" attempt=2 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:47Z" level=warning msg="Container id not found" attempt=3 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:47Z" level=warning msg="Container id not found" attempt=3 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:47Z" level=warning msg="Container id not found" attempt=4 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:48Z" level=warning msg="Container id not found" attempt=4 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:48Z" level=warning msg="Container id not found" attempt=5 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:48Z" level=warning msg="Container id not found" attempt=5 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:48Z" level=warning msg="Container id not found" attempt=6 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:49Z" level=warning msg="Container id not found" attempt=6 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:49Z" level=warning msg="Container id not found" attempt=7 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:49Z" level=warning msg="Container id not found" attempt=7 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:49Z" level=debug msg="Entry created" entry=58638e67-9f90-4fbb-8f80-dcc7aacbd56e selectors_added=3 spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=cache_manager
time="2022-05-24T11:36:49Z" level=debug msg="Entry created" entry=abbd2a5c-1994-4df9-9131-2a0280608da8 selectors_added=2 spiffe_id="spiffe://example.org/ns/istio-system/sa/istio-ingressgateway-service-account" subsystem_name=cache_manager
time="2022-05-24T11:36:49Z" level=debug msg="Entry created" entry=e3b0ab8d-cb04-41e6-9d8e-9a41a2874b5c selectors_added=2 spiffe_id="spiffe://example.org/ns/default/sa/default" subsystem_name=cache_manager
time="2022-05-24T11:36:49Z" level=debug msg="Entry created" entry=a2c9da2f-c94f-47da-a002-4e695c16d055 selectors_added=2 spiffe_id="spiffe://example.org/ns/test/sa/sleep" subsystem_name=cache_manager
time="2022-05-24T11:36:49Z" level=debug msg="Entry created" entry=364f3b4a-073e-43bc-b502-5184e1f57f30 selectors_added=2 spiffe_id="spiffe://example.org/ns/test/sa/httpbin" subsystem_name=cache_manager
time="2022-05-24T11:36:49Z" level=debug msg="Renewing stale entries" count=5 limit=500 subsystem_name=manager
time="2022-05-24T11:36:49Z" level=info msg="Renewing X509-SVID" spiffe_id="spiffe://example.org/ns/test/sa/sleep" subsystem_name=manager
time="2022-05-24T11:36:49Z" level=info msg="Renewing X509-SVID" spiffe_id="spiffe://example.org/ns/test/sa/httpbin" subsystem_name=manager
time="2022-05-24T11:36:49Z" level=info msg="Renewing X509-SVID" spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=manager
time="2022-05-24T11:36:49Z" level=info msg="Renewing X509-SVID" spiffe_id="spiffe://example.org/ns/istio-system/sa/istio-ingressgateway-service-account" subsystem_name=manager
time="2022-05-24T11:36:49Z" level=info msg="Renewing X509-SVID" spiffe_id="spiffe://example.org/ns/default/sa/default" subsystem_name=manager
time="2022-05-24T11:36:49Z" level=debug msg="SVID updated" entry=58638e67-9f90-4fbb-8f80-dcc7aacbd56e spiffe_id="spiffe://example.org/ns/spire/sa/spire-agent" subsystem_name=cache_manager
time="2022-05-24T11:36:49Z" level=debug msg="SVID updated" entry=abbd2a5c-1994-4df9-9131-2a0280608da8 spiffe_id="spiffe://example.org/ns/istio-system/sa/istio-ingressgateway-service-account" subsystem_name=cache_manager
time="2022-05-24T11:36:49Z" level=debug msg="SVID updated" entry=e3b0ab8d-cb04-41e6-9d8e-9a41a2874b5c spiffe_id="spiffe://example.org/ns/default/sa/default" subsystem_name=cache_manager
time="2022-05-24T11:36:49Z" level=debug msg="SVID updated" entry=a2c9da2f-c94f-47da-a002-4e695c16d055 spiffe_id="spiffe://example.org/ns/test/sa/sleep" subsystem_name=cache_manager
time="2022-05-24T11:36:49Z" level=debug msg="SVID updated" entry=364f3b4a-073e-43bc-b502-5184e1f57f30 spiffe_id="spiffe://example.org/ns/test/sa/httpbin" subsystem_name=cache_manager
time="2022-05-24T11:36:49Z" level=warning msg="Container id not found" attempt=8 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:50Z" level=warning msg="Container id not found" attempt=8 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:50Z" level=warning msg="Container id not found" attempt=9 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:50Z" level=warning msg="Container id not found" attempt=9 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:50Z" level=warning msg="Container id not found" attempt=10 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:51Z" level=warning msg="Container id not found" attempt=10 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:51Z" level=warning msg="Container id not found" attempt=11 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:51Z" level=warning msg="Container id not found" attempt=11 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:51Z" level=warning msg="Container id not found" attempt=12 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:52Z" level=warning msg="Container id not found" attempt=12 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:52Z" level=warning msg="Container id not found" attempt=13 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:52Z" level=warning msg="Container id not found" attempt=13 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:52Z" level=warning msg="Container id not found" attempt=14 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:53Z" level=warning msg="Container id not found" attempt=14 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:53Z" level=warning msg="Container id not found" attempt=15 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:53Z" level=warning msg="Container id not found" attempt=15 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:53Z" level=warning msg="Container id not found" attempt=16 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:54Z" level=warning msg="Container id not found" attempt=16 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:54Z" level=warning msg="Container id not found" attempt=17 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:54Z" level=warning msg="Container id not found" attempt=17 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:54Z" level=warning msg="Container id not found" attempt=18 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:55Z" level=warning msg="Container id not found" attempt=18 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:55Z" level=warning msg="Container id not found" attempt=19 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:55Z" level=warning msg="Container id not found" attempt=19 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:55Z" level=warning msg="Container id not found" attempt=20 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:56Z" level=warning msg="Container id not found" attempt=20 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:56Z" level=warning msg="Container id not found" attempt=21 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:56Z" level=warning msg="Container id not found" attempt=21 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:56Z" level=warning msg="Container id not found" attempt=22 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:57Z" level=warning msg="Container id not found" attempt=22 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:57Z" level=warning msg="Container id not found" attempt=23 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:57Z" level=warning msg="Container id not found" attempt=23 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:58Z" level=warning msg="Container id not found" attempt=24 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:58Z" level=warning msg="Container id not found" attempt=24 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:58Z" level=warning msg="Container id not found" attempt=25 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:58Z" level=warning msg="Container id not found" attempt=25 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:59Z" level=warning msg="Container id not found" attempt=26 container_id=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f external=false plugin_name=k8s plugin_type=WorkloadAttestor retry_interval=500ms subsystem_name=catalog
time="2022-05-24T11:36:59Z" level=error msg="Failed to collect all selectors for PID" error="workload attestor \"k8s\" failed: rpc error: code = Canceled desc = workloadattestor(k8s): context canceled" pid=23212 subsystem_name=workload_attestor
time="2022-05-24T11:36:59Z" level=debug msg="PID attested to have selectors" pid=23212 selectors="[type:\"unix\" value:\"uid:1337\" type:\"unix\" value:\"gid:1337\"]" subsystem_name=workload_attestor
time="2022-05-24T11:36:59Z" level=error msg="Failed to attest the workload" error="rpc error: code = Unauthenticated desc = could not verify existence of the original caller: caller is no longer being watched" method=StreamSecrets pid=23212 service=SDS.v3 subsystem_name=endpoints
time="2022-05-24T11:36:59Z" level=error msg="Failed to collect all selectors for PID" error="workload attestor \"k8s\" failed: rpc error: code = Canceled desc = workloadattestor(k8s): context canceled" pid=23212 subsystem_name=workload_attestor
time="2022-05-24T11:36:59Z" level=debug msg="PID attested to have selectors" pid=23212 selectors="[type:\"unix\" value:\"uid:1337\" type:\"unix\" value:\"gid:1337\"]" subsystem_name=workload_attestor
time="2022-05-24T11:36:59Z" level=error msg="Failed to attest the workload" error="rpc error: code = Unauthenticated desc = could not verify existence of the original caller: caller is no longer being watched" method=StreamSecrets pid=23212 service=SDS.v3 subsystem_name=endpoints

@azdagron

tanjunchen avatar May 24 '22 11:05 tanjunchen

The container ID that SPIRE is trying to attest is not present in the pod status. Is it possible that the Envoy proxy is crashing on startup? That would explain why SPIRE can't find the container and why the peertracker layer is no longer able to watch the calling process. What do the envoy logs show?
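If it helps narrow things down, both can be checked directly; a sketch, using the pod name that appears later in this thread (adjust to your workload):

# does the container ID from the agent logs appear in the pod status?
kubectl -n test get pod httpbin-8cb97b8f4-cqqgm \
  -o jsonpath='{range .status.containerStatuses[*]}{.name}{": "}{.containerID}{"\n"}{end}'

# if envoy is crash-looping, the previous instance's logs should say why
kubectl -n test logs httpbin-8cb97b8f4-cqqgm -c istio-proxy --previous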

azdagron avatar May 24 '22 19:05 azdagron

@azdagron
There are envoy logs above; envoy has been waiting to get a certificate from SPIRE, so envoy and SPIRE are waiting for each other.

tanjunchen avatar May 25 '22 02:05 tanjunchen

What version of Istio is this? What environment is it deployed into (your own k8s cluster, minikube, k3s, kind, etc)? What container runtime is being used? This will help me reproduce.

If SPIRE can't find the container in any of the returned pod status, then one possibility is that SPIRE has misidentified the container ID from the cgroups.

Is it possible for you to:

  1. Open a shell in the agent container
  2. Deploy your workload
  3. Identify the envoy proxy pid
  4. cat /proc/<pid>/cgroup and paste the contents here (see the sketch below)
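A command-line sketch of those steps (the DaemonSet name and pgrep availability are assumptions; adjust to your deployment):

# 1-2. open a shell in the agent container, then deploy the workload
kubectl -n spire exec -it daemonset/spire-agent -c spire-agent -- /bin/sh

# 3. inside the agent container, identify the envoy proxy pid
pgrep -f envoy

# 4. dump the cgroup file for that pid
cat /proc/<pid>/cgroup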

azdagron avatar May 25 '22 16:05 azdagron

I was able to reproduce the error using Istio 1.14 with the configuration:

meshConfig:
  defaultConfig:
    holdApplicationUntilProxyStarts: true
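(For reference, a sketch of applying this mesh-wide at install time, assuming istioctl manages the mesh:)

istioctl install --set meshConfig.defaultConfig.holdApplicationUntilProxyStarts=true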

I see the same error on the SPIRE Agent logs, related to the attestation of the process:

time="2022-05-25T17:38:30Z" level=error msg="Failed to attest the workload" error="rpc error: code = Unauthenticated desc = could not verify existence of the original caller: caller is no longer being watched" method=StreamSecrets pid=363156 service=SDS.v3 subsystem_name=endpoints
time="2022-05-25T17:38:30Z" level=warning msg="Container id not found" attempt=3 container_id=99262a0c3ec931bea34af20df0da4bb80ce114d756a31adb816a1864ddb9ef62 external=false plugin_name=k8s plugin_type=WorkloadAttestor pod_uid=9a2534a1-bf7e-4d3a-9f70-f4273b98c73e retry_interval=500ms subsystem_name=catalog

maxlambrecht avatar May 25 '22 17:05 maxlambrecht

What version of Istio is this? What environment is it deployed into (your own k8s cluster, minikube, k3s, kind, etc)? What container runtime is being used? This will help me reproduce.

If SPIRE can't find the container in any of the returned pod status, then one possibility is that SPIRE has misidentified the container ID from the cgroups.

Is it possible for you to:

  1. Open a shell in the agent container
  2. Deploy your workload
  3. Identify the envoy proxy pid
  4. cat /proc/<pid>/cgroup and paste the contents here

@azdagron @maxlambrecht

➜  in-cluster git:(master) ✗ istioctl version
client version: 1.11.4
control plane version: 1.11.4
data plane version: 1.11.4 (3 proxies)
➜  in-cluster git:(master) ✗ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.8", GitCommit:"5575935422cc1cf5169dfc8847cb587aa47bac5a", GitTreeState:"clean", BuildDate:"2021-06-16T12:53:07Z", GoVersion:"go1.15.13", Compiler:"gc", Platform:"linux/amd64"}

level = debug msg = "PID attested to have selectors"
pid = 9103 selectors = "[type:\"unix\" value:\"uid:1337\" type:\"unix\" value:\"gid:1337\" type:\"k8s\" value:\"sa:httpbin\" type:\"k8s\" value:\"ns:test\" type:\"k8s\" value:\"node-name:172.16.1.18\" type:\"k8s\" value:\"pod-uid:6ae56970-d26b-4155-9f83-f9e258e2648b\" type:\"k8s\" value:\"pod-name:httpbin-8cb97b8f4-cqqgm\" type:\"k8s\" value:\"container-name:istio-proxy\" type:\"k8s\" value:\"pod-image-count:2\" type:\"k8s\" value:\"pod-init-image-count:1\" type:\"k8s\" value:\"container-image:docker-pullable://x/bms-mesh/proxyv2-spire@sha256:ea39594c07b3bc7ede611354bfb55b8e88c3bef76b636cc8549cc49dbfd5ef2c\" type:\"k8s\" value:\"container-image:x/bms-mesh/proxyv2-spire:1.11.4\" type:\"k8s\" value:\"pod-image:docker-pullable://kennethreitz/httpbin@sha256:599fe5e5073102dbb0ee3dbb65f049dab44fa9fc251f6835c9990f8fb196a72b\" type:\"k8s\" value:\"pod-image:kennethreitz/httpbin:latest\" type:\"k8s\" value:\"pod-image:docker-pullable://x/bms-mesh/proxyv2-spire@sha256:ea39594c07b3bc7ede611354bfb55b8e88c3bef76b636cc8549cc49dbfd5ef2c\" type:\"k8s\" value:\"pod-image:x/bms-mesh/proxyv2-spire:1.11.4\" type:\"k8s\" value:\"pod-init-image:docker-pullable://x/bms-mesh/proxyv2-spire@sha256:ea39594c07b3bc7ede611354bfb55b8e88c3bef76b636cc8549cc49dbfd5ef2c\" type:\"k8s\" value:\"pod-init-image:x/bms-mesh/proxyv2-spire:1.11.4\" type:\"k8s\" value:\"pod-label:app:httpbin\" type:\"k8s\" value:\"pod-label:pod-template-hash:8cb97b8f4\" type:\"k8s\" value:\"pod-label:security.istio.io/tlsMode:istio\" type:\"k8s\" value:\"pod-label:service.istio.io/canonical-name:httpbin\" type:\"k8s\" value:\"pod-label:service.istio.io/canonical-revision:v1\" type:\"k8s\" value:\"pod-label:version:v1\" type:\"k8s\" value:\"pod-owner:ReplicaSet:httpbin-8cb97b8f4\" type:\"k8s\" value:\"pod-owner-uid:ReplicaSet:d74b55af-9047-445d-8f6b-6f61acb2b299\"]"
subsystem_name = workload_attestor

the pid is 9103.

/opt/spire # ls -l /proc | grep 9103
/opt/spire #
Nothing; the PID is not visible in the spire-agent container.

tanjunchen avatar May 26 '22 08:05 tanjunchen

I've repro'd this and am digging in as I have time. From what I can tell, the envoy containers started in the workload pods aren't showing up in the pod listing retrieved from the kubelet. I'm still trying to figure out why.
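One way to confirm is to query the kubelet's /pods listing (the same source the attestor uses) and grep for the container ID from the agent logs; a sketch, run from inside the agent pod (the auth and port details are assumptions, adjust to your cluster):

TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
# the ID below is the one from the agent logs earlier in this thread
curl -sk -H "Authorization: Bearer $TOKEN" https://127.0.0.1:10250/pods \
  | grep -c 22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f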

azdagron avatar May 26 '22 19:05 azdagron

@azdagron I don't recall all the details, but I've noticed that the pod needs to be in a certain status for the info to be available. For example, I've seen that before the Initialized condition completes, the pod doesn't show up in the listing either. This is why I mentioned there is no way to get an identity in an init container. Hope this helps.

radoslav-tomov avatar May 27 '22 05:05 radoslav-tomov

Hmm, I haven't been able to observe that behavior. If I stand up a vanilla Kubernetes cluster (via kind 0.12.0) and deploy a workload with an init container that never terminates and is thus perpetually in an Init state, I can still gather the init container's pod UID and container ID via cgroups from the agent container and see them in the pod listing returned by the kubelet.
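For anyone who wants to repeat that test, a sketch of such a workload (names and images illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: init-stuck
spec:
  initContainers:
    - name: wait-forever   # never terminates, so the pod stays in Init
      image: busybox
      command: ["sh", "-c", "sleep 1000000"]
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1000000"]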

azdagron avatar Jun 08 '22 19:06 azdagron

@faisal-memon Could you take a look?

rturner3 avatar Jun 14 '22 19:06 rturner3

sounds good

faisal-memon avatar Jun 14 '22 19:06 faisal-memon

Was able to reproduce this. The container ID is not populated in the pod's container status while the pod is in the initializing state. Maybe init containers behave differently?
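For illustration: while the pod is initializing, the regular containers sit in a waiting state and their status carries no containerID field at all (pod name from the logs above; output trimmed):

kubectl -n test get pod httpbin-8cb97b8f4-cqqgm -o jsonpath='{.status.containerStatuses[0]}'
{"image":"...","lastState":{},"name":"istio-proxy","ready":false,"restartCount":0,"started":false,"state":{"waiting":{"reason":"PodInitializing"}}}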

faisal-memon avatar Jun 21 '22 22:06 faisal-memon

Was able to reproduce this. The container ID is not populated in the pod's container status while the pod is in the initializing state. Maybe init containers behave differently?

@faisal-memon @rturner3 how is it going?

tanjunchen avatar Jun 23 '22 11:06 tanjunchen

@faisal-memon @rturner3 how is it going?

How is it different from an init container? I'm not sure; I'm just guessing that an init container populates its container ID sooner, which would explain the difference from what @azdagron observed. There are two ways to resolve this issue:

  1. Change Kubernetes to populate the container ID sooner
  2. Remove the container ID check from SPIRE

I'm exploring option 1 right now.
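For context, the check in option 2 boils down to: the ID parsed from /proc/<pid>/cgroup must show up in some pod's container statuses. A rough command-line equivalent (container ID taken from the logs above; SPIRE actually reads this from the kubelet rather than the API server):

ID=22f9dfc2413f24f21995e602657b37fbecadd003134566a1c08dd1a4900bfb2f
kubectl get pods --all-namespaces \
  -o jsonpath='{range .items[*].status.containerStatuses[*]}{.containerID}{"\n"}{end}' \
  | grep "$ID" || echo "not in any pod status yet; this is where attestation stalls"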

faisal-memon avatar Jun 23 '22 22:06 faisal-memon

@faisal-memon Hello, I saw the issue above for the Kubernetes pod PID. It seems that changing the k8s API will not be easy any time soon. Maybe we can go with solution 2 for now, and then move to solution 1 once the k8s API supports this functionality. OK?

tanjunchen avatar Aug 06 '22 14:08 tanjunchen

IIRC, removing the container ID check is not tenable. We use that check to pinpoint the correct container in the pod and generate accurate selectors for the workload. If we cannot identify the container in the pod, then attestation is weakened.

azdagron avatar Aug 09 '22 19:08 azdagron

@tanjunchen may I ask why you want to use holdApplicationUntilProxyStarts=true? Is it something related to SPIRE, or something different?

evan2645 avatar Sep 20 '22 19:09 evan2645

@tanjunchen may I ask why you want to use holdApplicationUntilProxyStarts=true? Is it something related to SPIRE, or something different?

see https://github.com/spiffe/spire/issues/3092#issuecomment-1134783056 @evan2645

tanjunchen avatar Sep 21 '22 01:09 tanjunchen

holdApplicationUntilProxyStarts controls whether the application container or the istio-proxy container starts first.
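It can be set mesh-wide (as in the repro earlier in this thread) or per pod through the proxy config annotation; a sketch of the per-pod form:

metadata:
  annotations:
    proxy.istio.io/config: |
      holdApplicationUntilProxyStarts: true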

tanjunchen avatar Sep 21 '22 01:09 tanjunchen

Thanks @tanjunchen - yes, I understand what the feature does; I am wondering why you need it.

It doesn't look like this issue will be resolved on the k8s side any time soon. The next best thing could be to convince Istio to implement the feature differently ... that also seems like a long shot, but it's worth pursuing if we can find an alternate mechanism (one that does not lean on these lifecycle hooks).
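For background, the hold is implemented by injecting a postStart lifecycle hook into the istio-proxy container: the kubelet does not start the remaining containers until the hook returns, and in the meantime their statuses carry no container IDs, which would line up with what SPIRE sees here. A sketch of the injected fragment, from memory (exact fields may differ by Istio version):

containers:
  - name: istio-proxy
    lifecycle:
      postStart:
        exec:
          command: ["pilot-agent", "wait"]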

In the meantime, I raised https://github.com/spiffe/spire/pull/3443, which adds this as a known issue to the plugin docs.

evan2645 avatar Sep 21 '22 18:09 evan2645