failed to run Match criteria: namespace selector for namespace-scoped object but missing Namespace

Open btwseeu78 opened this issue 1 year ago • 5 comments

Gatekeeper constraints are giving an unusual error during evaluation.

Controller version: "v3.9.0"; the CRD version matches the controller. GKE version: 1.24.3-gke.2100

The process is a bit unusual.

Constraint template

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredprorityclass
  annotations:
    descriptions: >-
      required to restrict project from using infra and other cluster critical namespaces
spec:
  crd:
    spec:
      names:
        kind: k8sRequiredProrityClass
      validation:
        openAPIV3Schema:
          type: object
          properties:
            allowedclassnames:
              type: array
              description: Allowed PriorityClass
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredprorityclass
        pod_name = input.review.object.metadata.name
        violation[{"msg" : msg}] {
          namespace := input.review.object.metadata.namespace
          satisfiedclass := [name | input.parameters.allowedclassnames[i] == input.review.object.spec.priorityClassName; name := input.review.object.spec.priorityClassName]
          not count(satisfiedclass) > 0
          msg := sprintf("The provided priority class  - %v - is not allowed in this namespace -  %v - for pod - %v -",[input.review.object.spec.priorityClassName,namespace,pod_name])
        }
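
For readers less familiar with Rego, the rule above can be read roughly as the Python sketch below. This is only an illustration of the intended logic, not part of the template: the function name `find_violations` and the plain-dict pod layout are assumptions, and the early return for a missing `priorityClassName` reflects my reading of how Rego treats undefined values (the `sprintf` call would be undefined, so the rule would not fire).

```python
from typing import List, Optional


def find_violations(pod: dict, allowed_class_names: List[str]) -> List[str]:
    """Return violation messages for a pod, mirroring the Rego rule's intent.

    Hypothetical helper for illustration only; Gatekeeper evaluates Rego,
    not Python.
    """
    metadata = pod.get("metadata", {})
    priority_class: Optional[str] = pod.get("spec", {}).get("priorityClassName")

    # Assumption: with no priorityClassName the Rego rule body is undefined,
    # so no violation is produced.
    if priority_class is None:
        return []

    # Equivalent of building `satisfiedclass` and checking that it is non-empty.
    if priority_class in allowed_class_names:
        return []

    msg = (
        "The provided priority class - %s - is not allowed in this "
        "namespace - %s - for pod - %s -"
        % (priority_class, metadata.get("namespace"), metadata.get("name"))
    )
    return [msg]
```

Calling it with a pod whose `spec.priorityClassName` is absent from the allowed list returns a single message; any allowed class returns an empty list.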

Constraint

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: k8sRequiredProrityClass
metadata:
  name: priorityconstrain
spec:
  enforcementAction: warn
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaceSelector:
      matchLabels:
        platform.test.com/user: "true"
    excludedNamespaces:
      - "kube-*"
    
  parameters:
    allowedclassnames:
      - "sta"
      - "crit"
      - "sta"

And the error in the constraint status:

Status:
  Audit Timestamp:  2022-09-30T11:53:28Z
  By Pod:
    Constraint UID:       ff136ffc-4dc1-4155-81e2-d7d638d51fef
    Enforced:             true
    Id:                   gatekeeper-audit-6bd76968b4-9dxd8
    Observed Generation:  1
    Operations:
      audit
      mutation-status
      status
    Constraint UID:       ff136ffc-4dc1-4155-81e2-d7d638d51fef
    Enforced:             true
    Id:                   gatekeeper-controller-manager-5d88d5f88c-2vb48
    Observed Generation:  1
    Operations:
      mutation-webhook
      webhook
    Constraint UID:       ff136ffc-4dc1-4155-81e2-d7d638d51fef
    Enforced:             true
    Id:                   gatekeeper-controller-manager-5d88d5f88c-85j5l
    Observed Generation:  1
    Operations:
      mutation-webhook
      webhook
    Constraint UID:       ff136ffc-4dc1-4155-81e2-d7d638d51fef
    Enforced:             true
    Id:                   gatekeeper-controller-manager-5d88d5f88c-n74hm
    Observed Generation:  1
    Operations:
      mutation-webhook
      webhook
  Total Violations:  1
  Violations:
    Enforcement Action:  warn
    Group:
    Kind:                Pod
    Message:             unable to match constraints: error matching the requested object: failed to run Match criteria: namespace selector for namespace-scoped object but missing Namespace
    Name:                gatekeeper-audit-6bd76968b4-9dxd8
    Namespace:           gatekeeper-system
    Version:             v1
Events:                  <none>

There are similar errors for some kube-system objects as well.

I'm not able to replicate this with kind; kind works just fine.

But you might be able to help point out where this is coming from.

btwseeu78 avatar Sep 30 '22 13:09 btwseeu78

I'm not able to replicate the error in a Kind cluster.

That error shows up if, for some reason, Gatekeeper's match logic doesn't have a copy of the Namespace the object is in, which it needs in order to evaluate the namespace selector.
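
As a rough illustration of that failure mode, here is a minimal Python sketch. It is not Gatekeeper's actual Go code: `matches_namespace_selector` and `MatchError` are hypothetical names, only `matchLabels` is handled, and cluster-scoped objects are out of scope. The point is just that a namespace-scoped object cannot be matched against a namespace selector unless the caller also supplies the Namespace object.

```python
from typing import Optional


class MatchError(Exception):
    """Raised when the match criteria cannot be evaluated (hypothetical)."""


def matches_namespace_selector(
    obj: dict, namespace: Optional[dict], match_labels: dict
) -> bool:
    """Check a namespace-scoped object against a matchLabels selector.

    `namespace` is the Namespace object that `obj` lives in. If the caller
    did not provide it, the selector cannot be evaluated at all.
    """
    if namespace is None:
        raise MatchError(
            "namespace selector for namespace-scoped object but missing Namespace"
        )
    labels = namespace.get("metadata", {}).get("labels") or {}
    # Every selector label must be present on the Namespace with the same value.
    return all(labels.get(key) == value for key, value in match_labels.items())
```

With the Namespace supplied, the call returns True or False depending on its labels; with `None`, it raises the same kind of error that appears in the constraint status above.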

Are you using --audit-from-cache? It looks like that code may need updating to provide the namespace as appropriate:

https://github.com/open-policy-agent/gatekeeper/blob/35b9cbd0049d7c586d25acad8de6d6fd70128ed1/pkg/audit/manager.go#L456-L483

If you are using --audit-from-cache, does disabling that flag fix the error?

maxsmythe avatar Oct 01 '22 00:10 maxsmythe

Same for me: it's happening in my prod cluster, but I'm not able to replicate it in kind.

btwseeu78 avatar Oct 01 '22 05:10 btwseeu78

I will try your suggestion.

btwseeu78 avatar Oct 01 '22 05:10 btwseeu78

Yes, @maxsmythe, disabling audit from cache does solve my problem. I will still keep an eye on it to make sure no violations are being missed in the reports.

Though I don't have much of an idea how to help people reproduce this issue.

btwseeu78 avatar Oct 07 '22 09:10 btwseeu78

Since disabling audit from cache fixes the issue, I think that validates my theory above. This should be fixable, but as a mitigation, I'd suggest disabling the flag if possible.

maxsmythe avatar Oct 07 '22 21:10 maxsmythe

So disabling the cache fixed the issue, but I tried to check some more to validate it.

It can be reproduced with these changes to the audit pod:

spec:
      automountServiceAccountToken: true
      containers:
      - args:
        - --operation=audit
        - --operation=status
        - --operation=mutation-status
        - --logtostderr
        - --disable-opa-builtin={http.send}
        - --disable-cert-rotation
        - --audit-interval=60
        - --log-level=INFO
        - --constraint-violations-limit=20
        - --audit-from-cache=true
        - --audit-chunk-size=500
        - --audit-match-kind-only=false
        - --emit-audit-events=false

And create a Config with this configuration:

apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  labels:
    argocd.argoproj.io/instance: gatekeeper
  name: config
  namespace: gatekeeper-system
spec:
  match:
    - excludedNamespaces:
        - kube-*
        - gatekeeper-system
      processes:
        - '*'
    - excludedNamespaces:
        - audit-excluded-ns
      processes:
        - audit
    - excludedNamespaces:
        - audit-webhook-sync-excluded-ns
      processes:
        - audit
        - webhook
        - sync
    - excludedNamespaces:
        - mutation-excluded-ns
      processes:
        - mutation-webhook
  readiness:
    statsEnabled: true
  sync:
    syncOnly:
      - group: ''
        kind: Namespace
        version: v1
      - group: networking.k8s.io
        kind: Ingress
        version: v1
      - group: apps
        kind: Deployment
        version: v1
      - group: ''
        kind: Pod
        version: v1

With these changes to the default installation, I'm able to reproduce the issue, and now I have one more new issue.

First issue: the audit pods stopped reporting events, even after long delays (though warn, deny, and dryrun work just fine, as expected).

Now the constraint status:

NAME                     ENFORCEMENT-ACTION   TOTAL-VIOLATIONS
priorityconstrain-test   warn                 0

Audit pod logs:
{"level":"info","ts":1665484760.973651,"logger":"controller","msg":"Audit opa.Audit() results","process":"audit","audit_id":"2022-10-11T10:39:20Z","violations":0}
{"level":"info","ts":1665484760.9737055,"logger":"controller","msg":"closing the previous audit reporting thread","process":"audit","audit_id":"2022-10-11T10:39:20Z"}
{"level":"info","ts":1665484760.9737248,"logger":"controller","msg":"auditing is complete","process":"audit","audit_id":"2022-10-11T10:39:20Z","event_type":"audit_finished"}
{"level":"info","ts":1665484760.9737687,"logger":"controller","msg":"constraint","process":"audit","audit_id":"2022-10-11T10:39:20Z","resource kind":"k8sRequiredProrityClass"}
{"level":"info","ts":1665484760.9790084,"logger":"controller","msg":"constraint","process":"audit","audit_id":"2022-10-11T10:39:20Z","count of constraints":1}
{"level":"info","ts":1665484760.9790623,"logger":"controller","msg":"starting update constraints loop","process":"audit","audit_id":"2022-10-11T10:39:20Z","constraints to update":"map[{constraints.gatekeeper.sh k8sRequiredProrityClass v1beta1  priorityconstrain-test}:{}]"}
{"level":"info","ts":1665484760.9809113,"logger":"controller","msg":"updating constraint status","process":"audit","audit_id":"2022-10-11T10:39:20Z","constraintName":"priorityconstrain-test"}
{"level":"info","ts":1665484760.9899497,"logger":"controller","msg":"handling constraint update","process":"constraint_controller","instance":{"apiVersion":"constraints.gatekeeper.sh/v1beta1","kind":"k8sRequiredProrityClass","name":"priorityconstrain-test"}}
{"level":"info","ts":1665484820.3655608,"logger":"controller","msg":"auditing constraints and violations","process":"audit","audit_id":"2022-10-11T10:40:20Z","event_type":"audit_started"}
{"level":"info","ts":1665484820.9743826,"logger":"controller","msg":"Auditing from cache","process":"audit","audit_id":"2022-10-11T10:40:20Z"}
{"level":"info","ts":1665484820.97442,"logger":"controller","msg":"Audit opa.Audit() results","process":"audit","audit_id":"2022-10-11T10:40:20Z","violations":0}
{"level":"info","ts":1665484820.9744365,"logger":"controller","msg":"closing the previous audit reporting thread","process":"audit","audit_id":"2022-10-11T10:40:20Z"}
{"level":"info","ts":1665484820.9744458,"logger":"controller","msg":"auditing is complete","process":"audit","audit_id":"2022-10-11T10:40:20Z","event_type":"audit_finished"}
{"level":"info","ts":1665484820.9746115,"logger":"controller","msg":"constraint","process":"audit","audit_id":"2022-10-11T10:40:20Z","resource kind":"k8sRequiredProrityClass"}
{"level":"info","ts":1665484820.9786556,"logger":"controller","msg":"constraint","process":"audit","audit_id":"2022-10-11T10:40:20Z","count of constraints":1}
{"level":"info","ts":1665484820.9787111,"logger":"controller","msg":"starting update constraints loop","process":"audit","audit_id":"2022-10-11T10:40:20Z","constraints to update":"map[{constraints.gatekeeper.sh k8sRequiredProrityClass v1beta1  priorityconstrain-test}:{}]"}
{"level":"info","ts":1665484820.9811964,"logger":"controller","msg":"updating constraint status","process":"audit","audit_id":"2022-10-11T10:40:20Z","constraintName":"priorityconstrain-test"}
{"level":"info","ts":1665484820.989005,"logger":"controller","msg":"handling constraint update","process":"constraint_controller","instance":{"apiVersion":"constraints.gatekeeper.sh/v1beta1","kind":"k8sRequiredProrityClass","name":"priorityconstrain-test"}}

btwseeu78 avatar Oct 11 '22 10:10 btwseeu78