Using Gatekeeper to Lock Down Google Config Connector (and other CRDs)

Open james-mcgoodwin opened this issue 4 years ago • 5 comments

I am trying to set up Gatekeeper (GK) to govern which GCP resources our developers are allowed to create via the Google Config Connector controller.

This controller interacts with GCP using a Google service account that is typically an owner (i.e. the most powerful privilege) of the project it is allowed to administer. That insane permission allowance comes straight from their own installation documentation.

So I want to use OPA Gatekeeper to restrict that for obvious reasons. I can take the other approach of restricting the Config Connector user in GCP, but I'd like to try Gatekeeper first.

My objective then is to specify several different 'allowed' Config Connector resource types. Only resources made to specific APIs and Types should be allowed through. All other Config Connector resource types should be denied. All OTHER resource types should not be blocked.

Config Connector details

  • Config Connector is installed as a huge set of CRDs
$  kubectl get crds | grep cnrm | wc -l
      84
  • Each CRD shares a similar API name, always containing cnrm.cloud.google.com
$  kubectl get crds | grep cnrm | tail -n 5
sqlusers.sql.cnrm.cloud.google.com                                              2020-04-17T19:52:51Z
storagebucketaccesscontrols.storage.cnrm.cloud.google.com                       2020-04-17T19:52:51Z
storagebuckets.storage.cnrm.cloud.google.com                                    2020-04-17T19:52:51Z
storagedefaultobjectaccesscontrols.storage.cnrm.cloud.google.com                2020-04-17T19:52:52Z
storagenotifications.storage.cnrm.cloud.google.com

Approach

My approach is the template and constraint shown in the next section.

My objective is for Gatekeeper to allow resources of kind: ComputeURLMap while blocking all other Config Connector resource types, without blocking anything else.

(Never mind that it's not especially useful to permit only ComputeURLMap and nothing else; it's a handy example here.)

Gatekeeper Template and Constraint

I have applied the following template and constraint to get the above outcome (template.yaml)

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: allowedcnrms
spec:
  crd:
    spec:
      names:
        kind: AllowedCNRMs
        listKind: AllowedCNRMsList
        plural: allowedcnrms
        singular: allowedcnrm
      validation:
        # Schema for the `parameters` field
        openAPIV3Schema:
          properties:
            kinds:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package allowedcnrm

        violation[{"msg": msg, "details": {"Resource denied": api}}] {
          api := input.review.object.kind
          allowedapis := [good | okapis = input.parameters.kinds[_] ; good = contains(api ,okapis) ]
          not any(allowedapis)
          msg := sprintf("You may not request a config connector resource for %v", [api])
        }

(constraint.yaml)

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: AllowedCNRMs
metadata:
  name: allow-only-needed-config-controller-apis
spec:
  enforcementAction: deny
  match:
    kinds:
      - apiGroups: ["*.cnrm.cloud.google.com"]
  parameters:
    kinds:
      - "ComputeURLMap"

Resource I expect to be rejected

I then apply this resource, which creates a GCE disk, to Kubernetes, expecting it to be denied:

---
apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeDisk
metadata:
  name: test-persistent-disk-us-west2
  namespace: testing-namespace
  annotations:
    cnrm.cloud.google.com/project-id: my-gcp-project
spec:
  #description: A 200GB na-ne1 regional disk for testing
  location: us-west2
  replicaZones:
    - "us-west2-a"
    - "us-west2-b"
  size: 200

input.review.object of the above resource

I checked whether I have a problem with the document structure I'm constraining against by using the deny-all method in the README.md. Judging by the output below, I am referencing the correct paths in the input.review.object document:

input_review_object.json

{
  "userInfo": {
    "username": "<redacted>",
    "groups": [
      "system:authenticated"
    ],
    "extra": {
      "user-assertion.cloud.google.com": [
        "<redacted>"
      ]
    }
  },
  "object": {
    "apiVersion": "compute.cnrm.cloud.google.com/v1beta1",
    "kind": "ComputeDisk",
    "metadata": {
      "annotations": {
        "cnrm.cloud.google.com/management-conflict-prevention-policy": "resource",
        "cnrm.cloud.google.com/project-id": "my-gcp-project",
        "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"compute.cnrm.cloud.google.com/v1beta1\",\"kind\":\"ComputeDisk\",\"metadata\":{\"annotations\":{\"cnrm.cloud.google.com/project-id\":\"my-gcp-project\"},\"name\":\"test-persistent-disk-us-west2\",\"namespace\":\"testing-namespace\"},\"spec\":{\"location\":\"us-west2\",\"replicaZones\":[\"us-west2-a\",\"us-west2-b\"],\"size\":200}}\n"
      },
      "creationTimestamp": "2020-04-20T20:22:54Z",
      "generation": 1,
      "name": "test-persistent-disk-us-west2",
      "namespace": "testing-namespace",
      "uid": "b742dcc5-8344-11ea-b83d-42010aea6506"
    },
    "spec": {
      "replicaZones": [
        "us-west2-a",
        "us-west2-b"
      ],
      "size": 200,
      "location": "us-west2"
    }
  },
  "dryRun": false,
  "_unstable": {
    "namespace": {
      "spec": {
        "finalizers": [
          "kubernetes"
        ]
      },
      "status": {
        "phase": "Active"
      },
      "kind": "Namespace",
      "apiVersion": "v1",
      "metadata": {
        "name": "testing-namespace",
        "selfLink": "/api/v1/namespaces/testing-namespace",
        "uid": "a43000ff-80e5-11ea-8533-42010aea6508",
        "resourceVersion": "7534",
        "creationTimestamp": "2020-04-17T19:57:17Z"
      }
    }
  },
  "resource": {
    "resource": "computedisks",
    "group": "compute.cnrm.cloud.google.com",
    "version": "v1beta1"
  },
  "namespace": "testing-namespace",
  "operation": "CREATE",
  "oldObject": null,
  "options": null,
  "uid": "b742f844-8344-11ea-b83d-42010aea6506",
  "kind": {
    "group": "compute.cnrm.cloud.google.com",
    "version": "v1beta1",
    "kind": "ComputeDisk"
  }
}

Result

Gatekeeper does not prevent this resource. I know my GK installation is working because I have a different constraint that IS working when I test it.

The above resource does not even appear in the audit:

{"level":"info","ts":1587478231.7550147,"logger":"controller","msg":"handling constraint update","process":"constraint_controller","instance":{"apiVersion":"constraints.gatekeeper.sh/v1beta1","kind":"AllowedCNRMs","metadata":{"annotations":{"kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"constraints.gatekeeper.sh/v1beta1\",\"kind\":\"AllowedCNRMs\",\"metadata\":{\"annotations\":{},\"name\":\"allow-only-needed-config-controller-apis\"},\"spec\":{\"enforcementAction\":\"deny\",\"match\":{\"kinds\":[{\"apiGroups\":[\"compute.cnrm.cloud.google.com\"]}]},\"parameters\":{\"kinds\":[\"ComputeURLMap\"]}}}\n"},"creationTimestamp":"2020-04-21T13:27:15Z","generation":1,"name":"allow-only-needed-config-controller-apis","resourceVersion":"1384336","selfLink":"/apis/constraints.gatekeeper.sh/v1beta1/allowedcnrms/allow-only-needed-config-controller-apis","uid":"d1225ec2-83d3-11ea-b83d-42010aea6506"},"spec":{"enforcementAction":"deny","match":{"kinds":[{"apiGroups":["compute.cnrm.cloud.google.com"]}]},"parameters":{"kinds":["ComputeURLMap"]}},"status":{"auditTimestamp":"2020-04-21T14:10:21Z","byPod":[{"enforced":true,"id":"gatekeeper-controller-manager-675dd8db8-464ts","observedGeneration":1},{"enforced":true,"id":"gatekeeper-audit","observedGeneration":1}],"totalViolations":0}}}

Questions

  1. Is what I'm trying to do possible?
  2. Are wildcards allowed in the constraint.yaml? ie:
spec:
  enforcementAction: deny
  match:
    kinds:
      - apiGroups: ["*.cnrm.cloud.google.com"]
  3. Have I missed any question I should be asking?

Thanks very much for reading this far, it's a bunch, but I've been bashing my head against this for a couple of days now. Any help offered would be really appreciated!

james-mcgoodwin avatar Apr 21 '20 14:04 james-mcgoodwin

  1. What you are trying to do should be possible.

You should know, though, that Gatekeeper currently fails open while we build out our reliability story, so that it does not become a bottleneck for cluster availability. You can configure the ValidatingWebhookConfiguration to fail closed, but other issues may still keep enforcement from being 100% (e.g. a constraint not being enforced during cache warmup). Missed violations would be expected to show up in audit. Currently, restricting the service account would likely yield stronger enforcement, though regular auditing is always recommended. More info about fail open.

  2. Wildcards are not possible in the match semantics (other than a lone "*" to match everything). Since glob-style matching doesn't appear to impact query optimization, it may be worth considering adding it, though that would cause us to diverge from Role/ClusterRole semantics. In the interim, you could build such kind matching into your ConstraintTemplate.

Here is how we implement our GroupKind matching logic:

https://github.com/open-policy-agent/gatekeeper/blob/c4c443ee56a7ad56e2486800f5a5d9f3832cb405/pkg/target/regolib/src.rego#L120-L149
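To illustrate why a "*.cnrm.cloud.google.com" entry never matches, here is a rough Python model of the exact-match semantics described above. It is only a sketch for illustration: the function names and the dict layout are assumptions, and the real logic is the Rego at the link, not this code.

```python
def field_matches(allowed, actual):
    """True if the list holds a lone "*" or the exact value."""
    return any(entry == "*" or entry == actual for entry in allowed)


def kind_selector_matches(selector, group, kind):
    """Model of one `match.kinds` entry: group AND kind must both match."""
    return (field_matches(selector["apiGroups"], group)
            and field_matches(selector["kinds"], kind))


disk = ("compute.cnrm.cloud.google.com", "ComputeDisk")

# A "*" embedded in a longer string is compared literally, so it never matches.
glob_selector = {"apiGroups": ["*.cnrm.cloud.google.com"], "kinds": ["*"]}
print(kind_selector_matches(glob_selector, *disk))   # False

# Spelling the group out in full does match.
exact_selector = {"apiGroups": ["compute.cnrm.cloud.google.com"], "kinds": ["*"]}
print(kind_selector_matches(exact_selector, *disk))  # True
```

Under this model, the only ways to cover every Config Connector group from the constraint are to list each group explicitly or to match everything with a lone "*" and filter inside the template.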

maxsmythe avatar Apr 22 '20 03:04 maxsmythe

I have a few config connector policies so figured I'd throw in my two cents.

I just keep the list in the constraint up to date myself. You can run: kg crds | grep cnrm | awk '{print $1}' | cut -d'.' -f2- | sort | uniq | wc -l (which can probably be shortened, but I'm not claiming to be a shell expert) and you'll see the list is much shorter than the nearly 100 resources (I currently have 29 API groups). After that you just wildcard all kinds.
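For anyone who would rather script that step, the pipeline above boils down to dropping the first dot-separated segment of each CRD name and de-duplicating. A minimal Python sketch follows, using the CRD names from the original post as sample input; the helper name `api_groups` and the `selector` dict are made up for illustration.

```python
def api_groups(crd_names):
    """Strip the leading resource segment (like `cut -d'.' -f2-`) and dedupe."""
    return sorted({name.split(".", 1)[1] for name in crd_names})


crds = [
    "sqlusers.sql.cnrm.cloud.google.com",
    "storagebucketaccesscontrols.storage.cnrm.cloud.google.com",
    "storagebuckets.storage.cnrm.cloud.google.com",
    "storagedefaultobjectaccesscontrols.storage.cnrm.cloud.google.com",
    "storagenotifications.storage.cnrm.cloud.google.com",
]

groups = api_groups(crds)
print(groups)
# ['sql.cnrm.cloud.google.com', 'storage.cnrm.cloud.google.com']

# The resulting list is what goes into the constraint's match.kinds entry,
# with every kind wildcarded.
selector = {"apiGroups": groups, "kinds": ["*"]}
```

Five CRDs collapse into two API groups here, which is why the explicit list stays manageable.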

And the template option is not too crazy either (we make everything have matching though.) Something like:

# Setup kind
provided_kind := input.review.kind.kind
kind_regex := "^ComputeURLMap$" # in case you want to do ORs
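# Setup group (it lives under input.review.kind; input.review.object.kind is a plain string)
provided_group := input.review.kind.group
group_regex := `^.*\.cnrm\.cloud\.google\.com$` # raw string so the dots are literal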
# Only cnrm resources
re_match(group_regex, provided_group)
# Deny all except ones we specify
not re_match(kind_regex, provided_kind)
msg := "Only some cnrm resources are allowed"
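The decision that snippet encodes can be sanity-checked outside the cluster. Below is a small Python model of it, with the dots in the group pattern escaped. It is a sketch only: `is_violation` is a hypothetical helper, and Python's `re` stands in for the regex engine OPA uses, which is fine for patterns this simple but is not the same implementation.

```python
import re

GROUP_REGEX = r"^.*\.cnrm\.cloud\.google\.com$"
KIND_REGEX = r"^ComputeURLMap$"  # add alternatives with "|" to allow more kinds


def is_violation(group, kind):
    """Deny Config Connector resources whose kind is not explicitly allowed."""
    is_cnrm = re.match(GROUP_REGEX, group) is not None
    kind_allowed = re.match(KIND_REGEX, kind) is not None
    return is_cnrm and not kind_allowed


print(is_violation("compute.cnrm.cloud.google.com", "ComputeDisk"))    # True
print(is_violation("compute.cnrm.cloud.google.com", "ComputeURLMap"))  # False
print(is_violation("apps", "Deployment"))                              # False
```

The three cases correspond to the original objective: other Config Connector kinds are denied, the allowed kind passes, and resources outside the cnrm groups are left alone.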

snuggie12 avatar Jan 20 '21 04:01 snuggie12

@snuggie12 : I have faced a similar issue: we want to whitelist multiple AWS ECR registries. For us, the account ID and region have to be dynamic, because this will be applied across multiple environments and scenarios, so I am forced to use a wildcard for my registries: *.dkr.ecr.*.amazonaws.com

As you might have guessed, OPA does not recognize the wildcard and allows all images (ECR and non-ECR). Any thoughts?

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: repo-is-openpolicyagent
spec:
  enforcementAction: dryrun
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod", "job"]
    namespaces:
      - "default"
  parameters:
    repos:
      - "*.dkr.ecr.*.amazonaws.com/" 

shomeprasanjit avatar Jul 12 '21 18:07 shomeprasanjit

@shomeprasanjit I'm not a maintainer or anything, but you can use regex instead of starts with or whatever:

# I wrote this on the fly. not promising it will work.
# You can use regex101.com to test and you can google DNS regex patterns instead of my weak ".+"
"^.+\.dkr\.ecr\..+\.amazonaws\.com/.+$"
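A quick way to try a pattern like this before wiring it into a template is to run it over a few image names. The Python sketch below does that; the image names are invented, `STRICT` is just one hypothetical tightening (it assumes a 12-digit account ID and a lowercase region name), and the template's Rego would still have to evaluate the pattern as a regex rather than as a prefix.

```python
import re

# The pattern from the comment above, unchanged.
LOOSE = re.compile(r"^.+\.dkr\.ecr\..+\.amazonaws\.com/.+$")
# Hypothetical stricter variant: 12-digit account ID, lowercase region.
STRICT = re.compile(r"^\d{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com/.+$")


def is_ecr_image(image, pattern=LOOSE):
    """True if the image reference matches the given registry pattern."""
    return pattern.match(image) is not None


ecr = "123456789012.dkr.ecr.us-east-1.amazonaws.com/team/app:1.0"
hub = "docker.io/library/nginx:latest"
sneaky = "evil.example/x.dkr.ecr.y.amazonaws.com/z"

print(is_ecr_image(ecr), is_ecr_image(hub))  # True False
# The loose ".+" also accepts a host that merely contains the ECR suffix in its path.
print(is_ecr_image(sneaky))                  # True
print(is_ecr_image(sneaky, STRICT))          # False
```

The last two lines show why swapping the ".+" for something more specific is worth the effort.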

snuggie12 avatar Jul 12 '21 22:07 snuggie12

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 23 '22 06:07 stale[bot]
