Provide Kubernetes Admission Controller

Open ogarrett opened this issue 2 years ago • 6 comments

An 'Admission Controller' is a Kubernetes procedure that is run on each API call. Admission controllers can validate API calls, or modify (mutate) the call. See https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/ for more information.

ThreatMapper logic should be provided as a validating Admission Controller. This capability could be used by the administrator to prevent the deployment of pods if their constituent containers fail to meet security standards:

  • Container has too many high-severity, network-exploitable vulnerabilities
  • Container has exposed secrets that are not allow-listed

Detailed Requirements

Logging

We MUST consider how this functions in a 'lights-out' environment where the primary interaction is with Kubernetes logs. Not all affected parties will have access to the ThreatMapper console or logs. Kubernetes logs should contain all of the information necessary to monitor and troubleshoot the admission controller, including the vulnerabilities/secret rules that caused a pod deployment to be blocked.

We MUST implement a 'log only' mode that can be used to test the admission controller. This mode should inspect and score pod deployments in the standard manner, but log a high-priority alert ("Pod that fails to meet security standards is being deployed") rather than blocking the deployment. This mode will be critical for users to test the admission controller before enabling blocking. In a simple implementation, the 'log or block' mode could be a simple global setting for the admission controller (see 'fine tuning' below).
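
A minimal sketch of how the 'log or block' global setting could behave, assuming the scan has already produced a list of violation strings. The function name `decide`, the logger name, and the `log_only` flag are all hypothetical; only the alert text comes from the requirement above.

```python
import logging

# Hypothetical logger name; in a cluster this output would end up in the pod's Kubernetes logs.
logger = logging.getLogger("admission-controller")


def decide(pod_name: str, violations: list, log_only: bool) -> bool:
    """Return True if the pod deployment should be admitted.

    `violations` holds human-readable reasons (vulnerability IDs, secret rules)
    so that the log line alone is enough to troubleshoot a decision.
    """
    if not violations:
        return True
    reasons = "; ".join(violations)
    if log_only:
        # Log-only mode: score as usual, raise a high-priority alert, do not block.
        logger.critical(
            "Pod that fails to meet security standards is being deployed: %s (%s)",
            pod_name, reasons,
        )
        return True
    logger.error("Pod deployment blocked: %s (%s)", pod_name, reasons)
    return False
```

In both modes the reasons are written to the log, which is what the 'lights-out' requirement seems to call for.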

Exceptions

Some of the threats detected by ThreatMapper may be legitimate false positives. Examples:

  • Organisation has a policy to use certain software dependencies, even though they have known vulnerabilities that are detected by the ThreatMapper vulnerability scan
  • SecretScanning can identify high-severity issues (e.g. an AWS token) that may be necessary in certain circumstances and should not be blocked

We MUST allow administrators to define an allow-list of exceptions. If an exception is allowed, we SHOULD log this for audit and troubleshooting purposes.

In emergency, break glass

We SHOULD allow an administrator to enable an 'in emergency, break glass' process. If a pod deployment is blocked, a user should be able to invoke this process to override the admission controller.

For example, an administrator may define a policy that pods with the label 'deepfenceoptions=donotblock' are scanned and logged, but the deployment is not blocked.

This would allow a user to force-deploy a pod (an exceptional event).

The 'break glass' procedure MUST NOT be enabled by default. It MUST be implemented with a custom (admin-defined) label, not with a default label, so that only users in-the-know are able to invoke this procedure.
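
One way to read these two constraints, as a sketch only: the override label is an optional, admin-supplied key/value pair, and when it is unset the check can never succeed. The function name `break_glass_requested` and the `override_label` parameter are hypothetical.

```python
from typing import Optional, Tuple


def break_glass_requested(
    pod_labels: dict,
    override_label: Optional[Tuple[str, str]] = None,
) -> bool:
    """Return True only if the admin has configured an override label
    and the pod carries exactly that label key and value.

    `override_label` defaults to None, so break-glass is disabled unless
    an administrator explicitly defines a custom label.
    """
    if override_label is None:
        return False
    key, value = override_label
    return pod_labels.get(key) == value
```

A pod that matches would still be scanned and logged; only the blocking step would be skipped.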

Fine Tuning

We COULD offer a fine degree of control, whereby administrators can define multiple thresholds (number of critical or high vulnerabilities, total severity etc) and corresponding actions (BLOCK, LOG, Custom Log Message (required for routing)).

We COULD further tune these rules by associating them with selectors, so an administrator can control which rules apply to which pod deployments.

We COULD allow an administrator fine-grained control over the exceptions that are allowed, for example, by associating lists of exceptions with selectors.

ogarrett avatar Jul 06 '22 11:07 ogarrett

  • What are admission webhooks?

  • Kubernetes requires communication to webhooks be encrypted

  • webhook request and response formats

  • connecting to the webhook

  • webhook response

    • uid, copied from the request.uid sent to the webhook
    • allowed, either set to true or false
    {
       "apiVersion": "admission.k8s.io/v1",
       "kind": "AdmissionReview",
       "response": {
           "uid": "<value from request.uid>",
           "allowed": true
       }
    }
    
  • When rejecting a request, the webhook can customize the http code and message returned to the user using the status field.

    {
      "apiVersion": "admission.k8s.io/v1",
      "kind": "AdmissionReview",
      "response": {
        "uid": "<value from request.uid>",
        "allowed": false,
        "status": {
          "code": 403,
          "message": "You cannot do this because it is Tuesday and your name starts with A"
        }
      }
    }
    
  • To register validation webhooks, create ValidatingWebhookConfiguration

  • ValidatingWebhookConfiguration API object
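
The two response shapes above could be produced by a small helper like the following. This is an illustrative sketch: the function name `build_review_response` is made up, and it only covers the `uid`, `allowed` and `status` fields shown in the examples.

```python
from typing import Optional


def build_review_response(
    review: dict,
    allowed: bool,
    message: Optional[str] = None,
    code: int = 403,
) -> dict:
    """Build an AdmissionReview response for an incoming AdmissionReview request."""
    response = {
        # The uid must be copied from request.uid of the incoming review.
        "uid": review["request"]["uid"],
        "allowed": allowed,
    }
    if not allowed and message is not None:
        # Optional status lets the webhook customise the HTTP code and message.
        response["status"] = {"code": code, "message": message}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

The returned dictionary would be serialised as JSON in the body of the webhook's HTTPS reply.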

gnmahanth avatar Sep 12 '22 10:09 gnmahanth

Design Proposal

  1. Deployment

    • create an admission webhook server, either as part of the Kubernetes Deepfence agent or as a standalone admission controller
    • use the admission controller to connect to the Deepfence console, cache the required data, and serve admission responses based on that data
  2. Data Source

    • use image scan results from Console
  3. Policies

    • default rules
      • allow all and log violations
      • allow image if no critical vulnerabilities exist
      • allow/block based on allow/block list
      • allow/block based on list of vulnerabilities
    • custom rules by users
      • create policies applicable to namespace, image, registry
      • create policies with list of vulnerabilities to allow or block
  4. Controls

    • annotations can be used to ignore/enforce policy
    • types of controls ignore/enforce/audit
    • default action is to enforce policies
    • ~~deepfence.io/webhook: enforce -> enforce all policies~~
    • deepfence.io/webhook: ignore -> ignore namespace/pod/deployment
    • deepfence.io/webhook: audit -> audit and log all violations, don't block
    • ~~namespaceSelector selector can be used to target namespaces while deploying admission webhook~~
  5. Notifications

    • notify Slack or any other configured integration about a blocked image, with the reason
  6. Database Table

    • Name: admission_controller_policies
    • Columns: TBD
  7. Flow Diagram (image: admission-controller)
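
The annotation controls in item 4 could be resolved roughly as below. This is a sketch under assumptions: the proposal does not say which object wins when a pod and its namespace carry different values, so the "most specific first" ordering here is a guess, and `resolve_mode` is a hypothetical name.

```python
ANNOTATION_KEY = "deepfence.io/webhook"
# Values taken from the proposal; "enforce" is the default and needs no annotation.
OVERRIDE_MODES = {"ignore", "audit"}


def resolve_mode(*annotation_sets: dict) -> str:
    """Return "ignore", "audit" or "enforce" for a request.

    Pass annotation dicts from most specific to least specific,
    e.g. resolve_mode(pod_annotations, namespace_annotations).
    The first recognised value wins (assumed precedence); unknown
    values are skipped so that a typo cannot disable enforcement.
    """
    for annotations in annotation_sets:
        value = annotations.get(ANNOTATION_KEY)
        if value in OVERRIDE_MODES:
            return value
    return "enforce"
```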

gnmahanth avatar Sep 13 '22 11:09 gnmahanth

Workflow

  • create policies and then associate them to clusters connected with Deepfence console
  • allows users to have different policies between environments like dev/qa/test/prod

Policies

  • actions list
    • allow
    • deny
    • audit
  • actions priority
    • always evaluate deny policies before evaluating audit/allow policies
  • list of key-value for multiple conditions and actions, for example:
{"id":1,"conditions":[{"key":"namespace","value":"default"}],"action":"deny"}
{"id":2,"conditions":[{"key":"image","value":"nginx:latest"}],"action":"deny"}
{"id":3,"conditions":[{"key":"namespace","value":"kube-system"}],"action":"allow"}
{"id":4,"conditions":[{"key":"namespace","value":"default"},{"key":"image","value":"quay.io"}],"action":"allow"}
{"id":5,"conditions":[{"key":"namespace","value":"default"}],"action":"audit"}
  • all values are sub-string match
  • ~~do we need to support regex match?~~
  • associate one or more of the above policies to a cluster connected to the Deepfence console, for example:
{"cluster":1,"policies":[1,2,3]}
{"cluster":2,"policies":[4,5]}
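
To make the evaluation order concrete, here is a rough sketch using the example policies and cluster associations above. Several points are assumptions rather than part of the proposal: all conditions of a policy must match (AND), audit is ranked ahead of allow, and a request that matches no policy is allowed. The names `matches` and `evaluate` are hypothetical.

```python
POLICIES = {
    1: {"conditions": [{"key": "namespace", "value": "default"}], "action": "deny"},
    2: {"conditions": [{"key": "image", "value": "nginx:latest"}], "action": "deny"},
    3: {"conditions": [{"key": "namespace", "value": "kube-system"}], "action": "allow"},
    4: {"conditions": [{"key": "namespace", "value": "default"},
                       {"key": "image", "value": "quay.io"}], "action": "allow"},
    5: {"conditions": [{"key": "namespace", "value": "default"}], "action": "audit"},
}
CLUSTER_POLICIES = {1: [1, 2, 3], 2: [4, 5]}

# Deny is evaluated first (from the proposal); audit before allow is an assumption.
ACTION_PRIORITY = {"deny": 0, "audit": 1, "allow": 2}


def matches(policy: dict, attrs: dict) -> bool:
    """A policy matches when every condition value is a substring of the request attribute."""
    return all(c["value"] in attrs.get(c["key"], "") for c in policy["conditions"])


def evaluate(cluster_id: int, attrs: dict, default: str = "allow") -> str:
    """Return the action of the highest-priority matching policy for this cluster."""
    matched = [
        POLICIES[pid]
        for pid in CLUSTER_POLICIES.get(cluster_id, [])
        if matches(POLICIES[pid], attrs)
    ]
    if not matched:
        return default  # assumed fallback when nothing matches
    return min(matched, key=lambda p: ACTION_PRIORITY[p["action"]])["action"]
```

For example, on cluster 1 a pod in `kube-system` running `nginx:latest` matches both policy 2 (deny) and policy 3 (allow), and the deny wins.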

gnmahanth avatar Sep 14 '22 10:09 gnmahanth

  • Policy when vulnerability, threshold or allow/block list are defined
  • for example, allow images with no critical CVEs in the default namespace
{
  "id": 5,
  "conditions": [
    {
      "key": "namespace",
      "value": "default"
    }
  ],
  "vulnerabilities": {
    "type": "cve",
    "severity": "critical", // optional not used when one of allow/deny list is provided
    "threshold": 0, // optional not used when one of allow/deny list is provided
    "allow_list": [], // optional list of vulnerabilities that are allowed, not used when severity and threshold are defined
    "deny_list": [] // optional list of vulnerabilities that are not allowed,  not used when severity and threshold are defined
  },
  "action": "deny"
}
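
A sketch of how the `vulnerabilities` block might be checked against scan results. The semantics are an interpretation of the comments above, not a confirmed design: a non-empty deny list is violated by any listed finding, a non-empty allow list is violated by any finding outside it, and otherwise the count at the given severity is compared with the threshold. The function name `violates` and the finding shape (`id`, `severity`) are hypothetical.

```python
def violates(rule: dict, findings: list) -> bool:
    """Return True if the scan findings trigger the policy's vulnerability rule.

    `findings` is a list of dicts such as {"id": "CVE-2021-44228", "severity": "critical"}.
    """
    deny = set(rule.get("deny_list") or [])
    allow = set(rule.get("allow_list") or [])
    found_ids = {f["id"] for f in findings}

    if deny or allow:
        # Lists take over; severity and threshold are not used in this mode.
        if deny & found_ids:
            return True
        if allow:
            return bool(found_ids - allow)
        return False

    # Threshold mode: count findings at the configured severity.
    count = sum(1 for f in findings if f["severity"] == rule["severity"])
    return count > rule.get("threshold", 0)
```

With the example policy (severity `critical`, threshold `0`, action `deny`), a single critical CVE in an image makes the rule fire and the pod in the `default` namespace is denied.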

gnmahanth avatar Sep 14 '22 11:09 gnmahanth

Tasks:

  • Database
    • admission_controller_policies
      • id - primary key
      • name - string, policy name
      • conditions - json field
      • vulnerabilities - json field
    • ac_policies_associations
      • id - primary key
      • cluster_id/name - string
      • policies - foreign key referring to policy id
  • API server
    • add APIs for CRUD operations on policies
    • add APIs for associating policies with clusters
    • add API for the agent to get the policies associated with a cluster
    • add API for the agent to get all vulnerabilities for an image
  • Agent
    • add new admission controller
    • sync policies from console
    • fetch image vulnerability data from console
    • provide webhook for admission controller
  • UI
    • add option to create new policies
    • add option to associate policies to cluster

gnmahanth avatar Sep 14 '22 12:09 gnmahanth

development branch: https://github.com/deepfence/ThreatMapper/tree/admission-webhook

gnmahanth avatar Dec 02 '22 05:12 gnmahanth