Chaos controller for Kubernetes

Chaos Controller

The Chaos Controller provides a controller for chaos testing in Kubernetes, supporting a rich set of failure scenarios. It relies on Linux and Docker functionality to inject network partitions and stress nodes.

  • Setup
    • Helm
    • Manual setup
  • Usage
    • Scheduling
    • Selectors
    • Crash monkey
    • Partition monkey
    • Stress monkey
  • Architecture
    • Controller
      • Crash controller
      • Partition controller
      • Stress controller
    • Workers
      • Crash workers
      • Partition workers
      • Stress workers

Setup

Helm

A Helm chart is provided for setting up the controller. To deploy the controller, run helm install helm from the project root:

helm install helm

When the chart is installed, the following custom resources will be added to the cluster:

  • ChaosMonkey
  • Crash
  • NetworkPartition
  • Stress

The ChaosMonkey resource is the primary resource provided by the controller. The remaining custom resources are used by the controller to inject specific failures into pods.

The chart supports overrides for both the controller and workers. The controller is deployed as a Deployment, and the workers as a DaemonSet.
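
For example, to install the chart with overridden values (the values file name here is illustrative; see the chart's values.yaml for the available keys):

helm install helm -f my-values.yaml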

Manual Setup

Before running the controller, register the custom resources:

$ kubectl create -f deploy/chaosmonkey.yaml
$ kubectl create -f deploy/crash.yaml
$ kubectl create -f deploy/networkpartition.yaml
$ kubectl create -f deploy/stress.yaml
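
To verify that the definitions were registered, list the custom resource definitions for the chaos.atomix.io group:

$ kubectl get crd | grep chaos.atomix.io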

Setup RBAC and deploy the controller:

$ kubectl create -f deploy/service_account.yaml
$ kubectl create -f deploy/role.yaml
$ kubectl create -f deploy/role_binding.yaml
$ kubectl create -f deploy/controller.yaml

Deploy the workers:

$ kubectl create -f deploy/workers.yaml
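
Once everything is deployed, the controller should appear as a single pod and the workers as one pod per node:

$ kubectl get pods -o wide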

Example ChaosMonkey resources can be found in the example directory:

$ kubectl create -f example/crash_monkey.yaml
$ kubectl create -f example/partition_monkey.yaml
$ kubectl create -f example/stress_monkey.yaml

Usage

The chaos controller provides a full suite of tools for chaos testing, injecting a variety of failures into k8s nodes, pods, and networks. Failures are configured via ChaosMonkey resources, and each monkey plays a specific role in injecting failures into the cluster. For example:

apiVersion: chaos.atomix.io/v1alpha1
kind: ChaosMonkey
metadata:
  name: crash-monkey
spec:
  rateSeconds: 60
  jitter: .5
  crash:
    crashStrategy:
      type: Container

Scheduling

The scheduling of periodic ChaosMonkey executions is managed by providing a rate at which the monkey runs and a period for which each fault lasts (see the example following the list):

  • rateSeconds - the number of seconds to wait between monkey runs
  • periodSeconds - the number of seconds for which to run a monkey, e.g. the amount of time for which to partition the network or stress a node
  • jitter - the amount of jitter to apply to the rate
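
A minimal scheduling sketch (the name and values below are illustrative) runs a partition monkey every 10 minutes, holds each partition for 2 minutes, and applies 50% jitter to the interval between runs:

apiVersion: chaos.atomix.io/v1alpha1
kind: ChaosMonkey
metadata:
  name: scheduled-partition-monkey
spec:
  rateSeconds: 600
  periodSeconds: 120
  jitter: .5
  partition:
    partitionStrategy:
      type: Isolate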

Selectors

Specific sets of pods can be selected using pod names, labels, or match expressions specified in the configured selector:

  • matchPods - a list of pod names on which to match
  • matchLabels - a map of label names and values on which to match pods
  • matchExpressions - label match expressions on which to match pods

Selector options can be added on a per-monkey basis:

apiVersion: chaos.atomix.io/v1alpha1
kind: ChaosMonkey
metadata:
  name: crash-monkey
spec:
  crash:
    crashStrategy:
      type: Pod
  selector:
    matchPods:
    - pod-1
    - pod-2
    - pod-3
    matchLabels:
      group: raft
    matchExpressions:
    - key: group
      operator: In
      values:
      - raft
      - data

Each monkey type is configured via a correspondingly named field in the ChaosMonkey spec:

  • crash
  • partition
  • stress

Crash monkey

The crash monkey can be used to inject node crashes into the cluster. To configure a crash monkey, use the crash configuration:

apiVersion: chaos.atomix.io/v1alpha1
kind: ChaosMonkey
metadata:
  name: crash-monkey
spec:
  rateSeconds: 60
  jitter: .5
  crash:
    crashStrategy:
      type: Container

The crash configuration supports a crashStrategy with the following options:

  • Container - kills the process running inside the container
  • Pod - deletes the Pod using the Kubernetes API

Partition monkey

The partition monkey can be used to cut off network communication between a set of pods. To use the partition monkey, selected pods must have iptables installed. To configure a partition monkey, use the partition configuration:

apiVersion: chaos.atomix.io/v1alpha1
kind: ChaosMonkey
metadata:
  name: partition-isolate-monkey
spec:
  rateSeconds: 600
  periodSeconds: 120
  partition:
    partitionStrategy:
      type: Isolate

The partition configuration supports a partitionStrategy with the following options (a Bridge example follows the list):

  • Isolate - isolates a single random node in the cluster from all other nodes
  • Halves - splits the cluster into two halves
  • Bridge - splits the cluster into two halves with a single bridge node able to communicate with each half (for testing consensus)
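
For instance, a Bridge partition might be configured as follows (the name and timing values are illustrative):

apiVersion: chaos.atomix.io/v1alpha1
kind: ChaosMonkey
metadata:
  name: partition-bridge-monkey
spec:
  rateSeconds: 600
  periodSeconds: 120
  partition:
    partitionStrategy:
      type: Bridge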

Stress monkey

The stress monkey uses a variety of tools to simulate stress on nodes and on the network. To configure a stress monkey, use the stress configuration:

apiVersion: chaos.atomix.io/v1alpha1
kind: ChaosMonkey
metadata:
  name: stress-cpu-monkey
spec:
  rateSeconds: 300
  periodSeconds: 300
  stress:
    stressStrategy:
      type: All
    cpu:
      workers: 2

The stress configuration supports a stressStrategy with the following options:

  • Random - applies stress options to a random pod
  • All - applies stress options to all pods in the cluster

The stress monkey supports a variety of types of stress using the stress tool:

  • cpu - spawns cpu.workers workers spinning on sqrt()
  • io - spawns io.workers workers spinning on sync()
  • memory - spawns memory.workers workers spinning on malloc()/free()
  • hdd - spawns hdd.workers workers spinning on write()/unlink()

For example, the following monkey applies all four stress types to a random pod:

apiVersion: chaos.atomix.io/v1alpha1
kind: ChaosMonkey
metadata:
  name: stress-all-monkey
spec:
  rateSeconds: 300
  periodSeconds: 300
  stress:
    stressStrategy:
      type: Random
    cpu:
      workers: 2
    io:
      workers: 2
    memory:
      workers: 4
    hdd:
      workers: 1

Additionally, network latency can be injected using the stress monkey via traffic control by providing a network stress configuration:

  • latencyMilliseconds - the amount of latency to inject in milliseconds
  • jitter - the jitter to apply to the latency
  • correlation - the correlation to apply to the latency
  • distribution - the delay distribution, either normal, pareto, or paretonormal

For example:

apiVersion: chaos.atomix.io/v1alpha1
kind: ChaosMonkey
metadata:
  name: stress-network-monkey
spec:
  rateSeconds: 300
  periodSeconds: 60
  stress:
    stressStrategy:
      type: All
    network:
      latencyMilliseconds: 500
      jitter: .5
      correlation: .25

Architecture

The chaos controller consists of two independent components, the controller and the workers, which run as containers in the k8s cluster.

Controller

The controller is the component responsible for monitoring the creation/deletion of ChaosMonkey resources, scheduling executions, and distributing tasks to workers. The controller typically runs as a Deployment. When multiple replicas are run, only a single replica will control the cluster at any given time.

When ChaosMonkey resources are created in the k8s cluster, the controller receives a notification and, in response, schedules a periodic background task to execute the monkey handler. The periodic task is configured based on the monkey configuration. When a monkey handler is executed, the controller filters pods using the monkey's configured selectors and passes the pods to the handler for execution. Monkey handlers then assign tasks to specific workers to carry out the specified chaos function.


Crash controller

The crash controller assigns tasks to workers via Crash resources. The Crash resource will indicate the podName of the pod to crash and the crashStrategy with which to crash the pod.
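
A Crash resource created by the controller might look roughly like the following; the exact spec layout and naming are assumptions based on the fields described here:

apiVersion: chaos.atomix.io/v1alpha1
kind: Crash
metadata:
  name: crash-monkey-pod-1   # assumed naming; the controller generates names
spec:
  podName: pod-1             # pod to crash, chosen by the monkey's selector
  crashStrategy:
    type: Container          # or Pod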

Partition controller

The partition controller assigns tasks to workers according to the configured partitionStrategy and uses the NetworkPartition resource to communicate details of the network partition to the workers. After determining the set of routes to cut off between pods, the controller creates a NetworkPartition for each source/destination pair.
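
Based on the podName and sourceName fields described under the partition workers below, a NetworkPartition for a single source/destination pair might look roughly like this (layout and naming are assumptions):

apiVersion: chaos.atomix.io/v1alpha1
kind: NetworkPartition
metadata:
  name: partition-pod-1-pod-2   # assumed naming
spec:
  podName: pod-1                # pod whose inbound traffic is dropped
  sourceName: pod-2             # source whose packets are dropped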


Stress controller

The stress controller assigns tasks to workers via Stress resources. The Stress resource will indicate the podName of the pod to stress and the mechanisms with which to stress the pod.
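
A Stress resource might look roughly like the following, reusing the stress options from the monkey configuration; the layout and naming are assumptions:

apiVersion: chaos.atomix.io/v1alpha1
kind: Stress
metadata:
  name: stress-pod-1   # assumed naming
spec:
  podName: pod-1       # pod to stress
  cpu:
    workers: 2
  network:
    latencyMilliseconds: 500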

Workers

Workers are the components responsible for injecting failures on specific k8s nodes. Like the controller, workers are a type of resource controller, but rather than managing the high level ChaosMonkey resources used to randomly inject failures, workers provide for the injection of pod-level failures in response to the creation of resources like Crash, NetworkPartition, or Stress.

In order to ensure a worker is assigned to each node and can inject failures into the OS, workers must be run in a DaemonSet and granted privileged access to k8s nodes.

Crash workers

Crash workers monitor k8s for the creation of Crash resources. When a Crash resource is detected, if the podName contained in the Crash belongs to the node on which the worker is running, the worker executes the crash. This ensures only one node attempts to execute a crash regardless of the method by which the crash is performed.

The method of execution of the crash depends on the configured crashStrategy. If the Pod strategy is used, the worker simply deletes the pod via the Kubernetes API. If the Container strategy is indicated, the worker locates the pod's container(s) via the Docker API and kills the containers directly.

Partition workers

Partition workers monitor k8s for the creation of NetworkPartition resources. Each NetworkPartition represents a link between two pods to be cut off while the resource is running. The worker configures the pod indicated by podName to drop packets from the configured sourceName. NetworkPartition resources may be in one of four phases:

  • started indicates the resource has been created but the pods are not yet partitioned
  • running indicates the pod has been partitioned
  • stopped indicates the partition has been stopped but the physical communication has not yet been restored
  • complete indicates communication between the pod and the source has been restored

When a worker receives notification of a NetworkPartition in the started phase, if the pod is running on the worker's node, the worker cuts off communication between the pod and the source by locating the virtual network interface for the pod and adding firewall rules to the host to drop packets received on the pod's virtual interface from the source IP. To restore communication with the source, the worker simply deletes the added firewall rules.
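
Conceptually, the added rule resembles the following iptables command; the chain, interface name, and source address are placeholders, and the exact rule format used by the worker may differ:

iptables -A FORWARD -i veth1234abcd -s 10.244.1.12 -j DROP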

Stress workers

Stress workers monitor k8s for the creation of Stress resources. When a Stress resource is detected, the worker on the node to which the stressed pod is assigned may perform several tasks to stress the desired pod.

For I/O, CPU, memory, and HDD stress, the worker will create a container in the pod's namespace to execute the stress tool. For each configured stress option, a separate container will be created to stress the pod. When the Stress resource is stopped, all stress containers will be stopped.
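
The containers run the stress utility; for the all-stress example above, the equivalent command line (assuming the standard stress flags, with the memory option mapping to --vm) is roughly:

stress --cpu 2 --io 2 --vm 4 --hdd 1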

For network stress, the host is configured using the traffic control utility to control the pod's virtual interfaces. When the Stress resource is stopped, the traffic control rule is deleted.
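
As a sketch, the rule for the network stress example above would resemble the following tc netem command; the interface name is a placeholder, and the mapping of the jitter and correlation values onto netem arguments is an assumption:

tc qdisc add dev veth1234abcd root netem delay 500ms 250ms 25%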