crane-scheduler
crane-scheduler copied to clipboard
Crane scheduler is a Kubernetes scheduler which can schedule pod based on actual node load.
Crane-scheduler
Overview
Crane-scheduler is a collection of scheduler plugins based on scheduler framework, including:
- Dynamic scheduler: a load-aware scheduler plugin
Get Started
1. Install Prometheus
Make sure your kubernetes cluster has Prometheus installed. If not, please refer to Install Prometheus.
2. Configure Prometheus Rules
Configure the rules of Prometheus to get expected aggregated data:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: example-record
labels:
prometheus: k8s
role: alert-rules
spec:
groups:
- name: cpu_mem_usage_active
interval: 30s
rules:
- record: cpu_usage_active
expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[30s])) * 100)
- record: mem_usage_active
expr: 100*(1-node_memory_MemAvailable_bytes/node_memory_MemTotal_bytes)
- name: cpu-usage-5m
interval: 5m
rules:
- record: cpu_usage_max_avg_1h
expr: max_over_time(cpu_usage_avg_5m[1h])
- record: cpu_usage_max_avg_1d
expr: max_over_time(cpu_usage_avg_5m[1d])
- name: cpu-usage-1m
interval: 1m
rules:
- record: cpu_usage_avg_5m
expr: avg_over_time(cpu_usage_active[5m])
- name: mem-usage-5m
interval: 5m
rules:
- record: mem_usage_max_avg_1h
expr: max_over_time(mem_usage_avg_5m[1h])
- record: mem_usage_max_avg_1d
expr: max_over_time(mem_usage_avg_5m[1d])
- name: mem-usage-1m
interval: 1m
rules:
- record: mem_usage_avg_5m
expr: avg_over_time(mem_usage_active[5m])
⚠️Troubleshooting: The sampling interval of Prometheus must be less than 30 seconds, otherwise the above rules(such as cpu_usage_active) may not take effect.
3. Install Crane-scheduler
There are two options:
- Install Crane-scheduler as a second scheduler:
helm repo add crane https://gocrane.github.io/helm-charts helm install scheduler -n crane-system --create-namespace --set global.prometheusAddr="REPLACE_ME_WITH_PROMETHEUS_ADDR" crane/scheduler
- Replace native Kube-scheduler with Crane-scheduler:
- Backup
/etc/kubernetes/manifests/kube-scheduler.yaml
cp /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/
- Modify configfile of kube-scheduler(
scheduler-config.yaml
) to enable Dynamic scheduler plugin and configure plugin args:
apiVersion: kubescheduler.config.k8s.io/v1beta2 kind: KubeSchedulerConfiguration ... profiles: - schedulerName: default-scheduler plugins: filter: enabled: - name: Dynamic score: enabled: - name: Dynamic weight: 3 pluginConfig: - name: Dynamic args: policyConfigPath: /etc/kubernetes/policy.yaml ...
- Create
/etc/kubernetes/policy.yaml
, using as scheduler policy of Dynamic plugin:
apiVersion: scheduler.policy.crane.io/v1alpha1 kind: DynamicSchedulerPolicy spec: syncPolicy: ##cpu usage - name: cpu_usage_avg_5m period: 3m - name: cpu_usage_max_avg_1h period: 15m - name: cpu_usage_max_avg_1d period: 3h ##memory usage - name: mem_usage_avg_5m period: 3m - name: mem_usage_max_avg_1h period: 15m - name: mem_usage_max_avg_1d period: 3h predicate: ##cpu usage - name: cpu_usage_avg_5m maxLimitPecent: 0.65 - name: cpu_usage_max_avg_1h maxLimitPecent: 0.75 ##memory usage - name: mem_usage_avg_5m maxLimitPecent: 0.65 - name: mem_usage_max_avg_1h maxLimitPecent: 0.75 priority: ##cpu usage - name: cpu_usage_avg_5m weight: 0.2 - name: cpu_usage_max_avg_1h weight: 0.3 - name: cpu_usage_max_avg_1d weight: 0.5 ##memory usage - name: mem_usage_avg_5m weight: 0.2 - name: mem_usage_max_avg_1h weight: 0.3 - name: mem_usage_max_avg_1d weight: 0.5 hotValue: - timeRange: 5m count: 5 - timeRange: 1m count: 2
- Modify
kube-scheduler.yaml
and replace kube-scheduler image with Crane-scheduler:
... image: docker.io/gocrane/crane-scheduler:0.0.23 ...
- Install crane-scheduler-controller:
kubectl apply ./deploy/controller/rbac.yaml && kubectl apply -f ./deploy/controller/deployment.yaml
- Backup
4. Schedule Pods With Crane-scheduler
Test Crane-scheduler with following example:
apiVersion: apps/v1
kind: Deployment
metadata:
name: cpu-stress
spec:
selector:
matchLabels:
app: cpu-stress
replicas: 1
template:
metadata:
labels:
app: cpu-stress
spec:
schedulerName: crane-scheduler
hostNetwork: true
tolerations:
- key: node.kubernetes.io/network-unavailable
operator: Exists
effect: NoSchedule
containers:
- name: stress
image: docker.io/gocrane/stress:latest
command: ["stress", "-c", "1"]
resources:
requests:
memory: "1Gi"
cpu: "1"
limits:
memory: "1Gi"
cpu: "1"
Note: Change
crane-scheduler
todefault-scheduler
ifcrane-scheduler
is used as default.
There will be the following event if the test pod is successfully scheduled:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 28s crane-scheduler Successfully assigned default/cpu-stress-7669499b57-zmrgb to vm-162-247-ubuntu
Compatibility Matrix
Scheduler Image Version | Supported Kubernetes Version |
---|---|
0.0.23 | >=1.22.0 |
0.0.20 | >=1.18.0 |
The default scheudler image version is 0.0.23
, and you can run the following command for quick replacement:
KUBE_EDITOR="sed -i 's/v1beta2/v1beta1/g'" kubectl edit cm scheduler-config -n crane-system && KUBE_EDITOR="sed -i 's/0.0.23/0.0.20/g'" kubectl edit deploy crane-scheduler -n crane-system