nri-plugins
[proto]: bolt a DRA driver frontend on the topology-aware policy.
This prototype patch set bolts a DRA allocation frontend on top of the existing topology-aware resource policy plugin. The main intentions of this patch set are to
- provide something practical to play around with for the feasibility study of enabling DRA-based CPU allocation,
- allow (relatively) easy experimentation with how to expose CPUs as DRA devices (in other words, to test various CPU DRA attributes),
- allow testing how DRA-based CPU allocation (using non-trivial CEL expressions) would scale with cluster and cluster node size.
Note: This patched NRI plugin, especially in its current state and form, is not a proposal for a first real DRA-based CPU driver.
If you want to play around with this (for instance modify the exposed CPU abstraction), the easiest way is to
- fork the main NRI Reference Plugins repo
- enable GitHub Actions in your personal fork
- make any changes you want (for instance, to alter the CPU abstraction, take a look at cpu.DRA())
- Push your changes to ssh://[email protected]/$YOUR_FORK/nri-plugins/refs/heads/test/build/dra-driver.
- Wait for the image and Helm chart publishing actions to succeed
- Once done, you can pull the result into your cluster with something like
helm install --devel -n kube-system test oci://ghcr.io/$YOUR_GITHUB_USERID/nri-plugins/helm-charts/nri-resource-policy-topology-aware --version v0.9-dra-driver-unstable
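The push and install steps above can be strung together roughly as follows. This is only a sketch: `FORK_REMOTE` (assumed to be a git remote that points at your fork), `GITHUB_USERID`, the `DRY_RUN` switch and the helper function names are illustrative assumptions, not part of the patch set. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the push-and-install steps. Defaults to a dry run that only
# prints the commands; set DRY_RUN=0 to actually execute them.
set -eu

FORK_REMOTE="${FORK_REMOTE:-origin}"                  # assumption: git remote of your fork
GITHUB_USERID="${GITHUB_USERID:-your-github-userid}"  # placeholder, replace with your own
BRANCH="test/build/dra-driver"                        # branch named in the steps above
CHART_VERSION="v0.9-dra-driver-unstable"              # version from the helm command above

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Push the current HEAD to the branch that the steps above ask for.
push_branch() {
    run git push "$FORK_REMOTE" "HEAD:refs/heads/$BRANCH"
}

# Install the published chart into the cluster.
install_chart() {
    run helm install --devel -n kube-system test \
        "oci://ghcr.io/$GITHUB_USERID/nri-plugins/helm-charts/nri-resource-policy-topology-aware" \
        --version "$CHART_VERSION"
}

push_branch
# Wait here until the image and Helm chart publishing actions have succeeded.
install_chart
```

Running it with `DRY_RUN=0` executes the commands for real, so do that only after checking the printed output and only once the publishing actions have finished.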
You can then test if things work with something like
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: any-cores
spec:
  spec:
    devices:
      requests:
      - name: cpu
        deviceClassName: native.cpu
---
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: p-cores
spec:
  spec:
    devices:
      requests:
      - name: cpu
        deviceClassName: native.cpu
        selectors:
        - cel:
            expression: device.attributes["native.cpu"].coreType == "P-core"
        count: 1
---
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: e-cores
spec:
  spec:
    devices:
      requests:
      - name: cpu
        deviceClassName: native.cpu
        selectors:
        - cel:
            expression: device.attributes["native.cpu"].coreType == "E-core"
        count: 1
---
apiVersion: v1
kind: Pod
metadata:
  name: pcore-test
  labels:
    app: pod
spec:
  containers:
  - name: ctr0
    image: busybox
    imagePullPolicy: IfNotPresent
    args:
      - /bin/sh
      - -c
      - trap 'exit 0' TERM; sleep 3600 & wait
    resources:
      requests:
        cpu: 1
        memory: 100M
      limits:
        cpu: 1
        memory: 100M
      claims:
      - name: claim-pcores
  resourceClaims:
  - name: claim-pcores
    resourceClaimTemplateName: p-cores
  terminationGracePeriodSeconds: 1
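To check the result, something along these lines might help. It is a sketch under a few assumptions: the manifests above are saved in a file called `dra-cpu-test.yaml` (a hypothetical name), the pod runs in the default namespace, and a 60 second timeout is enough. The `DRY_RUN` switch and the helper names are illustrative only. By default the script just prints the commands.

```shell
#!/bin/sh
# Sketch: apply the test manifests and inspect what the pod was given.
# Defaults to a dry run that only prints the commands; set DRY_RUN=0 to execute.
set -eu

MANIFEST="${MANIFEST:-dra-cpu-test.yaml}"  # hypothetical file holding the manifests above

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

check_claims() {
    # Create the claim templates and the test pod.
    run kubectl apply -f "$MANIFEST"
    # Wait for the pod to become ready (the timeout value is an arbitrary choice).
    run kubectl wait --for=condition=Ready pod/pcore-test --timeout=60s
    # List the ResourceClaims generated from the templates.
    run kubectl get resourceclaims
    # Show which CPUs the container is allowed to run on.
    run kubectl exec pcore-test -- grep Cpus_allowed_list /proc/self/status
}

check_claims
```

The last command prints the CPU set the container is confined to. Whether those CPUs are in fact P-cores has to be compared against the node's own CPU topology, which this sketch does not do.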