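vGPU: GPU memory allocations get mixed up when pods are scheduled concurrently
Running the command below schedules two pods at the same time, one requesting 24576M of GPU memory and the other 600M. Once both pods are up, running nvidia-smi inside each container shows that the allocations are swapped: container ubuntu-container-24576 is given 600M of GPU memory, and container ubuntu-container-600 is given 24576M.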
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-1v-24576-1
spec:
  schedulerName: volcano
  containers:
    - name: ubuntu-container-24576
      image: ubuntu:18.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          volcano.sh/vgpu-number: 1 # requesting 1 vGPU
          volcano.sh/vgpu-memory: 24576
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-1v-600-1
spec:
  schedulerName: volcano
  containers:
    - name: ubuntu-container-600
      image: ubuntu:18.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          volcano.sh/vgpu-number: 1 # requesting 1 vGPU
          volcano.sh/vgpu-memory: 600
EOF
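
To make the check repeatable, the manual nvidia-smi inspection described above could be scripted roughly as follows. This is only a sketch: the helper names vgpu_total_mib and check_vgpu_memory are hypothetical, and it assumes that nvidia-smi is available inside the containers and that the total memory it reports equals the requested volcano.sh/vgpu-memory value exactly.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing requested vs. reported vGPU memory.

# Print the total GPU memory (in MiB, digits only) that nvidia-smi reports
# inside a container. Usage: vgpu_total_mib <pod> <container>
vgpu_total_mib() {
  kubectl exec "$1" -c "$2" -- \
    nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits \
    | head -n 1 | tr -d '[:space:]'
}

# Compare the reported memory with the requested value.
# Assumption: the reported total matches the request exactly.
# Usage: check_vgpu_memory <pod> <container> <requested-MiB>
check_vgpu_memory() {
  local actual
  actual="$(vgpu_total_mib "$1" "$2")"
  if [ "$actual" = "$3" ]; then
    echo "OK: $2 requested=$3 reported=$actual"
  else
    echo "MISMATCH: $2 requested=$3 reported=$actual"
    return 1
  fi
}

# Example invocations for the two pods in this report:
#   check_vgpu_memory gpu-pod-1v-24576-1 ubuntu-container-24576 24576
#   check_vgpu_memory gpu-pod-1v-600-1 ubuntu-container-600 600
```

If the allocations are swapped as described, both invocations would be expected to print a MISMATCH line and return a non-zero status.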