aws-virtual-gpu-device-plugin
recreated pod doesn't get GPU memory
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: objectdetection-deployment
  namespace: iinumbers
  labels:
    deployment: objectdetection-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: objectdetection-server
  template:
    metadata:
      labels:
        app: objectdetection-server
    spec:
      # hostIPC is required for MPS communication
      hostIPC: true
      containers:
        - name: objectdetection-container
          image: 192.168.99.150:5000/objectdetectionapp:v0.1.0
          env:
            - name: CALLBACK_URL
              value: "http://facerecognition-service:8001"
            - name: PER_PROCESS_GPU_MEMORY_FRACTION
              value: "0.23"
          ports:
            - containerPort: 8888
          # Use virtual GPU resource here
          resources:
            limits:
              k8s.amazonaws.com/vgpu: 1
          volumeMounts:
            - name: nvidia-mps
              mountPath: /tmp/nvidia-mps
      volumes:
        - name: nvidia-mps
          hostPath:
            path: /tmp/nvidia-mps
---
apiVersion: v1
kind: Service
metadata:
  labels:
    service: objectdetection-service
  name: objectdetection-service
  namespace: iinumbers
spec:
  ports:
    - port: 8888
      targetPort: 8888
      nodePort: 30000
  selector:
    app: objectdetection-server
  type: NodePort
```
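For reference, the container is expected to consume the `PER_PROCESS_GPU_MEMORY_FRACTION` variable set in the Deployment and hand it to the ML framework. A minimal sketch of how application code might read and validate it (the helper name `gpu_memory_fraction` is mine, not part of the plugin; the TensorFlow 1.x usage is shown only in the docstring):

```python
import os

def gpu_memory_fraction(default=0.23):
    """Read the per-process GPU memory fraction from the environment.

    The Deployment above sets PER_PROCESS_GPU_MEMORY_FRACTION="0.23".
    Application code would typically pass the result to its framework,
    e.g. in TensorFlow 1.x:

        config = tf.ConfigProto()
        config.gpu_options.per_process_gpu_memory_fraction = fraction
    """
    raw = os.environ.get("PER_PROCESS_GPU_MEMORY_FRACTION", str(default))
    fraction = float(raw)
    # A fraction outside (0, 1] would be silently wrong at the
    # framework level, so fail fast here instead.
    if not 0.0 < fraction <= 1.0:
        raise ValueError(f"fraction must be in (0, 1], got {fraction}")
    return fraction
```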
The YAML above defines the Deployment and the Service. I set `replicas: 2` on the Deployment. However, when I deleted one of the pods, the replacement pod that Kubernetes created did not get any GPU memory, judging by the output of `nvidia-smi`.
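One thing worth checking inside the recreated pod is whether the MPS pipe directory mounted from the host is actually populated; if the MPS control daemon on the node is not running (or was restarted), the mount can exist but be empty, and CUDA clients will not attach to MPS. A small diagnostic sketch (the helper name `check_mps_pipe` is mine, and this is an assumption about the failure mode, not a confirmed diagnosis):

```python
import os

def check_mps_pipe(path="/tmp/nvidia-mps"):
    """Return True if the MPS pipe directory exists and is non-empty.

    The MPS control daemon creates files (such as 'control') under this
    directory; an existing but empty mount usually suggests the daemon
    on the host is not running.
    """
    return os.path.isdir(path) and len(os.listdir(path)) > 0
```

Running this (or simply `ls /tmp/nvidia-mps`) in both the surviving pod and the recreated pod would show whether the new pod sees the same MPS pipe as the old one.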