gpu-manager
Pod status is UnexpectedAdmissionError
The YAML for the pod is as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: vcuda-test
    qcloud-app: vcuda-test
  name: vcuda-test
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: vcuda-test
  template:
    metadata:
      labels:
        k8s-app: vcuda-test
        qcloud-app: vcuda-test
    spec:
      containers:
      - command:
        - sleep
        - 360000s
        env:
        - name: PATH
          value: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
        image: ccr.ccs.tencentyun.com/menghe/tensorflow-gputest:0.2
        imagePullPolicy: IfNotPresent
        name: tensorflow-test
        resources:
          limits:
            cpu: "1"
            memory: 8Gi
            tencent.com/vcuda-core: "50"
            # tencent.com/vcuda-memory: "32"
          requests:
            cpu: "1"
            memory: 8Gi
            tencent.com/vcuda-core: "50"
            # tencent.com/vcuda-memory: "32"
When I comment out the vcuda-memory line and apply the manifest, the pod's status becomes UnexpectedAdmissionError. Running kubectl describe pod gives the following:
Reason: UnexpectedAdmissionError
Message: Pod Allocate failed due to rpc error: code = Unknown desc = candidate pod not found for request &AllocateRequest{ContainerRequests:[&ContainerAllocateRequest{DevicesIDs:[tencent.com/vcuda-core-67 tencent.com/vcuda-core-198 tencent.com/vcuda-core-41 tencent.com/vcuda-core-60 tencent.com/vcuda-core-1 tencent.com/vcuda-core-94 tencent.com/vcuda-core-180 tencent.com/vcuda-core-132 tencent.com/vcuda-core-126 tencent.com/vcuda-core-152 tencent.com/vcuda-core-165 tencent.com/vcuda-core-101 tencent.com/vcuda-core-169 tencent.com/vcuda-core-183 tencent.com/vcuda-core-50 tencent.com/vcuda-core-159 tencent.com/vcuda-core-19 tencent.com/vcuda-core-113 tencent.com/vcuda-core-184 tencent.com/vcuda-core-64 tencent.com/vcuda-core-56 tencent.com/vcuda-core-195 tencent.com/vcuda-core-109 tencent.com/vcuda-core-193 tencent.com/vcuda-core-71 tencent.com/vcuda-core-37 tencent.com/vcuda-core-142 tencent.com/vcuda-core-123 tencent.com/vcuda-core-122 tencent.com/vcuda-core-4 tencent.com/vcuda-core-86 tencent.com/vcuda-core-168 tencent.com/vcuda-core-59 tencent.com/vcuda-core-93 tencent.com/vcuda-core-166 tencent.com/vcuda-core-128 tencent.com/vcuda-core-145 tencent.com/vcuda-core-53 tencent.com/vcuda-core-102 tencent.com/vcuda-core-12 tencent.com/vcuda-core-173 tencent.com/vcuda-core-30 tencent.com/vcuda-core-90 tencent.com/vcuda-core-0 tencent.com/vcuda-core-117 tencent.com/vcuda-core-105 tencent.com/vcuda-core-108 tencent.com/vcuda-core-148 tencent.com/vcuda-core-172 tencent.com/vcuda-core-161],}],}, allocation failed, which is unexpected
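For comparison, the variant the report implies does not hit this error is the same container with the vcuda-memory lines uncommented. The fragment below is only a sketch of that resources section, reusing the value "32" from the commented-out lines above. It assumes that gpu-manager expects tencent.com/vcuda-memory to be requested alongside tencent.com/vcuda-core; this has not been confirmed here. Note that the 50 device IDs in the failed AllocateRequest correspond to the vcuda-core: "50" request, so the core request itself did reach the device plugin.

```yaml
# Sketch only: resources section of the tensorflow-test container
# with vcuda-memory restored (value taken from the commented-out lines).
resources:
  limits:
    cpu: "1"
    memory: 8Gi
    tencent.com/vcuda-core: "50"
    tencent.com/vcuda-memory: "32"
  requests:
    cpu: "1"
    memory: 8Gi
    tencent.com/vcuda-core: "50"
    tencent.com/vcuda-memory: "32"
```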