gpushare-scheduler-extender
Scheduling a Pod with the aliyun.com/gpu-mem resource fails: http://127.0.0.1:32766/gpushare-scheduler/filter context deadline exceeded
kubectl describe pod $pod
Events:
  Type     Reason            Age  From  Message
  Warning  FailedScheduling  64s        Post "http://127.0.0.1:32766/gpushare-scheduler/filter": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Warning  FailedScheduling  58s        Post "http://127.0.0.1:32766/gpushare-scheduler/filter": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Same here. Are there any solutions?
I found my problem. CoreDNS on the node where the scheduler is deployed was not running properly. After I got CoreDNS running correctly, the problem was resolved.
Is there a solution yet? After I finished deploying, scheduling also fails with: Post "http://127.0.0.1:32766/gpushare-scheduler/filter": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
I found the fix for my case. The problem was extenders.urlPrefix in the project's scheduler-policy-config.yaml. I changed
urlPrefix: "http://127.0.0.1:32766/gpushare-scheduler"
to
urlPrefix: "http://<gpushare-schd-extender-svc-clusterip>:32766/gpushare-scheduler"
and scheduling worked.
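Before editing urlPrefix, it may help to check which candidate address actually accepts connections from the node where kube-scheduler runs. The following is a minimal sketch, not part of gpushare-scheduler-extender: the function name `extender_reachable` and the 2-second timeout are assumptions made for illustration. It only tests that a TCP connection to the host and port in the URL can be opened, so a True result does not prove that the extender's filter endpoint answers correctly.

```python
import socket
from urllib.parse import urlsplit


def extender_reachable(url_prefix: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the host:port in url_prefix succeeds.

    Hypothetical helper for troubleshooting; it does not send an HTTP request.
    """
    parts = urlsplit(url_prefix)
    host = parts.hostname
    if host is None:
        raise ValueError(f"no host found in urlPrefix: {url_prefix!r}")
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # The default address from the thread; add other candidates to compare.
    candidates = ["http://127.0.0.1:32766/gpushare-scheduler"]
    for prefix in candidates:
        state = "reachable" if extender_reachable(prefix) else "unreachable"
        print(f"{prefix} -> {state}")
```

Run it on the control-plane node itself, since that is where the scheduler issues the request that times out.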
Hello, regarding the <gpushare-schd-extender-svc-clusterip> placeholder in your urlPrefix: does it mean the Service's IP, or should that string be entered literally?
It is the node's IP, pointing at the NodePort of gpushare-schd-extender.
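Taking the last reply at face value, the value to write is a node IP plus the extender's NodePort, not the literal placeholder text. A small sketch of how that string could be assembled and sanity-checked follows; the helper name `build_url_prefix` and the example address 192.168.1.10 are made up for illustration, and 32766 is simply the port that appears in the thread.

```python
import ipaddress


def build_url_prefix(node_ip: str, node_port: int = 32766) -> str:
    """Build an extender urlPrefix from a node IP and NodePort.

    Hypothetical helper: rejects placeholder text and out-of-range ports.
    """
    # ip_address raises ValueError for anything that is not a literal IP,
    # such as an unreplaced "<...>" placeholder.
    ip = ipaddress.ip_address(node_ip)
    if not 1 <= node_port <= 65535:
        raise ValueError(f"port out of range: {node_port}")
    # IPv6 literals must be wrapped in brackets inside a URL.
    host = f"[{ip}]" if ip.version == 6 else str(ip)
    return f"http://{host}:{node_port}/gpushare-scheduler"


if __name__ == "__main__":
    # Example address only; substitute the IP of one of your own nodes.
    print(build_url_prefix("192.168.1.10"))
```

Passing the unreplaced placeholder raises ValueError, which catches the mistake the question above was worried about.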