crane-scheduler
binding rejected: running Bind plugin "DefaultBinder": Operation cannot be fulfilled on pods/binding
Scheduling failed
I1121 03:06:27.345666 1 plugins.go:92] [crane] Node[dev-monitoring]'s finalscore is 69, while score is 69 and hotvalue is 0.000000
I1121 03:06:27.345752 1 plugins.go:92] [crane] Node[dev-qchen]'s finalscore is 81, while score is 81 and hotvalue is 0.000000
I1121 03:06:27.345751 1 plugins.go:92] [crane] Node[bqdev02]'s finalscore is 72, while score is 72 and hotvalue is 0.000000
I1121 03:06:27.345775 1 plugins.go:92] [crane] Node[bqdev01]'s finalscore is 85, while score is 85 and hotvalue is 0.000000
I1121 03:06:27.345780 1 plugins.go:92] [crane] Node[bqdev03]'s finalscore is 74, while score is 74 and hotvalue is 0.000000
I1121 03:06:27.345787 1 plugins.go:92] [crane] Node[dev-node4]'s finalscore is 67, while score is 67 and hotvalue is 0.000000
I1121 03:06:27.345797 1 plugins.go:92] [crane] Node[dev-xyli]'s finalscore is 67, while score is 67 and hotvalue is 0.000000
I1121 03:06:27.345790 1 plugins.go:92] [crane] Node[dev-master3]'s finalscore is 79, while score is 79 and hotvalue is 0.000000
I1121 03:06:27.345810 1 plugins.go:92] [crane] Node[dev-node5]'s finalscore is 83, while score is 83 and hotvalue is 0.000000
I1121 03:06:27.345821 1 plugins.go:92] [crane] Node[dev-whliao]'s finalscore is 73, while score is 73 and hotvalue is 0.000000
E1121 03:06:27.358217 1 framework.go:1000] "Failed running Bind plugin" err="Operation cannot be fulfilled on pods/binding \"cpu-stress-59f8597545-7bdrq\": pod cpu-stress-59f8597545-7bdrq is already assigned to node \"dev-node5\"" plugin="DefaultBinder" pod="crane-system/cpu-stress-59f8597545-7bdrq"
E1121 03:06:27.358235 1 scheduler.go:610] "scheduler cache ForgetPod failed" err="pod c2cae006-2ae2-4ca6-b2f6-6af43faaa972 wasn't assumed so cannot be forgotten"
E1121 03:06:27.358250 1 factory.go:225] "Error scheduling pod; retrying" err="binding rejected: running Bind plugin \"DefaultBinder\": Operation cannot be fulfilled on pods/binding \"cpu-stress-59f8597545-7bdrq\": pod cpu-stress-59f8597545-7bdrq is already assigned to node \"dev-node5\"" pod="crane-system/cpu-stress-59f8597545-7bdrq"
I1121 03:06:27.358258 1 factory.go:238] "Pod has been assigned to node. Abort adding it back to queue." pod="crane-system/cpu-stress-59f8597545-7bdrq" node="dev-node5"
I'm running into scheduling failures. Could anyone suggest how to troubleshoot this?
Environment information:
# helm list -n crane-system
NAME       NAMESPACE     REVISION  UPDATED                                 STATUS    CHART            APP VERSION
scheduler  crane-system  3         2023-11-21 10:12:59.63159177 +0800 CST  deployed  scheduler-0.2.2  0.2.2
# helm get values -n crane-system scheduler
USER-SUPPLIED VALUES:
controller:
  enable: true
  image:
    repository: dockerhub.bigquant.ai:5000/aipaas-devops/3rdparty/docker.io/gocrane/crane-scheduler-controller
    tag: 0.0.24
  name: crane-scheduler-controller
  replicaCount: 3
global:
  prometheusAddr: http://kube-prometheus-kube-prome-prometheus.monitoring.svc.cluster.local:9090
scheduler:
  enable: true
  image:
    repository: dockerhub.bigquant.ai:5000/aipaas-devops/3rdparty/docker.io/gocrane/crane-scheduler
    tag: 0.0.23
  name: crane-scheduler
  replicaCount: 3
# kubectl version
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.10", GitCommit:"e770bdbb87cccdc2daa790ecd69f40cf4df3cc9d", GitTreeState:"clean", BuildDate:"2023-05-17T14:12:20Z", GoVersion:"go1.19.9", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.10", GitCommit:"e770bdbb87cccdc2daa790ecd69f40cf4df3cc9d", GitTreeState:"clean", BuildDate:"2023-05-17T14:06:35Z", GoVersion:"go1.19.9", Compiler:"gc", Platform:"linux/amd64"}
The scheduler does not support running multiple replicas.
same issue:
https://github.com/gocrane/crane-scheduler/issues/28
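If running several scheduler replicas is indeed the cause, as the reply above suggests, one way to test it is to drop the scheduler to a single replica. The fragment below is only a sketch of a values override: it reuses the `scheduler.replicaCount` key from the values reported in this thread and assumes the chart applies it to the scheduler Deployment. The file name `values-override.yaml` is hypothetical.

```yaml
# values-override.yaml (hypothetical file name)
# Assumption: the chart maps scheduler.replicaCount to the scheduler
# Deployment's replica count, as the reported values suggest.
scheduler:
  replicaCount: 1   # was 3 in the reported values
```

With a single scheduler instance, no second instance can try to bind a pod that has already been assigned, which is what the "already assigned to node" error in the log shows.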
If crane-scheduler was installed by replacing the default kube-scheduler, set the scheduler command-line flag --leader-elect=true in /etc/kubernetes/manifests/kube-scheduler.yaml. For reference, see https://github.com/gocrane/crane-scheduler/blob/c2c05338a5d75c0a6d92bd16a1cf257b48b30ef8/deploy/scheduler/deployment.yaml#L33
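As a rough illustration of that change, the relevant part of the static pod manifest might look like the fragment below. This is a sketch, not the full file: the container name is an assumption, and the existing binary and other flags are deliberately left out and should stay as they are in your manifest.

```yaml
# /etc/kubernetes/manifests/kube-scheduler.yaml (fragment only)
spec:
  containers:
  - name: kube-scheduler   # assumed container name; keep whatever your manifest uses
    command:
    # ... keep the existing binary and flags unchanged ...
    # With leader election on, only the elected instance schedules and binds pods;
    # the other instances wait on standby.
    - --leader-elect=true
```

The kubelet watches the manifests directory, so it recreates the scheduler pod once the file is saved.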