crane-scheduler
Installed crane-scheduler via Helm as a second scheduler; a test pod from the official example is never scheduled and stays "Pending":
1. Deployment YAML:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-stress
spec:
  selector:
    matchLabels:
      app: cpu-stress
  replicas: 1
  template:
    metadata:
      labels:
        app: cpu-stress
    spec:
      schedulerName: crane-scheduler
      hostNetwork: true
      tolerations:
      - key: node.kubernetes.io/network-unavailable
        operator: Exists
        effect: NoSchedule
      containers:
      - name: stress
        image: docker.io/gocrane/stress:latest
        command: ["stress", "-c", "1"]
        resources:
          requests:
            memory: "1Gi"
            cpu: "1"
          limits:
            memory: "1Gi"
            cpu: "1"
```
2. Pod details:
```
Name:       cpu-stress-cc8656b6c-b5hhz
Namespace:  default
Priority:   0
Node:
```
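The describe output above cuts off before the Events section, which is usually where a Pending pod explains itself. A quick way to pull just that part (pod name taken from above):

```sh
# Show scheduling events for the stuck pod; an empty Events section usually
# means no scheduler with the requested schedulerName ever picked it up.
kubectl describe pod cpu-stress-cc8656b6c-b5hhz | grep -A 10 "Events"
kubectl get events --field-selector involvedObject.name=cpu-stress-cc8656b6c-b5hhz
```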
Please check the status of the crane-scheduler pods first: are they Running?
```
$ kubectl get pods -n crane-system
NAME                                          READY   STATUS    RESTARTS   AGE
crane-scheduler-b84489958-6jdj6               1/1     Running   0          4d1h
crane-scheduler-controller-6987688d8d-6wr7c   1/1     Running   0          4d1h
```
Confirmed again that both pods are Running.
I don't see anything abnormal in the logs. You could try setting the pod's `schedulerName` to empty and check whether the default scheduler can schedule it.
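One way to try that without editing the manifest by hand, as a sketch assuming the cpu-stress Deployment from above in the default namespace:

```sh
# Drop schedulerName from the pod template so the default scheduler takes over.
kubectl patch deployment cpu-stress --type=json \
  -p='[{"op": "remove", "path": "/spec/template/spec/schedulerName"}]'

# Watch whether the replacement pod gets scheduled now.
kubectl get pods -l app=cpu-stress -w
```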
Tested it: the default scheduler has no problem and schedules the pod normally.
Could you post the complete logs, including both crane-scheduler-controller-6987688d8d-6wr7c and crane-scheduler-b84489958-6jdj6?
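For reference, the logs in question can be captured with (pod names taken from the `kubectl get pods` output above):

```sh
kubectl logs -n crane-system crane-scheduler-b84489958-6jdj6 > crane-scheduler.log
kubectl logs -n crane-system crane-scheduler-controller-6987688d8d-6wr7c > crane-scheduler-controller.log
```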
Hit the same problem; the Kubernetes version is 1.27. The scheduler keeps reporting errors like these (the same warning/error pair repeats continuously):

```
E0905 05:42:20.346742       1 reflector.go:138] pkg/mod/k8s.io/client-go@v.../tools/cache/reflector.go:167: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
W0905 05:43:01.852683       1 reflector.go:324] pkg/mod/k8s.io/client-go@v.../tools/cache/reflector.go:167: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
E0905 05:43:01.852729       1 reflector.go:138] pkg/mod/k8s.io/client-go@v.../tools/cache/reflector.go:167: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
W0905 05:43:34.262887       1 reflector.go:324] pkg/mod/k8s.io/client-go@v.../tools/cache/reflector.go:167: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
E0905 05:43:34.262932       1 reflector.go:138] pkg/mod/k8s.io/client-go@v.../tools/cache/reflector.go:167: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
...
W0905 05:47:24.823783       1 reflector.go:324] pkg/mod/k8s.io/client-go@v.../tools/cache/reflector.go:167: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
E0905 05:47:24.823828       1 reflector.go:138] pkg/mod/k8s.io/client-go@v.../tools/cache/reflector.go:167: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
```

Could someone please take a look? Thanks.
This is probably a compatibility issue with newer Kubernetes versions. Clusters below 1.25 are fine at the moment; higher versions may need extra support.
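For context: CSIStorageCapacity graduated to storage.k8s.io/v1 in Kubernetes 1.24, and the v1beta1 version was removed in 1.27, which matches the "server could not find the requested resource" errors above. You can confirm what your API server still serves with:

```sh
# List the storage.k8s.io versions this cluster serves; on 1.27+ only v1
# remains, so a client built against v1beta1 CSIStorageCapacity fails.
kubectl api-versions | grep storage.k8s.io
kubectl api-resources --api-group=storage.k8s.io
```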
OK, thanks.
My Kubernetes version is 1.20.7 and I am using crane-scheduler image 0.0.20 as a second scheduler. The nodes' annotations already contain the aggregated metrics. When I create a new pod to test scheduling, it stays Pending.
crane-scheduler logs:
```
I1018 14:19:17.775925       1 serving.go:331] Generated self-signed cert in-memory
W1018 14:19:18.105223       1 options.go:330] Neither --kubeconfig nor --master was specified. Using default API client. This might not work.
W1018 14:19:18.116946       1 authorization.go:47] Authorization is disabled
W1018 14:19:18.116959       1 authentication.go:40] Authentication is disabled
I1018 14:19:18.116979       1 deprecated_insecure_serving.go:51] Serving healthz insecurely on [::]:10251
I1018 14:19:18.119411       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1018 14:19:18.119430       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1018 14:19:18.119461       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1018 14:19:18.119469       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1018 14:19:18.119489       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1018 14:19:18.119498       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1018 14:19:18.119562       1 secure_serving.go:197] Serving securely on [::]:10259
I1018 14:19:18.119635       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I1018 14:19:18.219523       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1018 14:19:18.219544       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1018 14:19:18.219982       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I1018 14:19:18.320414       1 leaderelection.go:243] attempting to acquire leader lease kube-system/kube-scheduler...
```
crane-scheduler-controller logs:
```
I1018 22:16:26.114291       1 node.go:75] Finished syncing node event "kube-node-02/mem_usage_avg_5m" (277.013756ms)
I1018 22:19:25.740401       1 node.go:75] Finished syncing node event "kube-node-02/mem_usage_avg_5m" (34.500361ms)
I1018 22:19:25.764618       1 node.go:75] Finished syncing node event "kube-master-01/mem_usage_avg_5m" (24.178999ms)
I1018 22:19:25.798566       1 node.go:75] Finished syncing node event "kube-node-01/mem_usage_avg_5m" (33.90647ms)
I1018 22:19:25.826773       1 node.go:75] Finished syncing node event "kube-node-02/cpu_usage_avg_5m" (28.169613ms)
I1018 22:19:25.848814       1 node.go:75] Finished syncing node event "kube-master-01/cpu_usage_avg_5m" (22.005738ms)
I1018 22:19:26.117118       1 node.go:75] Finished syncing node event "kube-node-01/cpu_usage_avg_5m" (268.264709ms)
I1018 22:22:25.737763       1 node.go:75] Finished syncing node event "kube-node-01/mem_usage_avg_5m" (32.338992ms)
I1018 22:22:25.765262       1 node.go:75] Finished syncing node event "kube-node-02/mem_usage_avg_5m" (27.45828ms)
I1018 22:22:25.794327       1 node.go:75] Finished syncing node event "kube-master-01/mem_usage_avg_5m" (29.029129ms)
I1018 22:22:25.818029       1 node.go:75] Finished syncing node event "kube-node-02/cpu_usage_avg_5m" (23.666818ms)
I1018 22:22:25.841672       1 node.go:75] Finished syncing node event "kube-master-01/cpu_usage_avg_5m" (23.603915ms)
I1018 22:22:26.125154       1 node.go:75] Finished syncing node event "kube-node-01/cpu_usage_avg_5m" (283.438566ms)
```
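Worth noting: the scheduler log above ends at "attempting to acquire leader lease kube-system/kube-scheduler...", i.e. this instance is blocked competing with the default scheduler for the same lock. One way to see who holds it (assuming the lease resource lock is in use):

```sh
# Inspect the kube-scheduler lease; if the default scheduler holds it,
# a second scheduler with leader election enabled will wait forever.
kubectl get lease -n kube-system kube-scheduler -o yaml
```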
Possibly leader election was not disabled for the second scheduler. The scheduler installed from the Helm chart disables leader election; see: https://github.com/gocrane/helm-charts/blob/main/charts/scheduler/templates/scheduler-deployment.yaml#L23
Confirmed: it was indeed caused by leader election not being disabled on the second scheduler. But it was not the leader election in scheduler-deployment.yaml; it was the leaderElection in scheduler-configmap.yaml that had not been turned off.
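So for anyone else hitting this: the setting that matters is `leaderElection.leaderElect` inside the KubeSchedulerConfiguration carried by the scheduler's ConfigMap, not only the Deployment flag. A sketch (the ConfigMap and Deployment names here are assumptions; check your release):

```sh
# Hypothetical resource names; list yours with: kubectl get cm,deploy -n crane-system
kubectl edit configmap -n crane-system crane-scheduler-config
# in scheduler-config.yaml set:
#   leaderElection:
#     leaderElect: false
# then restart the scheduler so it reloads the config:
kubectl rollout restart deployment -n crane-system crane-scheduler
```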
My Kubernetes version is 1.22.12 and I am using crane-scheduler image scheduler-0.2.2 as a second scheduler. The nodes' annotations already contain the aggregated metrics, and I have set leaderElect to false, but when I create a new pod to test scheduling it still stays Pending. Pod info:
```
Events:
  Type     Reason            Age   From             Message
  ----     ------            ----  ----             -------
  Warning  FailedScheduling  15s   crane-scheduler  0/1 nodes are available: 1 Insufficient cpu.
```
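Note that this event comes from the ordinary resource-fit filter rather than the Dynamic plugin: the pod requests a full CPU and the single node (0/1) cannot fit it. Comparing the node's allocatable CPU with what is already requested should confirm it:

```sh
# Check how much CPU is allocatable vs. already requested on the node
# (node name "master" taken from the controller logs below).
kubectl describe node master | grep -A 6 "Allocatable"
kubectl describe node master | grep -A 10 "Allocated resources"
```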
leaderElection config:
```yaml
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  scheduler-config.yaml: |
    apiVersion: kubescheduler.config.k8s.io/v1beta2
    kind: KubeSchedulerConfiguration
    leaderElection:
      leaderElect: false
    profiles:
    - schedulerName: crane-scheduler
      plugins:
        filter:
          enabled:
          - name: Dynamic
        score:
          enabled:
          - name: Dynamic
            weight: 3
```
crane-scheduler logs:
```
I1226 09:47:56.595597 1 serving.go:348] Generated self-signed cert in-memory
W1226 09:47:57.035592 1 client_config.go:617] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I1226 09:47:57.041561 1 server.go:139] "Starting Kubernetes Scheduler" version="v0.0.0-master+$Format:%H$"
I1226 09:47:57.044642 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1226 09:47:57.044658 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1226 09:47:57.044666 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I1226 09:47:57.044679 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1226 09:47:57.044699 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I1226 09:47:57.044715 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1226 09:47:57.045160 1 secure_serving.go:200] Serving securely on [::]:10259
I1226 09:47:57.045218 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I1226 09:47:57.145093 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1226 09:47:57.145152 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I1226 09:47:57.145100 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
```
crane-scheduler-controller logs:
```
root@master:/home/ubuntu/kube-prometheus/manifests# kubectl logs -n crane-system crane-scheduler-controller-6f6b94c8f7-79vff
I1226 17:47:56.187263 1 server.go:61] Starting Controller version v0.0.0-master+$Format:%H$
I1226 17:47:56.188316 1 leaderelection.go:248] attempting to acquire leader lease crane-system/crane-scheduler-controller...
I1226 17:48:12.646241 1 leaderelection.go:258] successfully acquired lease crane-system/crane-scheduler-controller
I1226 17:48:12.747072 1 controller.go:72] Caches are synced for controller
I1226 17:48:12.747174 1 node.go:46] Start to reconcile node events
I1226 17:48:12.747208 1 event.go:30] Start to reconcile EVENT events
I1226 17:48:12.773420 1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (26.154965ms)
I1226 17:48:12.794854 1 node.go:75] Finished syncing node event "master/cpu_usage_max_avg_1h" (21.278461ms)
I1226 17:48:12.818035 1 node.go:75] Finished syncing node event "master/cpu_usage_max_avg_1d" (23.146517ms)
I1226 17:48:12.837222 1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (19.151134ms)
I1226 17:48:13.055018 1 node.go:75] Finished syncing node event "master/mem_usage_max_avg_1h" (217.762678ms)
I1226 17:48:13.455442 1 node.go:75] Finished syncing node event "master/mem_usage_max_avg_1d" (400.366453ms)
I1226 17:51:12.788539 1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (41.092765ms)
I1226 17:51:12.810824 1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (22.248821ms)
I1226 17:54:12.771140 1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (22.840662ms)
I1226 17:54:12.789918 1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (18.740179ms)
I1226 17:57:12.773735 1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (26.395777ms)
I1226 17:57:12.792897 1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (19.124323ms)
I1226 18:00:12.772243 1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (24.369461ms)
I1226 18:00:12.804297 1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (32.008004ms)
I1226 18:03:12.774690 1 node.go:75] Finished syncing node event "master/mem_usage_max_avg_1h" (27.291591ms)
I1226 18:03:12.795145 1 node.go:75] Finished syncing node event "master/mem_usage_avg_5m" (20.350165ms)
I1226 18:03:12.813508 1 node.go:75] Finished syncing node event "master/cpu_usage_avg_5m" (18.32638ms)
I1226 18:03:12.833109 1 node.go:75] Finished syncing node event "master/cpu_usage_max_avg_1h" (19.549029ms)
```