
TargetLoadPacking initializing failed: invalid configuration: no configuration has been provided

Open ccwingcode opened this issue 3 years ago • 1 comment

Kubernetes 1.23.7 with 3 master nodes, scheduler version: 0.23.10

Step 1: install the scheduler-plugins controller:

kubectl apply -f all-in-one.yaml

Step 2: provide /etc/kubernetes/scheduler-plugin-config.yaml on one master node:

apiVersion: kubescheduler.config.k8s.io/v1beta3
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: "/etc/kubernetes/scheduler.conf"
leaderElection:
  leaseDuration: 15s
  renewDeadline: 10s
profiles:
- schedulerName: trimaran
  plugins:
    score:
      disabled:
      - name: NodeResourcesBalancedAllocation
      - name: NodeResourcesLeastAllocated
      enabled:
      - name: TargetLoadPacking
  pluginConfig:
  - name: TargetLoadPacking
    args:
      defaultRequests:
        cpu: "1000m"
      defaultRequestsMultiplier: "1"
      targetUtilization: 70
      metricProvider: 
        type: KubernetesMetricsServer

Step 3: modify /etc/kubernetes/manifests/kube-scheduler.yaml on the master node above:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=0.0.0.0
    - --config=/etc/kubernetes/scheduler-plugin-config.yaml
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    image: k8s.gcr.io/scheduler-plugins/kube-scheduler:v0.23.10
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-scheduler
    resources:
      requests:
        cpu: 100m
    startupProbe:
      failureThreshold: 30
      httpGet:
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
    - mountPath: /etc/kubernetes/scheduler-plugin-config.yaml
      name: kubescheduler-config
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
  - hostPath:
      path: /etc/kubernetes/scheduler-plugin-config.yaml
      type: File
    name: kubescheduler-config
status: {}

The kube-scheduler on the node above can't start up; the following is the pod's log:

# kubectl logs kube-scheduler-nodemaster3 -n kube-system -f
I0913 05:48:42.638186       1 serving.go:348] Generated self-signed cert in-memory
W0913 05:48:42.854101       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
W0913 05:48:42.854197       1 client_config.go:622] error creating inClusterConfig, falling back to default config: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
E0913 05:48:42.854344       1 run.go:74] "command failed" err="couldn't create scheduler: initializing profiles: creating profile for scheduler name trimaran: initializing plugin \"TargetLoadPacking\": invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable"

But if I don't enable TargetLoadPacking, the kube-scheduler pod can start up. Here is /etc/kubernetes/scheduler-plugin-config.yaml in that case:

apiVersion: kubescheduler.config.k8s.io/v1beta3
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: "/etc/kubernetes/scheduler.conf"
leaderElection:
  leaseDuration: 15s
  renewDeadline: 10s
profiles:
- schedulerName: trimaran
  plugins:
    score:
      disabled:
      - name: NodeResourcesBalancedAllocation
      - name: NodeResourcesLeastAllocated
      #enabled:
      # - name: TargetLoadPacking
  pluginConfig:
  - name: TargetLoadPacking
    args:
      defaultRequests:
        cpu: "1000m"
      defaultRequestsMultiplier: "1"
      targetUtilization: 70
      metricProvider: 
        type: KubernetesMetricsServer

The following is the pod's log:

# kubectl logs kube-scheduler-nodemaster3 -n kube-system -f
I0913 05:45:07.963437       1 serving.go:348] Generated self-signed cert in-memory
I0913 05:45:08.506945       1 server.go:139] "Starting Kubernetes Scheduler" version="v0.23.10"
I0913 05:45:08.511780       1 secure_serving.go:200] Serving securely on [::]:10259
I0913 05:45:08.511910       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0913 05:45:08.511920       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0913 05:45:08.511943       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0913 05:45:08.514156       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0913 05:45:08.514174       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0913 05:45:08.514188       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0913 05:45:08.514192       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file

ccwingcode avatar Sep 13 '22 06:09 ccwingcode

I think you are missing spaces/tabs in /etc/kubernetes/scheduler-plugin-config.yaml under the profiles key. Here is mine for comparison:

apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: false
profiles:
  - schedulerName: trimaran
    plugins:
      score:
        disabled:
          - name: NodeResourcesBalancedAllocation
          - name: NodeResourcesLeastAllocated
        enabled:
          - name: TargetLoadPacking
    pluginConfig:
      - name: TargetLoadPacking
        args:
          defaultRequests:
            cpu: "500m"
          defaultRequestsMultiplier: "1"
          targetUtilization: 30
          metricProvider:
            insecureSkipVerify: true

Look at the indentation of the - schedulerName: trimaran line.

Or compare to the official Kubernetes docs.

wintersolutions avatar Oct 07 '22 12:10 wintersolutions

@wintersolutions, thanks for posting the solution.

wangchen615 avatar Dec 25 '22 01:12 wangchen615

@ccwingcode could you please verify whether @wintersolutions's solution works?

wangchen615 avatar Dec 25 '22 02:12 wangchen615

Seems like the problem is in load-watcher (the trimaran plugins call it during initialization): it can't access the k8s API server in this configuration.

Workaround: add a KUBE_CONFIG env var to the scheduler static pod manifest with the value /etc/kubernetes/scheduler.conf (the scheduler's kubeconfig for k8s API access). For this to work, RBAC must be configured to allow the scheduler to access metrics; a sketch of both changes follows.
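
A minimal sketch of that workaround, reusing the static pod manifest from Step 3 above (only the env section is new):

spec:
  containers:
  - command:
    - kube-scheduler
    # ... existing flags unchanged ...
    env:
    - name: KUBE_CONFIG
      value: /etc/kubernetes/scheduler.conf

And a hedged RBAC sketch granting the scheduler's identity read access to the metrics API, assuming metrics-server and the default system:kube-scheduler user (the role name is illustrative, not from the original issue):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: scheduler-metrics-reader  # illustrative name
rules:
- apiGroups: ["metrics.k8s.io"]
  resources: ["nodes", "pods"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: scheduler-metrics-reader  # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: scheduler-metrics-reader
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:kube-scheduler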

Solution: configure watcherAddress in pluginConfig; a sketch follows.
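
For illustration, a sketch of that pluginConfig, assuming load-watcher is deployed separately and reachable at load-watcher.kube-system.svc.cluster.local on its default port 2020 (the address is an assumption; point it at your own deployment):

  pluginConfig:
  - name: TargetLoadPacking
    args:
      # assumed address of an externally deployed load-watcher service
      watcherAddress: http://load-watcher.kube-system.svc.cluster.local:2020
      defaultRequests:
        cpu: "1000m"
      defaultRequestsMultiplier: "1"
      targetUtilization: 70

With watcherAddress set, the plugin talks to the external load-watcher over HTTP instead of instantiating its own in-cluster metrics client, which should sidestep the inClusterConfig failure in the log above.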

evgkrsk avatar Feb 01 '23 11:02 evgkrsk

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar May 02 '23 11:05 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Jun 01 '23 12:06 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Jul 01 '23 12:07 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to the triage robot's /close not-planned comment quoted above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jul 01 '23 12:07 k8s-ci-robot