
Adapting for use with managed control plane

Open tlives opened this issue 6 years ago • 6 comments

I have an EKS cluster and am hoping to adapt this to run as a second scheduler, since I can't edit the default kube-scheduler as your installation instructions require (at least I don't believe I can, but correct me if I am wrong).

I have edited the YAML slightly to bring it in line with the guide at the link below: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/

But it doesn't seem to be working (I'll admit I knew this was wishful thinking). Any ideas what else I need to do? I am very new to Go, so I'm struggling to dig into the source code.

kind: ServiceAccount
apiVersion: v1
metadata:
  name: gpu-scheduler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gpu-scheduler-as-kube-scheduler
subjects:
- kind: ServiceAccount
  name: gpu-scheduler
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:kube-scheduler
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1  # was extensions/v1beta1
kind: Deployment
metadata:
  labels:
    component: scheduler
    tier: control-plane
  name: gpu-scheduler
  namespace: kube-system
spec:
  selector:
    matchLabels:
      component: scheduler
      tier: control-plane
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        component: scheduler
        tier: control-plane
    spec:
      serviceAccountName: gpu-scheduler
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/acs/k8s-gpu-scheduler:1.11-d170d8a
        name: gpu-scheduler
        env:
        - name: LOG_LEVEL
          value: debug
        - name: PORT
          value: "12345"
      hostNetwork: true
      tolerations:
      - effect: NoSchedule
        operator: Exists
        key: node-role.kubernetes.io/master
      - effect: NoSchedule
        operator: Exists
        key: node.cloudprovider.kubernetes.io/uninitialized
      nodeSelector:
        node-role.kubernetes.io/master: ""
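
For reference, the guide linked above has a pod opt in to a non-default scheduler through spec.schedulerName. A minimal sketch of that, assuming a second scheduler is actually running and registered under the name gpu-scheduler (hypothetical: the Deployment above only uses that as its object name), and assuming the aliyun.com/gpu-mem resource name from the gpushare docs. The pod name, image, and requested amount are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-share-test              # placeholder name
spec:
  schedulerName: gpu-scheduler      # hypothetical; must match a running scheduler's name
  containers:
  - name: main
    image: busybox                  # placeholder image
    command: ["sleep", "3600"]
    resources:
      limits:
        aliyun.com/gpu-mem: 3       # assumed resource name; units depend on the device plugin setup
```

A pod that names a scheduler nobody is running just stays Pending, so this only helps once something answers to that name.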

tlives, Aug 16 '19 09:08

I need to implement a full second scheduler, don't I...

tlives, Aug 16 '19 15:08

I think it can only work when the Kubernetes default scheduler can be configured.

cheyang, Aug 28 '19 09:08

@cheyang, why?

@tlives, any success here?

ide8, Oct 16 '19 17:10

@ide8 Afraid not; we decided on a different setup. I did see this but haven't tried it: https://github.com/Deepomatic/shared-gpu-nvidia-k8s-device-plugin

The reason it won't work is that in a managed service you don't have access to the default scheduler's config, and modifying that config is a requirement (see the installation instructions).

tlives, Oct 17 '19 15:10

Any updates on this? I was hoping that by running a second scheduler, we could simply call out to it and apply the extenders to it.
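
Roughly what that might look like on a Kubernetes version that serves the kubescheduler.config.k8s.io/v1 API (1.25 or later): run a stock kube-scheduler as a second Deployment and give it a config file with its own profile name plus an extenders entry. This is an untested sketch, not something taken from the project's docs. The profile name gpu-scheduler, the Service hostname, the /gpushare-scheduler path prefix, the verbs, and the aliyun.com/gpu-mem resource name are all assumptions; port 12345 matches the PORT value in the manifest at the top of this issue.

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: false                # single replica, so no leader election
profiles:
- schedulerName: gpu-scheduler      # hypothetical name that pods would reference
extenders:                          # extenders are set once here and apply to all profiles
- urlPrefix: "http://gpushare-schd-extender.kube-system.svc:12345/gpushare-scheduler"  # assumed Service and path
  filterVerb: filter                # assumed verb exposed by the extender
  bindVerb: bind                    # assumes the extender also handles binding
  enableHTTPS: false
  nodeCacheCapable: true
  managedResources:
  - name: aliyun.com/gpu-mem        # assumed resource name from the gpushare docs
    ignoredByScheduler: false
  ignorable: false                  # scheduling fails if the extender is unreachable
```

The file would be handed to the second kube-scheduler with its --config flag, and pods would then select it with spec.schedulerName: gpu-scheduler. The default scheduler on the managed control plane is left untouched either way.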

Califax, Apr 29 '20 00:04

@tlives, any updates from your end on this?

pen-pal, Mar 02 '21 13:03