podgroup not created if OwnerReferencesPermissionEnforcement enabled in Kubernetes

Open Robert-Christensen-visa opened this issue 2 years ago • 6 comments

I am trying to have volcano schedule independent pods so I can have these pods use a specific queue. I am doing this so volcano can keep track of resources used by the cluster.

A StatefulSet is created that creates a pod to be scheduled by Volcano. The Volcano controller is not able to create a podgroup for the pods created by the StatefulSet, so the pods are never scheduled. I see the following lines in the logs for vc-controller:

E0216 19:35:03.787545       1 pg_controller_handler.go:119] Failed to create normal PodGroup for Pod <default/pod-owned-by-controller-0>: podgroups.scheduling.volcano.sh "podgroup-a3be5644-02aa-4301-a42a-f2be30001102" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
E0216 19:35:03.788306       1 pg_controller.go:134] Failed to handle Pod <default/pod-owned-by-controller-0>: podgroups.scheduling.volcano.sh "podgroup-a3be5644-02aa-4301-a42a-f2be30001102" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>

The code that generates the podgroup selects the owner of the pod, or the pod itself if the pod has no owner: https://github.com/volcano-sh/volcano/blob/eb1fc7f8f102745282718847ffcdd84e28b661f0/pkg/controllers/podgroup/pg_controller_handler.go#L128-L135

If OwnerReferencesPermissionEnforcement is enabled in the cluster, setting metadata.ownerReferences.blockOwnerDeletion requires update permission on the finalizers subresource of the referenced owner. The Volcano controller does not have this permission.
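
For illustration, the missing permission could be granted with RBAC objects along these lines. This is a minimal sketch, not a tested fix: the ClusterRole and ClusterRoleBinding names are hypothetical, and the `volcano-controllers` service account in the `volcano-system` namespace is assumed from a default install and may differ in your cluster.

```yaml
# Hypothetical ClusterRole: allows "update" on the finalizers subresource of
# StatefulSets and ReplicaSets, which is what the admission check asks for
# when blockOwnerDeletion is set on an ownerReference to one of them.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: volcano-owner-finalizers   # hypothetical name
rules:
- apiGroups: ["apps"]
  resources: ["statefulsets/finalizers", "replicasets/finalizers"]
  verbs: ["update"]
---
# Binds the role above to the Volcano controller's service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: volcano-owner-finalizers   # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: volcano-owner-finalizers
subjects:
- kind: ServiceAccount
  name: volcano-controllers        # assumed service account name
  namespace: volcano-system        # assumed namespace
```

Each additional owner kind would need its own entry under `resources`, with the matching API group.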

When the podgroup is created, the controller sets metadata.ownerReferences.blockOwnerDeletion and makes the parent of the pod the owner of the podgroup. If Volcano is to be used to schedule arbitrary pods that are created by arbitrary resources, such as a ReplicaSet or StatefulSet, Volcano must be given update permission on the finalizers of those resources. Alternatively, Volcano can set the owner of the podgroup to be the pod itself.
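
To make the two options concrete, here is roughly what the podgroup metadata would look like in each case. These fragments are a sketch reconstructed from the log lines above, not output captured from the cluster: the `v1beta1` API version and the `uid` placeholders are assumptions. The second fragment leaves out blockOwnerDeletion on the assumption that setting it on a Pod owner would be checked against `pods/finalizers` in the same way.

```yaml
# Option 1 (what is rejected today): the podgroup is owned by the pod's parent.
apiVersion: scheduling.volcano.sh/v1beta1   # assumed API version
kind: PodGroup
metadata:
  name: podgroup-a3be5644-02aa-4301-a42a-f2be30001102
  namespace: default
  ownerReferences:
  - apiVersion: apps/v1
    kind: StatefulSet
    name: pod-owned-by-controller
    uid: "<statefulset-uid>"               # placeholder
    controller: true
    blockOwnerDeletion: true               # triggers the finalizers permission check
---
# Option 2 (the alternative): the podgroup is owned by the pod itself.
apiVersion: scheduling.volcano.sh/v1beta1   # assumed API version
kind: PodGroup
metadata:
  name: podgroup-a3be5644-02aa-4301-a42a-f2be30001102
  namespace: default
  ownerReferences:
  - apiVersion: v1
    kind: Pod
    name: pod-owned-by-controller-0
    uid: "<pod-uid>"                       # placeholder
    controller: true
```

With the second form, the podgroup would be garbage-collected with its pod rather than with the StatefulSet, which may or may not be the lifetime the controller wants.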

What you expected to happen:

I expect volcano to be able to always schedule an arbitrary pod in the cluster, regardless of how the pod was created or what resource owns the pod being scheduled.

How to reproduce it (as minimally and precisely as possible):

The cluster must have RBAC enforced and OwnerReferencesPermissionEnforcement must be enabled in the cluster.
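
The admission plugin is turned on through the kube-apiserver `--enable-admission-plugins` flag. As one example, assuming a kubeadm-managed cluster (the way the affected cluster was provisioned is not stated here), the setting might look like this. `NodeRestriction` is listed only because it is commonly enabled alongside.

```yaml
# Sketch of a kubeadm ClusterConfiguration; assumes the v1beta3 config API.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    # Passed to kube-apiserver as --enable-admission-plugins=...
    enable-admission-plugins: NodeRestriction,OwnerReferencesPermissionEnforcement
```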

Create a resource that creates a pod, such as a ReplicaSet:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: test-set
  labels:
    project-name: test-replica-set
spec:
  replicas: 1
  selector:
    matchLabels:
      project-name: test-replica-set
  template:
    metadata:
      labels:
        project-name: test-replica-set
    spec:
      schedulerName: volcano
      containers:
      - name: main
        image: ubuntu:latest
        command: ["sh", "-c", "sleep 6000"]
        resources:
          limits:
            cpu: "1000m"
            memory: "256Mi"

A podgroup will not be created for the pod being created, so the pod will not be scheduled using volcano.

Anything else we need to know?:

Environment:

  • Volcano Version: v1.5.0-Beta
  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

Robert-Christensen-visa avatar Feb 16 '22 20:02 Robert-Christensen-visa

Is there any good solution? It seems the only way is to add "/finalizers" update permission on all the basic resources.

shinytang6 avatar Feb 17 '22 07:02 shinytang6

I am not sure what the solution is. I hope a discussion will help identify possible solutions. Adding /finalizers to all basic resources would make it possible for Volcano to be used as the scheduler for those resources. If custom resources are added to the cluster and Volcano is used to schedule pods owned by a custom resource, somebody would have to add permissions to the Volcano controller for each new CRD.

Robert-Christensen-visa avatar Feb 17 '22 15:02 Robert-Christensen-visa

@Robert-Christensen-visa Totally agree with you. We need to consider CRDs like spark-operator/flink-operator as well as all the basic resources. Volcano is supposed to support arbitrary pod scheduling in the cluster.

william-wang avatar Feb 18 '22 01:02 william-wang

Hello 👋 Looks like there was no activity on this issue for last 90 days. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).

stale[bot] avatar May 31 '22 03:05 stale[bot]

This is still an issue.

Robert-Christensen-visa avatar May 31 '22 14:05 Robert-Christensen-visa
