add support for volumeBinding plugin
This submission is based on a previous PR (#2506) and fixes the issue where the volumeBinding plugin did not correctly filter nodes when running predicates for multiple tasks of a job.
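For context, Volcano plugins are enabled through the scheduler configuration (the volcano-scheduler-configmap). Below is a minimal sketch of what enabling this plugin could look like; it assumes the plugin is exposed as a standalone plugin entry named volumebinding, which is an illustrative assumption (use the name actually registered by this PR), and the tier layout roughly mirrors Volcano's default configuration:

actions: "enqueue, allocate, backfill"
tiers:
- plugins:
  - name: priority
  - name: gang
  - name: conformance
- plugins:
  - name: drf
  - name: predicates
  - name: proportion
  - name: nodeorder
  - name: binpack
  - name: volumebinding   # assumed plugin name for illustration; adjust to the name registered by this PR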
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
To complete the pull request process, please assign qiankunli
You can assign the PR to them by writing /assign @qiankunli in a comment when ready.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
/assign @qiankunli @k82cn @william-wang
@chxk Thanks for the PR :) Would you log an issue first and provide the following information? What happened: What you expected to happen: How to reproduce it (as minimally and precisely as possible):
ok @william-wang
What happened:
Currently Volcano can't filter nodes by PV, which doesn't meet our requirements. We found a previous PR (#2506) that added support for this feature (based on the volumeBinding plugin), but it was reverted.
After investigation, we found the problem introduced by that PR:
When allocating nodes for each task in a job, Volcano uses ssn.cache.AllocateVolumes, which relies on an already initialized SchedulerVolumeBinder and updates its pvCache and pvcCache. However, every time a task is predicated, a new volumeBinding plugin is initialized, so Filter runs against empty pvCache and pvcCache instead of the caches held by the already initialized SchedulerVolumeBinder. This can potentially cause a PV to be bound to multiple different PVCs.
What you expected to happen: The volumeBinding feature is added to Volcano and works normally. After our changes were merged and deployed, we have not observed the memory leak issue previously mentioned in #2555.
How to reproduce it (as minimally and precisely as possible): based on a version that includes PR (#2506):
- Create a local PV with storageClass fast-disks (see the example StorageClass and PV manifests after this list).
- Create a job with two pods, each requesting a different PVC (both with storageClass fast-disks) and using a nodeSelector pointing to the node of the local PV. The yaml file is as follows:
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  name: test-binding
  namespace: default
spec:
  minMember: 2
  queue: default
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-binding-1
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: fast-disks
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-binding-2
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: fast-disks
---
apiVersion: v1
kind: Pod
metadata:
  name: test-binding-1
  namespace: default
  annotations:
    scheduling.k8s.io/group-name: test-binding
spec:
  schedulerName: volcano
  nodeSelector:
    testpvc: owner
  containers:
    - name: test
      image: <your image>
      resources:
        limits:
          cpu: 3
      command:
        - /bin/bash
        - -c
        - sleep 1d
      volumeMounts:
        - name: hahaha
          mountPath: /fast-disks
  volumes:
    - name: hahaha
      persistentVolumeClaim:
        claimName: test-binding-1
---
apiVersion: v1
kind: Pod
metadata:
  name: test-binding-2
  namespace: default
  annotations:
    scheduling.k8s.io/group-name: test-binding
spec:
  schedulerName: volcano
  nodeSelector:
    testpvc: owner
  containers:
    - name: test
      image: <your image>
      resources:
        limits:
          cpu: 3
      command:
        - /bin/bash
        - -c
        - sleep 1d
      volumeMounts:
        - name: hahaha
          mountPath: /fast-disks
  volumes:
    - name: hahaha
      persistentVolumeClaim:
        claimName: test-binding-2
- Observe the result: the job gets scheduled, but only one pod's PVC gets bound to the local PV.
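For reference, the local PV setup from the first step can look like the sketch below. It is a minimal example, not part of the original PR: the StorageClass name fast-disks and the node label testpvc: owner match the manifests above, while the PV name, capacity, local path, and node name (worker-1) are placeholders to adjust for your cluster. The node that holds the local disk also needs the label testpvc=owner so the pods' nodeSelector can match it.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-disks
provisioner: kubernetes.io/no-provisioner   # static local volumes, no dynamic provisioning
volumeBindingMode: WaitForFirstConsumer     # delay binding until a pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-binding-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: fast-disks
  local:
    path: /mnt/fast-disks/vol1              # placeholder: a directory or disk on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-1                  # placeholder: the node labeled testpvc=owner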
lgtm
I don't know much about this area. I think the code is ok.
Thank you for the review. Could you help review this PR please? @Thor-wl @william-wang
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
/reopen
@lowang-bh: Reopened this PR.
In response to this:
/reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
I met the same problem, and I think this PR can solve it well.
lgtm
@chxk: PR needs rebase.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.