Deleting a queue with `kubectl` fails with "only queue with state `Closed` can be deleted"
What happened:
An attempt to delete a queue with `kubectl delete` ends with the following error:
Error from server: error when deleting "ex2-volcano.yaml": admission webhook "validatequeue.volcano.sh" denied the request: only queue with state `Closed` can be deleted, queue `one-q` state is `Open`
Error from server (NotFound): error when deleting "ex2-volcano.yaml": jobs.batch.volcano.sh "j1-vj" not found
What you expected to happen:
I expect the queue to be deleted.
How to reproduce it (as minimally and precisely as possible):
Just create a new queue and try to delete it with `kubectl delete`.
Anything else we need to know?:
I realize I can close the queue first with `vcctl`, but I need a pure Kubernetes solution without extra CLI tools. I am eventually going to create and delete queues through the Kubernetes API, so I cannot rely on external CLI utilities. I tried to edit a queue with `kubectl edit` but unfortunately could not close it this way.
Environment:
- Volcano Version: 1.3.0
- Kubernetes version (use `kubectl version`):
  Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.0", GitCommit:"c2b5237ccd9c0f1d600d3072634ca66cefdf272f", GitTreeState:"clean", BuildDate:"2021-08-04T17:56:19Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"darwin/amd64"}
  Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:53:14Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
By design, only queues in the `Closed` state, which means they no longer accept new jobs, can be deleted. `queue` is a concept inherited from traditional HPC scheduling architectures and is widely used for sharing and dividing cluster resources among departments or groups. The default Kubernetes scheduler targets common scenarios, so it has no `queue` concept.
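The deletion rule described above can be illustrated with a minimal Python sketch. This is not Volcano's actual webhook code; the function name `validate_queue_delete` is hypothetical, and reading the state from `status.state` is an assumption made for illustration. The message format mirrors the error quoted in the report.

```python
from typing import Tuple


def validate_queue_delete(queue: dict) -> Tuple[bool, str]:
    """Return (allowed, message) for a delete request on a queue object."""
    name = queue.get("metadata", {}).get("name", "")
    # Assumption: the queue state is reported under status.state.
    state = queue.get("status", {}).get("state", "")
    if state != "Closed":
        # Mirrors the wording of the admission error quoted in this issue.
        return False, (
            "only queue with state `Closed` can be deleted, "
            f"queue `{name}` state is `{state}`"
        )
    return True, ""


if __name__ == "__main__":
    open_queue = {"metadata": {"name": "one-q"}, "status": {"state": "Open"}}
    print(validate_queue_delete(open_queue))
```

Under this rule a freshly created queue, which starts out `Open`, is rejected on delete until something moves it to `Closed`.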
@Thor-wl, thank you, that is clear. The issue is that I do not know how to close a queue with `kubectl` without using `vcctl`. I assume that if I can do it with `kubectl`, then I will be able to close a queue through the Kubernetes API.
Users can only close a queue with `vcctl` now. `Queue` is not a native concept for Kubernetes, so there's no way to use the K8s CLI to do that.
@Thor-wl, understood. Can I expect that there will be a possibility to close a queue with `kubectl edit`? I realize that a queue is not a Kubernetes-native concept. But I believe it is added as a custom resource (through a CRD), so `kubectl edit` can modify it. I thought this is the Kubernetes way of doing things: the user modifies a Kubernetes object (e.g., through `kubectl edit`) and the operator responsible for this object does its best to move it closer to the desired state.
@prokher Yes. Perhaps it's hard to push this idea directly to Kubernetes community, but it is possible to develop a CLI plugin to support that.
Actually, you can apply a `Command` CRD provided by Volcano to do that; `vcctl` does it this way. But I believe the community has very little documentation of this function currently.
@Thor-wl, honestly, I am not getting why you are speaking about pushing ideas to the Kubernetes community. Although it could be true, I was talking about a completely different thing.
I had in mind the possibility to edit a queue instance with `kubectl edit` and modify its YAML so that it eventually becomes closed. In that case, I could then delete it with `kubectl delete`. For example, there could be a property `enabled`; switching it to `false` would cause the queue to close.
@shinytang6, applying a `Command` CRD with `kubectl` may work as a workaround. Would you mind sharing a place where I can find an example?
@prokher There are no relevant examples in the community right now; I can help add some later. You can try something like this:
apiVersion: bus.volcano.sh/v1alpha1
kind: Command
metadata:
  name: close-queue
action: CloseQueue
target:
  apiVersion: scheduling.volcano.sh/v1beta1
  kind: Queue
  name: xxx
  uid: xxx
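For anyone who wants to generate this manifest programmatically rather than hand-edit the `xxx` placeholders, here is a minimal Python sketch. The field layout is copied from the manifest above; the helper name `build_close_queue_command` and the placeholder uid are hypothetical, and the assumption is that `uid` should hold the target queue's `metadata.uid`.

```python
import json


def build_close_queue_command(queue_name: str, queue_uid: str,
                              command_name: str = "close-queue") -> dict:
    """Build a Command manifest that asks Volcano to close the given queue."""
    return {
        "apiVersion": "bus.volcano.sh/v1alpha1",
        "kind": "Command",
        "metadata": {"name": command_name},
        "action": "CloseQueue",
        "target": {
            "apiVersion": "scheduling.volcano.sh/v1beta1",
            "kind": "Queue",
            "name": queue_name,
            # Assumption: this is the queue's metadata.uid.
            "uid": queue_uid,
        },
    }


if __name__ == "__main__":
    # Placeholder values; substitute the real queue name and uid.
    manifest = build_close_queue_command(
        "one-q", "00000000-0000-0000-0000-000000000000"
    )
    print(json.dumps(manifest, indent=2))
```

Since `kubectl` accepts JSON as well as YAML, the printed output could be piped to `kubectl apply -f -`. The uid could presumably be looked up first with something like `kubectl get queue one-q -o jsonpath='{.metadata.uid}'`.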
@shinytang6 Thank you for the recipe, this indeed worked! Now I can create a queue, submit jobs, and delete the queue using only `kubectl` invocations.
I would still recommend adding a field that lets a queue be closed with `kubectl edit <queue>`. So I am keeping the ticket open; it is up to the maintainers to decide whether to close it or keep it in case this gets implemented.
Hello 👋 Looks like there was no activity on this issue for last 90 days. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).
The issue is still here AFAIK.
Emm, so what's the target solution/behaviour?
I would expect to be able to delete a queue through `kubectl delete`. If it is necessary to close it first, OK, I would expect to use `kubectl edit` for this and then delete it with `kubectl delete`.
If so, let's have a kubectl plugin for volcano :)
It is still needed
I agree with that. It will be more friendly for users to operate the objects.
/assign @hwdef
What if we rename `vcctl` to `kubectl-volcano` and use it as a plugin for `kubectl`?
We can close a queue with
kubectl volcano queue operate -a close -n test
then, delete the queue
kubectl volcano queue delete -n test
If this is cumbersome, we can optimize the command
kubectl volcano close queue --name test
kubectl volcano delete queue --name test
@prokher @k82cn @Thor-wl
Perhaps we can take different designs to the weekly meeting this week. It is an interesting issue. Let's get more voice from the community.
In our scenario, we will operate Volcano from a node that does not have `vcctl` installed. Moreover, some parts of the system will work through the Kubernetes REST API, which also knows nothing about `vcctl`. I already shared my considerations above:
I would expect to be able to delete a queue through `kubectl delete`. If it is necessary to close it first, OK, I would expect to use `kubectl edit` and then delete it with `kubectl delete`.
Kubernetes seems to follow the declarative approach of describing its objects/resources. IMHO, to be declarative, the queue should have a boolean property `open`, which a client can switch to `false` with a regular `kubectl edit`.
@prokher I see. You are more expecting to manage the queue only through kubectl. We'll discuss this at our community meeting.
I still need this.
The current situation is that a queue can be deleted even if it is not closed, but the vcjobs in the queue continue to exist. We need to improve this, and I have two options in mind:
- Cascade the deletion of vcjobs when the queue is deleted, regardless of the state of the vcjobs under this queue, analogous to a namespace.
- Add a new field to the queue, such as `spec.isOpen`. Modifying this field changes the open/closed state of the queue. At the same time, restore the previous webhook logic, so that a queue cannot be deleted unless it is closed.
What do you think? @prokher @Thor-wl @k82cn @william-wang @shinytang6
@hwdef, IMHO this seems reasonable, thank you for the update.
Which of these two options do you prefer?
Actually, either of the two is quite OK from my point of view.