Bug in volcano scheduler metrics for queue pod group count
What happened:
When only one job (pod group) is running on a queue and that job is deleted, the metrics are not updated.
That is, queue_pod_group_running_count is 1 while the job is running, and after the job is deleted it remains 1 instead of dropping to 0. The same applies to other metrics such as queue_pod_group_pending_count.
What you expected to happen:
I expect queue_pod_group_running_count to drop to 0, since the running job has been deleted.
How to reproduce it (as minimally and precisely as possible):
Remove any existing Volcano jobs or pod groups, then follow the steps below:
- Create a VolcanoJob that sleeps indefinitely -> Check that queue_pod_group_running_count has been updated to 1.
- Delete that VolcanoJob -> Check whether queue_pod_group_running_count is still 1.
Environment:
- Volcano Version: v1.8.0
It seems the metrics update is not triggered when there are zero jobs in the queue. Perhaps we should add code that sets a default value of zero for all the queue-related metrics, and then triggers the metrics update if there are any jobs.
Seems the metrics update is not triggered when there is zero job
Yeah, the current code cannot trigger a metrics update if there are no jobs in the queue.
Maybe we can move this code to OnSessionClose so that it covers the metrics of all queues.
https://github.com/volcano-sh/volcano/blob/67cabf78a6a50751287eecedd8e050c5977ebb40/pkg/scheduler/plugins/proportion/proportion.go#L168-L178
It would still have the same problem, because queues that have no jobs in them are not visited by the loop for _, attr := range pp.queueOpts
If there is no job in a queue, its count will be zero, right?
No, those queues aren't included in pp.queueOpts.
I mean that a queue not being present in pp.queueOpts indicates that its job count is zero, so we can set its metrics to zero directly.
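A minimal sketch of that idea, not the actual Volcano code: iterate over every known queue rather than only pp.queueOpts, and reset the counts for any queue that has no entry. The types queueCounts and queueAttr, the function updateQueueMetrics, and the plain map standing in for the Prometheus gauges are all hypothetical names introduced for illustration.

```go
package main

import "fmt"

// queueCounts is a hypothetical stand-in for the per-queue pod group gauges
// (for example queue_pod_group_running_count and queue_pod_group_pending_count).
type queueCounts struct {
	Running int
	Pending int
}

// queueAttr is a hypothetical, simplified version of an entry in pp.queueOpts.
// An entry only exists for queues that have at least one job in the session.
type queueAttr struct {
	Name    string
	Running int
	Pending int
}

// updateQueueMetrics writes counts for every known queue. A queue missing
// from queueOpts has no jobs, so its counts are reset to zero instead of
// being left at their previous values.
func updateQueueMetrics(allQueues []string, queueOpts map[string]*queueAttr, gauges map[string]queueCounts) {
	for _, name := range allQueues {
		if attr, ok := queueOpts[name]; ok {
			gauges[name] = queueCounts{Running: attr.Running, Pending: attr.Pending}
			continue
		}
		gauges[name] = queueCounts{}
	}
}

func main() {
	// State left over from the previous session: one running job.
	gauges := map[string]queueCounts{"default": {Running: 1}}

	// The only job was deleted, so queueOpts has no entry for "default".
	updateQueueMetrics([]string{"default"}, map[string]*queueAttr{}, gauges)

	fmt.Println(gauges["default"].Running) // prints 0
}
```

Driving the loop from the full queue list instead of pp.queueOpts is what lets the stale value of 1 be overwritten; whether this belongs in OnSessionClose or elsewhere is left open here.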