volcano
volcano copied to clipboard
feat: add volcano jobs phase metric
fix: #2493
I add all job phase shows in job apis.
const (
// Pending is the phase that job is pending in the queue, waiting for scheduling decision
Pending JobPhase = "Pending"
// Aborting is the phase that job is aborted, waiting for releasing pods
Aborting JobPhase = "Aborting"
// Aborted is the phase that job is aborted by user or error handling
Aborted JobPhase = "Aborted"
// Running is the phase that minimal available tasks of Job are running
Running JobPhase = "Running"
// Restarting is the phase that the Job is restarted, waiting for pod releasing and recreating
Restarting JobPhase = "Restarting"
// Completing is the phase that required tasks of job are completed, job starts to clean up
Completing JobPhase = "Completing"
// Completed is the phase that all tasks of Job are completed
Completed JobPhase = "Completed"
// Terminating is the phase that the Job is terminated, waiting for releasing pods
Terminating JobPhase = "Terminating"
// Terminated is the phase that the job is finished unexpected, e.g. events
Terminated JobPhase = "Terminated"
// Failed is the phase that the job is restarted failed reached the maximum number of retries.
Failed JobPhase = "Failed"
)
The metric record event happend in jobInformer.Informer()
received event.
cc.jobInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
AddFunc: cc.addJob,
UpdateFunc: cc.updateJob,
DeleteFunc: cc.deleteJob,
})
And the processing of record metrics is not use cache lock
that maybe produces some inaccurate data but will improve some performance.
But if it traverses all jobs every time during updates, I think there might be some trouble. Should we switch to incremental updates or scheduled metrics at intervals?