queue icon indicating copy to clipboard operation
queue copied to clipboard

The job runner does not understand cancelled jobs

Open sbidoul opened this issue 2 years ago • 7 comments

Module

queue_job

Describe the bug

Since https://github.com/OCA/queue/pull/347, jobs have a cancelled state. However the job runner does not understand this state.

https://github.com/OCA/queue/blob/0bf3a1a48100fce68f3ece7aca1eee39001e59bb/queue_job/jobrunner/channels.py#L1048-L1061

This may lead to cancelled job filling up channels forever.

sbidoul avatar May 01 '23 09:05 sbidoul

@hbrunn @hparfr as you seem to use this feature, have you not noticed problems with it?

I still need to investigate in more depth but I'm under the impression that cancelled jobs fill up channels until the job runner is restarted.

sbidoul avatar May 01 '23 09:05 sbidoul

I didn't pay attention to this one. I have canceled jobs with an error message in one of my db :roll_eyes:

hparfr avatar May 02 '23 10:05 hparfr

Looking at places where DONE is used I notice the job dependencies mechanism. If I read correctly cancelled jobs cause their dependencies to stay WAIT_FOR_DEPENDENCIES until they are themselves cancelled or marked done. I guess that makes sense?

sbidoul avatar May 02 '23 16:05 sbidoul

I propose a fix for the jobrunner in #533

sbidoul avatar May 02 '23 17:05 sbidoul

@guewen if you have a moment, I'll welcome your advice on the effect of the cancelled states on the job dependencies mechanism. Maybe it's fine but I'm not entirely sure it was taken into account when porting the cancelled state feature from 12 to 14.

sbidoul avatar May 03 '23 07:05 sbidoul

Hmm yeah I'm not sure the current situation is alright. Cancelling a parent job leaves a situation where dependent jobs will never be executed I think. We have to cancel them or set them to done, which is ultimately the same. I wonder if cancelling a job in a graph shouldn't cancel all the dependent jobs or maybe even all the graph's jobs (this can be explained in the cancel wizard)?

guewen avatar May 03 '23 07:05 guewen

There hasn't been any activity on this issue in the past 6 months, so it has been marked as stale and it will be closed automatically if no further activity occurs in the next 30 days. If you want this issue to never become stale, please ask a PSC member to apply the "no stale" label.

github-actions[bot] avatar Nov 05 '23 12:11 github-actions[bot]