Fix crash on missing job

Open bstriner opened this issue 6 years ago • 1 comments

Hi guys,

The jobManager was occasionally crashing on makeDead.

The error was "'NoneType' object has no attribute 'id'", raised on the line that calls makeDead.

If the job is None, that must mean getNextPendingJobReuse returned a job id that jobQueue.get couldn't find.

Anyway, the exception was raised inside the except block, so it actually killed the jobManager. I added an if statement to avoid the crash. An inner try block might be better, just to make the thing more resilient.
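
For illustration, the guard plus the inner try block could look roughly like the sketch below. This is not the actual patch: `RecordingQueue`, `StubJob`, and `mark_dead_safely` are hypothetical names made up for the example, and only the `makeDead(job.id, reason)` call shape is taken from this thread.

```python
class StubJob:
    """Hypothetical minimal job object; only the id attribute matters here."""

    def __init__(self, id):
        self.id = id


class RecordingQueue:
    """Hypothetical stand-in for the job queue that records makeDead calls."""

    def __init__(self):
        self.dead = []

    def makeDead(self, job_id, reason):
        self.dead.append((job_id, reason))


def mark_dead_safely(job_queue, job, err):
    """Mark a job dead without letting the error handler itself raise.

    Returns True if the job was marked dead, False otherwise.
    """
    if job is None:
        # The lookup gave us no job, so there is no id to mark dead.
        return False
    try:
        job_queue.makeDead(job.id, str(err))
        return True
    except Exception:
        # Swallow failures here so the manager loop keeps running.
        return False
```

Calling something like this from the except block means a missing job is skipped instead of raising a second exception that stops the manager.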

This doesn't address the core problem of why the job is missing, but it fixes my problems for now.

Cheers!

bstriner, Mar 18 '18 20:03

Would something like this make sense? It invalidates the original job when getNextPendingJobReuse fails.

id, vm = self.jobQueue.getNextPendingJobReuse(id)
if id is None:
  self.jobQueue.makeDead(job.id, "getNextPendingJobReuse failed")
  continue
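
A self-contained sketch of the same idea, pulled out of the loop so it can be run on its own. `StubReuseQueue`, `PendingJob`, and `resolve_reuse` are hypothetical names for this example. The second check (the returned id not being found by `get`) is an assumption that goes one step beyond the snippet above, and marking the original job dead in that case is only one possible choice.

```python
class PendingJob:
    """Hypothetical minimal job object."""

    def __init__(self, id):
        self.id = id


class StubReuseQueue:
    """Hypothetical queue stub with the three calls mentioned in this thread."""

    def __init__(self, jobs, reuse_result):
        self.jobs = jobs                  # dict of id -> job
        self.reuse_result = reuse_result  # (id, vm) tuple to hand back
        self.dead = []

    def get(self, id):
        return self.jobs.get(id)

    def getNextPendingJobReuse(self, id):
        return self.reuse_result

    def makeDead(self, id, reason):
        self.dead.append((id, reason))


def resolve_reuse(job_queue, job):
    """Return (job, vm) for the reusable job, or None if the job was marked dead."""
    new_id, vm = job_queue.getNextPendingJobReuse(job.id)
    if new_id is None:
        # Same handling as the snippet above: invalidate the original job.
        job_queue.makeDead(job.id, "getNextPendingJobReuse failed")
        return None
    new_job = job_queue.get(new_id)
    if new_job is None:
        # Assumed extra case: an id came back but the queue has no such job.
        job_queue.makeDead(job.id, "job %s not found in queue" % new_id)
        return None
    return new_job, vm
```

In the manager loop, a None result would map to the `continue` in the snippet above.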

bstriner, Mar 18 '18 21:03