Tango
Fix crash on missing job
Hi guys,
The jobManager was occasionally crashing on makeDead with the error
"'NoneType' object has no attribute 'id'", raised on the line that calls makeDead.
If the job is None, that must mean that getNextPendingJobReuse returned a job id that jobqueue.get couldn't get.
Anyway, this exception was raised inside the except block, so it actually killed the jobManager. I added an if statement to avoid the crash. It might be better to add an inner try block as well, just to make the thing more resilient.
Doesn't address the core problem of why the job is missing, but fixes my problems for now.
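For illustration, here is a rough sketch of the guard plus the inner try block mentioned above. It is not the actual Tango code: StubJobQueue, Job, and mark_failed are hypothetical names invented for this example, and the stub only assumes that get returns None for an unknown id and that makeDead takes an id and a reason.

```python
import logging

log = logging.getLogger("jobManager-sketch")


class Job:
    """Hypothetical minimal job object; only the id attribute matters here."""

    def __init__(self, job_id):
        self.id = job_id


class StubJobQueue:
    """Hypothetical stand-in for the real job queue (assumed behavior only)."""

    def __init__(self, jobs):
        self.jobs = dict(jobs)  # job id -> job object
        self.dead = {}          # job id -> reason it was marked dead

    def get(self, job_id):
        # Returns None for an unknown id, which is the case that caused the crash.
        return self.jobs.get(job_id)

    def makeDead(self, job_id, reason):
        self.dead[job_id] = reason
        self.jobs.pop(job_id, None)


def mark_failed(job_queue, job, reason):
    """Meant to be called from an except block, so it must not raise itself."""
    if job is None:
        # The guard: there is no job.id to read, so log and move on.
        log.error("job is missing, nothing to mark dead: %s", reason)
        return False
    try:
        # The inner try: a failure in makeDead should not kill the manager loop.
        job_queue.makeDead(job.id, reason)
    except Exception:
        log.exception("makeDead failed for job %s", job.id)
        return False
    return True
```

With this shape, the manager's except block would call mark_failed(self.jobQueue, job, str(err)) instead of calling makeDead directly, and a missing job gets logged rather than raising a second exception.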
Cheers!
Would something like this make sense? Try to invalidate the original job when getNextPendingJobReuse fails.
id, vm = self.jobQueue.getNextPendingJobReuse(id)
if id is None:
    self.jobQueue.makeDead(job.id, "getNextPendingJobReuse failed")
    continue