Finished jobs ending up as failed with SolidQueue::Processes::ProcessPrunedError
Hey there 👋
We're seeing an issue where some finished jobs are unexpectedly ending up in the failed jobs morgue with the following error:
SolidQueue::Processes::ProcessPrunedError
Some other jobs genuinely fail, but we're concerned about the false positives: finished jobs incorrectly marked as failed.
This is creating noise in our monitoring and makes it harder to distinguish real failures from completed jobs.
Any insight into why this might be happening, or how we can prevent it, would be greatly appreciated!
Thanks in advance!
Versions: rails (7.2.2.1), ruby (3.3.0), solid_queue (1.1.0)
Hey @ericwecasa, sorry about this! When you say the jobs are finished, what do you mean exactly?
Hi @rosa, the finished_at value in the solid_queue_jobs table is filled in.
FYI, in case it's relevant: we use the mission-control gem as a dashboard.
Hey @ericwecasa, so sorry for the delay! I missed that last comment 😳
This is really strange! I can't imagine how this can happen, because for a job to fail with a ProcessPrunedError, its associated claimed_execution needs to exist and be associated with a process that has been pruned (found dead). This is the only place where jobs are failed in that way. However, a job is marked as finished in the same transaction in which its claimed execution is deleted, so a finished job should no longer have a claimed execution left to fail. This is why I was asking what you meant by finished: I assumed it couldn't be that the job had finished_at set.
The next time you get one of those, could you get screenshots of all the data of the failed job from Mission Control?
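In case it helps narrow down which rows to capture, here is a minimal sketch of the check described above: pick out jobs that have finished_at set and are also recorded as failed with ProcessPrunedError, the combination that shouldn't occur. It's plain Ruby over exported rows, not Solid Queue's own API; the `Job` and `FailedExecution` structs and the `contradictory_jobs` helper are hypothetical, simplified stand-ins for the real tables.

```ruby
# Hypothetical, simplified row shapes; the real tables have more columns.
Job = Struct.new(:id, :finished_at, keyword_init: true)
FailedExecution = Struct.new(:job_id, :exception_class, keyword_init: true)

PRUNED_ERROR = "SolidQueue::Processes::ProcessPrunedError"

# Returns the jobs that are both marked finished and recorded as failed
# with ProcessPrunedError.
def contradictory_jobs(jobs, failed_executions)
  pruned_job_ids = failed_executions
    .select { |fe| fe.exception_class == PRUNED_ERROR }
    .map(&:job_id)

  jobs.select { |job| !job.finished_at.nil? && pruned_job_ids.include?(job.id) }
end

# Made-up sample rows, for illustration only.
jobs = [
  Job.new(id: 1, finished_at: Time.utc(2025, 1, 1)), # finished and pruned
  Job.new(id: 2, finished_at: nil),                  # pruned, never finished
  Job.new(id: 3, finished_at: Time.utc(2025, 1, 2))  # finished, other error
]
failed = [
  FailedExecution.new(job_id: 1, exception_class: PRUNED_ERROR),
  FailedExecution.new(job_id: 2, exception_class: PRUNED_ERROR),
  FailedExecution.new(job_id: 3, exception_class: "RuntimeError")
]

suspicious = contradictory_jobs(jobs, failed)
```

With the sample rows, only job 1 is selected: job 2 looks like a genuine pruned failure, and job 3 failed for an unrelated reason. Any real job that matches is one worth capturing in full.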