Don't override job state when stopped

Open felixplajer opened this issue 6 years ago • 1 comments

Right now, when a job runner is stopped (as happens when the JR is shutting down and requests are being suspended), it overrides its job's return state, so the job can never return STATE_FAIL. I know the idea behind overriding STATE_FAIL with STATE_STOPPED was that some jobs probably just return failed when they should really be returning stopped, but with the override as it is, there's no way to indicate that something really did go wrong while stopping a job and that we don't want to retry it or resume the request.

Relevant code here: https://github.com/square/spincycle/blob/master/job-runner/runner/runner.go#L151-L156
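To make the problem concrete, here is a minimal sketch of the behavior described above next to one possible alternative. It is not the code from runner.go: the state constants and both function names are hypothetical stand-ins, and the unconditional override is an assumption based on the description in this issue.

```go
package main

import "fmt"

// Hypothetical stand-ins for the job states named in this issue.
const (
	StateComplete = "COMPLETE"
	StateFail     = "FAIL"
	StateStopped  = "STOPPED"
)

// overrideOnStop models the behavior described in the issue (assumed, not
// copied from runner.go): once the runner has been stopped, whatever state
// the job returned is replaced with StateStopped.
func overrideOnStop(jobState string, stopped bool) string {
	if stopped {
		return StateStopped
	}
	return jobState
}

// keepFailOnStop models one possible alternative: a failure reported by the
// job survives the stop, and every other state is overridden as before.
func keepFailOnStop(jobState string, stopped bool) string {
	if stopped && jobState != StateFail {
		return StateStopped
	}
	return jobState
}

func main() {
	// A job that genuinely failed while the runner was being stopped.
	fmt.Println("current:    ", overrideOnStop(StateFail, true))
	fmt.Println("alternative:", keepFailOnStop(StateFail, true))
}
```

With the first function the failure is hidden behind the stopped state; with the second it is passed through, which is the outcome this issue asks for.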

felixplajer • Jun 06 '19 17:06

Hmm, this is more complicated than I originally thought. Even if a job returns Failed when Stop is called, the job might still have some allowed retries left, so we would probably want to be able to suspend the job and then retry it on resume. However, if the runner does return the state as Failed instead of Stopped, then the reaper as it is now will either do a sequence retry or mark the whole request as failed.
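One way to picture the retry consideration is a rough sketch like the following. The helper name, its parameters, and the state constants are all hypothetical, and it does not touch the reaper at all; it only shows how remaining tries could decide whether a failure during Stop is reported as Stopped (so the job can be retried on resume) or as Failed.

```go
package main

import "fmt"

// Hypothetical stand-ins for the job states discussed in this thread.
const (
	StateFail    = "FAIL"
	StateStopped = "STOPPED"
)

// stateAfterStop is a hypothetical helper for a runner that has been
// stopped. If the job reported a failure but still has tries left, the
// state becomes StateStopped so the job could be retried on resume.
// Otherwise the job's own state is returned unchanged.
func stateAfterStop(jobState string, triesUsed, maxTries int) string {
	if jobState == StateFail && triesUsed < maxTries {
		return StateStopped
	}
	return jobState
}

func main() {
	// Failed on try 1 of 3: tries remain, so report stopped.
	fmt.Println(stateAfterStop(StateFail, 1, 3))
	// Failed on try 3 of 3: no tries remain, so the failure stands.
	fmt.Println(stateAfterStop(StateFail, 3, 3))
}
```

This still leaves open the case from the original report, where a job wants to signal a failure that should not be retried even though tries remain; that would likely need a separate signal from the job.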

felixplajer • Jul 24 '19 19:07