Modifying exponential backoff for failures

Open aschmois opened this issue 4 years ago • 4 comments

Our team is giving odd-jobs a try and we're loving it! The only current concern we have is that we would like to avoid exponential backoff for failures since we only deal with transient failures. Would it be possible to add a config to modify it, such as adding a limit to the time added, or removing it altogether?

This is the line we are concerned with: https://github.com/saurabhnanda/odd-jobs/blob/b9649f5d524e7e9650e9f573c1b85e8b094acced/src/OddJobs/Job.hs#L408

I could open a PR to add a config, but it looks like the roadmap already plans to do something with failures per job, and I don't want to step on any toes.
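To make the request concrete, here is a minimal sketch of the three behaviours mentioned above: uncapped exponential backoff, exponential backoff with a limit, and no backoff at all. It assumes the current delay has the form 2^attempts seconds, which this thread does not confirm. `BackoffPolicy`, `retryDelay`, and `nextRunAt` are hypothetical names and not part of odd-jobs.

```haskell
import Data.Time (NominalDiffTime, UTCTime, addUTCTime)

-- | Hypothetical retry policies. None of these names exist in odd-jobs.
data BackoffPolicy
  = Exponential                        -- ^ 2^attempts seconds, uncapped (assumed current behaviour)
  | CappedExponential NominalDiffTime  -- ^ 2^attempts seconds, but never more than the cap
  | Constant NominalDiffTime           -- ^ the same delay before every retry

-- | Delay before the next attempt, given the number of attempts made so far.
retryDelay :: BackoffPolicy -> Int -> NominalDiffTime
retryDelay policy attempts = case policy of
  Exponential           -> expDelay
  CappedExponential cap -> min cap expDelay
  Constant delay        -> delay
  where
    -- Integer avoids overflow for large attempt counts; negative counts are treated as 0.
    expDelay = fromIntegral ((2 :: Integer) ^ max 0 attempts)

-- | Time at which the job should next run, relative to the given "now".
nextRunAt :: BackoffPolicy -> Int -> UTCTime -> UTCTime
nextRunAt policy attempts now = addUTCTime (retryDelay policy attempts) now
```

With a 60-second cap, the tenth retry waits 60 seconds instead of 1024, and `Constant 5` retries every 5 seconds regardless of the attempt count.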

aschmois • Oct 12 '20 19:10

Hey @aschmois, great to know that you're planning to use odd-jobs. Do comment at https://github.com/saurabhnanda/odd-jobs/issues/44 when your implementation makes it to production.

> The only current concern we have is that we would like to avoid exponential backoff for failures since we only deal with transient failures. Would it be possible to add a config to modify it, such as adding a limit to the time added, or removing it altogether?

In a separate discussion thread, we have already established a need for evolving cfgJobRunner so that it can tell odd-jobs how to re-queue or retry the job.

Along similar lines, I can see two ways to handle your feature request:

  1. Add a new function, cfgCalculateNextRunAt :: Job -> IO UTCTime, and use it at the appropriate place.
  2. Or change cfgOnJobFailed :: JobErrHandler NextJobAction, where NextJobAction is the same type returned by cfgJobRunner :: Job -> IO NextJobAction, and could look something like:
     data NextJobAction = ActionSuccess | ActionFailed | ActionRetry UTCTime | ...

The second approach can be used to build cron-like functionality, as well.
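As a rough illustration of the second approach, the sketch below defines `NextJobAction` with only the three constructors named above (the trailing `...` is left open) and shows how a handler might choose between them. `onTransientFailure`, `rescheduleEvery`, and `exampleNow` are hypothetical helpers, and treating `ActionRetry` after a successful run as a cron-like reschedule is an assumption about how odd-jobs would interpret it.

```haskell
import Data.Time (NominalDiffTime, UTCTime (..), addUTCTime, fromGregorian)

-- | The constructors proposed in this thread; the real type may grow more.
data NextJobAction
  = ActionSuccess
  | ActionFailed
  | ActionRetry UTCTime
  deriving (Eq, Show)

-- | Hypothetical failure policy: retry after a fixed delay (no exponential
-- backoff) until the attempt limit is reached, then give up.
onTransientFailure
  :: Int              -- ^ maximum number of attempts
  -> NominalDiffTime  -- ^ fixed delay between attempts
  -> Int              -- ^ attempts made so far
  -> UTCTime          -- ^ current time
  -> NextJobAction
onTransientFailure maxAttempts delay attempts now
  | attempts >= maxAttempts = ActionFailed
  | otherwise               = ActionRetry (addUTCTime delay now)

-- | Hypothetical cron-like helper: after a run finishes, ask for the job to
-- run again one interval later (assumes ActionRetry is honoured on success).
rescheduleEvery :: NominalDiffTime -> UTCTime -> NextJobAction
rescheduleEvery interval finishedAt = ActionRetry (addUTCTime interval finishedAt)

-- | Fixed timestamp used only to make the example deterministic.
exampleNow :: UTCTime
exampleNow = UTCTime (fromGregorian 2020 10 13) 0
```

For example, `onTransientFailure 5 30 2 exampleNow` asks for a retry 30 seconds after `exampleNow`, while the same call with 5 attempts already made returns `ActionFailed`. A job runner could end with `rescheduleEvery 3600 finishedAt` to run roughly once an hour.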

What're your thoughts?

saurabhnanda • Oct 13 '20 07:10

This is a good feature to have. I've added it to the roadmap.

saurabhnanda • Oct 13 '20 07:10

@aschmois any thoughts on my previous comments?

saurabhnanda • Oct 16 '20 17:10

@saurabhnanda I apologize. We discussed it as a team, and we'd like to work on this. I don't know exactly when we'll get some free time, but I assume it'll be soon! I think I'd like to take a look at the NextJobAction approach, since it would also help with some of our other oddities, like a job scheduling itself again after it completes (cron-like jobs).

aschmois • Oct 16 '20 17:10