Add support for explicitly retrying jobs

Open hkrutzer opened this issue 6 years ago • 3 comments

We are using Exq with jobs that often fail because of factors outside our control. Sometimes a job needs to be retried 30 times over a long period. However, every failed job produces a log statement for a crashed process. These logs are undesirable because the failure is expected, and they pollute the log files. This patch contains a proposal to allow jobs to flag themselves as failed by returning a tuple of a certain shape. Let me know what you think. I am also open to other options that produce the same result :)
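
For illustration, here is a minimal sketch of what a worker might look like under this kind of proposal. The module name, the `fetch/1` stand-in, and the `{:error, reason}` return convention are assumptions made for the example; they are not taken from the patch, whose exact tuple shape is not shown in this thread.

```elixir
# Hypothetical worker: FlakyWorker, fetch/1, and the {:error, reason}
# return convention are illustrative assumptions, not the patch's API.
defmodule FlakyWorker do
  # Exq invokes perform/N with the arguments the job was enqueued with.
  def perform(url) do
    case fetch(url) do
      {:ok, _body} ->
        :ok

      # Expected failure: hand it back as a value instead of crashing,
      # so that (under the proposal) the job is retried without a crash log.
      {:error, reason} ->
        {:error, reason}
    end
  end

  # Stand-in for a call to an unreliable external service.
  defp fetch("https://up.example"), do: {:ok, "body"}
  defp fetch(_url), do: {:error, :unavailable}
end
```

The point of the sketch is only that the expected failure travels as a return value, so no process has to crash for the job to be marked as failed.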

hkrutzer, Oct 31 '19 16:10

Coverage decreased (-0.06%) to 90.976% when pulling ac9bfefa49f1669af35d8d0966ee23e2b5e3ec78 on hkrutzer:explicit-retry into 178787638d897f0216f85590a35f01bcbc9f95bc on akira:master.

coveralls, Oct 31 '19 16:10

We have a similar use case, and we raise ExpectedError from within the worker. We are OK with sending those logs to Kibana/Graylog, but not OK with sending them to an error reporting service like Raygun.

The way we handle it right now is to filter the errors that are sent to Raygun by implementing a custom logger backend. I would expect similar situations to arise in other places as well, where what might be considered an error by one user might not be considered an error by another user.

IMHO, as a library, we can't try to handle all these extra cases and these are better handled at the logger layer.

ananthakumaran, Nov 01 '19 04:11

I see where you are coming from, and I agree the library can't handle every possible feature. However, I feel that it is within the Elixir way of doing things to support an {:error, _} tuple (so perhaps not :retry). Also, retrying in Exq when an error tuple is returned is far less complex than writing an entire logger backend. Of course one could fork an existing backend and modify it, but then the fork needs to be maintained, etc. Furthermore, we are looking at Sentry, which has its own custom logger backend, so we would need to customize two loggers.

> I would expect similar situations to arise in other places as well, where what might be considered an error by one user might not be considered an error by another user.

While I agree from a theoretical point of view, I don't think this actually happens that often (at least I haven't encountered it), because in most situations one can wrap the code in a try/rescue block or, going the other way, raise when :error is returned.
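
As a rough sketch of those two conversions (the module and function names are hypothetical, chosen only for this example):

```elixir
# Hypothetical helpers for illustration only.
defmodule ErrorStyles do
  # Exception -> tuple: run code that may raise and return a value instead.
  def to_tuple(fun) do
    {:ok, fun.()}
  rescue
    exception -> {:error, exception}
  end

  # Tuple -> exception: unwrap success, raise on an error result.
  def to_raise({:ok, value}), do: value
  def to_raise({:error, reason}), do: raise("job failed: #{inspect(reason)}")
  def to_raise(:error), do: raise("job failed")
end
```

For example, `ErrorStyles.to_tuple(fn -> raise "boom" end)` yields an `{:error, %RuntimeError{}}` tuple, while `ErrorStyles.to_raise({:error, :timeout})` raises a `RuntimeError`.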

hkrutzer, Nov 12 '19 15:11