convoy
Event delivery retries stop working after some time
I'm experiencing a strange behaviour when a delivery attempt fails.
Convoy correctly handles the retry mechanism (exponential backoff) that is set up for the endpoint, but it seems that after a few retries the job stops working and the last scheduled attempt is never picked up. The result is a retry event whose "next attempt" date and time is in the past.
I'm currently using Convoy v23.06.1 but the same happened with the previous version.
Hey @achiarenza 👋🏿
Hmm, this might be a bug with the exponential backoff. I'll take a look at it.
Can you please help me with the steps to reproduce this?
Hello @jirevwe, for sure!
I'm currently using the docker compose file the repo provides to spin up Convoy, inside an Ubuntu 20.04.6 box.
Docker is version 23.0.5, build bc4487a.
Project settings are configured as you can see in the screenshot:
All the other configuration values are left as default.
While trying to fix the issue, I scaled the docker worker up to more than one instance with docker compose up --scale worker=2 -d
but the problem persisted.
Let me know if you need some other info.
Thanks for the info,
The exponential back-off strategy uses the values from the table below (in milliseconds), which go from 10 seconds to 15 minutes. All retries after the 7th will be about 15 minutes apart.
10000 // 10 seconds
30000 // 30 seconds
60000 // 1 minute
180000 // 3 minutes
300000 // 5 minutes
600000 // 10 minutes
900000 // 15 minutes
This might make a strategy with a 20-retry limit take between 3 and 4 hours to reach the failure state. Can you please share the worker logs, so I can debug further?
In the meantime, can you re-test it with a smaller retry limit (about 5 to 10)? I can't seem to reproduce this.
The full docker log: worker.log with some info redacted.