machinery
Infinite Fibonacci retry on broker connection failure
When the machinery worker fails to connect to the broker because the broker credentials have been renewed, it logs the following:
amqp.go:152 Error occurred on queue: foo-machinery-task. Reconnecting
worker.go:84 Broker failed with error: ..."
After this, machinery's retry mechanism retries the connection at intervals that follow the Fibonacci series, and every failed attempt logs:
worker.go:84 Broker failed with error: ..."
retry.go:20 Retrying in 5 seconds
It kept retrying for hours until someone noticed it manually.
I have the following questions about this mechanism. It would be great if the maintainers or community could suggest how to approach them.
- We want to be alerted on broker connection failures rather than have machinery retry forever. Is there any workaround to limit the number of retries? Currently it retries indefinitely at growing Fibonacci intervals. I personally feel that something like `RetryCount` for broker connection failures, which we already have for task failures, would really help.
- Is there any built-in way to restart the master machinery server instance once the number of broker connection retries goes beyond a certain limit? Restarting the server would make machinery pick up the newer broker credentials and so prevent further broker connection failures.
Thanks for this great project and all the work put into it. Really excited for v2 as well.