Handle initial connection failure

Open • SukhikhN opened this issue 2 years ago • 1 comment

I start wolff producers during app initialization with wolff:ensure_supervised_producers/3, but the call fails if my Kafka server is unavailable. I need the producers to start anyway, so that they queue messages via replayq and replay them once the connection is established, just as they do when the connection to Kafka is lost after the app has started.

Is there any way to achieve this with wolff?

I see that in previous versions producers were kept running after initial connection failure, but this commit https://github.com/kafka4beam/wolff/commit/e45b4ed9ff03d52ac23ef61beeba532ef5ba9a39 changed this behavior, and I don't fully understand why.

SukhikhN, Oct 06 '22 12:10

Hi @SukhikhN. The challenge might be that, if Kafka is not available, there is no way to know the number of partitions etc., so the producers cannot be started. The commit https://github.com/kafka4beam/wolff/commit/e45b4ed9ff03d52ac23ef61beeba532ef5ba9a39 changed the behavior so that producers are deleted if they fail to initialize. Before this commit, the producer processes would keep waiting for initialization to complete and then hopefully start working once Kafka was back. However, the old behavior does not solve your problem either, because replayq would not be ready until Kafka is back and the producers are initialized.

To make it more resilient to Kafka failures, we would need to detach the replayq buffer from the producer processes (so that it does not depend on the number of partitions etc.), which seems to be quite a big refactoring.
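
Until such a refactoring exists, one possible application-side workaround is to retry the start call until Kafka becomes reachable. The sketch below is only an illustration, not part of wolff: the module name `producer_start_retry` and the function `start_with_retry/3` are hypothetical, and it assumes the start call returns `{ok, Producers}` or `{error, Reason}` rather than raising. It does not buffer anything, so messages produced before the start succeeds still have to be held or dropped by the caller.

```erlang
-module(producer_start_retry).
-export([start_with_retry/3]).

%% Hypothetical helper, not part of wolff.
%% StartFun is a zero-arity fun that is assumed to return
%% {ok, Producers} | {error, Reason}, for example:
%%   fun() -> wolff:ensure_supervised_producers(ClientId, Topic, ProducerCfg) end
%% where ClientId, Topic and ProducerCfg are placeholders for your own values.
-spec start_with_retry(fun(() -> {ok, term()} | {error, term()}),
                       pos_integer(), non_neg_integer()) ->
          {ok, term()} | {error, term()}.
start_with_retry(StartFun, Attempts, DelayMs) when Attempts > 0 ->
    case StartFun() of
        {ok, Producers} ->
            {ok, Producers};
        {error, _Reason} when Attempts > 1 ->
            %% Attempts remain: wait a fixed delay, then try again.
            timer:sleep(DelayMs),
            start_with_retry(StartFun, Attempts - 1, DelayMs);
        {error, Reason} ->
            %% Last attempt failed: report the final error to the caller.
            {error, Reason}
    end.
```

Running this from a separate process rather than from the application start callback would keep the rest of the app from blocking while Kafka is down. A fixed delay is used here only for brevity.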

zmstone, Aug 02 '23 14:08