
Runtime crashes if Kafka subscriptions cannot be created

Open tillrohrmann opened this issue 3 months ago • 1 comment

Currently, the runtime crashes if the Kafka subscription controller cannot start or update a subscription. This can happen if the Kafka cluster information provided by the configuration is no longer available after the subscription has been created. The crash is reported as the worker no longer being reachable, which is very confusing.

The underlying problem is that we don't persist the Kafka cluster information so that it can change/disappear across server restarts.

A short-term fix could be to make the error more expressive.
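To illustrate what a more expressive error might look like, here is a minimal sketch. It is not taken from the Restate codebase: `SubscriptionError`, `ClusterConfig`, and `resolve_cluster` are hypothetical names, and the assumption is that the subscription refers to its Kafka cluster by name and that the configured clusters can be looked up in a map.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type for illustration; not Restate's actual error type.
#[derive(Debug, PartialEq)]
pub enum SubscriptionError {
    /// The subscription refers to a cluster that the current configuration
    /// does not define (e.g. it was removed between server restarts).
    UnknownCluster {
        subscription_id: String,
        cluster: String,
        configured: Vec<String>,
    },
}

impl fmt::Display for SubscriptionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SubscriptionError::UnknownCluster {
                subscription_id,
                cluster,
                configured,
            } => write!(
                f,
                "cannot start subscription '{}': Kafka cluster '{}' is not present \
                 in the configuration (configured clusters: [{}])",
                subscription_id,
                cluster,
                configured.join(", ")
            ),
        }
    }
}

impl std::error::Error for SubscriptionError {}

/// Hypothetical stand-in for the per-cluster settings read from the config.
#[derive(Debug, Clone, PartialEq)]
pub struct ClusterConfig {
    pub brokers: Vec<String>,
}

/// Looks up the cluster a subscription refers to and returns a descriptive
/// error instead of failing opaquely when the cluster is missing.
pub fn resolve_cluster<'a>(
    clusters: &'a HashMap<String, ClusterConfig>,
    subscription_id: &str,
    cluster: &str,
) -> Result<&'a ClusterConfig, SubscriptionError> {
    clusters.get(cluster).ok_or_else(|| {
        // Sort the names so the message is stable across runs.
        let mut configured: Vec<String> = clusters.keys().cloned().collect();
        configured.sort();
        SubscriptionError::UnknownCluster {
            subscription_id: subscription_id.to_owned(),
            cluster: cluster.to_owned(),
            configured,
        }
    })
}
```

With an error like this, the message would name both the affected subscription and the missing cluster, rather than surfacing only as an unreachable worker.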

tillrohrmann • Nov 10 '25 15:11

> The underlying problem is that we don't persist the Kafka cluster information so that it can change/disappear across server restarts.

This needs to be specced out. The Kafka cluster config contains some secrets, and putting secrets in a config file or environment variables lets people use the "secret storage" mechanisms offered by the various deployment platforms. Even if we move some of this into the metadata/schema registry (there was something related here, by the way: https://github.com/restatedev/restate/issues/964), we would still need some mechanism to gather part of the configuration from the environment or from files, I suppose.

slinkydeveloper • Nov 10 '25 15:11