karapace
Forwarding from secondary responds 500 if primary node is unreachable.
What did you expect to happen?
Retries by Karapace and, if the request still fails, ultimately a 502 or 503 depending on the condition.
What else do we need to know?
The error happened on a three-node cluster. The lowest node was selected as group coordinator but was not reachable; in this case port 8081 was not open. A write operation to set the global config failed when it was sent to a secondary node and the secondary tried to forward it to the primary.
Do we have ideas on how this case, where the primary node is unreachable, can persist in normal operations? I'm thinking of how long the forwarding can be retried. It's also not great REST API behaviour if the call blocks for e.g. >10 secs because of these retries.
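
To make the trade-off concrete, here is a minimal sketch of bounded forwarding retries. It is not Karapace's actual code: `forward_with_retries`, the `send` callable, `BadUpstreamResponse` and the default limits are all hypothetical names and values chosen for illustration. The assumption is that connection failures are retried with exponential backoff under an overall time budget and end in a 503, while an unusable reply from the primary maps to a 502 without retrying.

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status, body); simplified for the sketch


class BadUpstreamResponse(Exception):
    """Hypothetical: the primary answered, but the reply was unusable."""


def forward_with_retries(
    send: Callable[[], Response],
    *,
    max_attempts: int = 3,          # assumed default, not a Karapace setting
    deadline_seconds: float = 5.0,  # overall budget so the caller never blocks long
    backoff_seconds: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Response:
    """Call send() until it succeeds, attempts run out, or the budget is spent."""
    start = clock()
    for attempt in range(max_attempts):
        try:
            return send()
        except BadUpstreamResponse:
            # The primary is reachable but misbehaving; retrying is unlikely to help.
            return 502, "bad response from primary"
        except ConnectionError:
            # Covers refused connections, e.g. the primary's port is not open.
            pass
        delay = backoff_seconds * (2 ** attempt)
        out_of_attempts = attempt == max_attempts - 1
        out_of_time = (clock() - start) + delay > deadline_seconds
        if out_of_attempts or out_of_time:
            break
        sleep(delay)
    return 503, "primary unreachable"
```

With a budget like this the worst-case latency is bounded by `deadline_seconds` plus the timeout of the last attempt, so the per-attempt connect timeout would need to be kept short as well.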