
Forwarding from secondary responds 500 if primary node is unreachable.

Open jjaakola-aiven opened this issue 2 years ago • 1 comments

What did you expect to happen?

Karapace should retry the forward and, if the request still fails, ultimately respond with 502, 503 or 504, depending on the failure condition.
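As a rough illustration of "depending on the condition", here is a minimal sketch of one possible mapping from forwarding failures to gateway-style status codes. It is not Karapace's actual code: the function name `status_for_forward_failure`, the `NoPrimaryError` exception, and the specific mapping are all assumptions made for the example.

```python
from http import HTTPStatus


class NoPrimaryError(Exception):
    """Hypothetical: raised when no primary is currently known or elected."""


def status_for_forward_failure(exc: Exception) -> HTTPStatus:
    """Pick a response status for a failed forward to the primary (assumed mapping)."""
    if isinstance(exc, TimeoutError):
        # The primary did not answer in time.
        return HTTPStatus.GATEWAY_TIMEOUT  # 504
    if isinstance(exc, NoPrimaryError):
        # There is nowhere to forward to right now; the client may retry later.
        return HTTPStatus.SERVICE_UNAVAILABLE  # 503
    if isinstance(exc, ConnectionError):
        # The primary is known but the connection failed, e.g. port closed.
        return HTTPStatus.BAD_GATEWAY  # 502
    # Anything else is a genuine internal error.
    return HTTPStatus.INTERNAL_SERVER_ERROR  # 500


# A refused connection (closed port on the primary) maps to 502 in this sketch.
print(status_for_forward_failure(ConnectionRefusedError("port closed")).value)
```

The point of the sketch is only that an unreachable primary is an upstream problem, so a 5xx gateway code describes it better than a plain 500.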

What else do we need to know?

The error happened on a three-node cluster. The lowest node was selected as group coordinator but was not reachable; in this case port 8081 was not open. A write operation to set the global config failed when it was sent to a secondary node and the secondary tried to forward it to the primary.

jjaakola-aiven, Nov 22 '22 09:11

Do we have ideas on how this case, where the primary node is unreachable, can persist in normal operations? I'm thinking about how long the forwarding can be retried. It's not great REST API behaviour either if the call blocks for e.g. >10 secs because of these retries.
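To make the trade-off concrete, here is a minimal sketch of retrying under a fixed time budget, so that the client-facing call cannot block much longer than the budget. Everything in it is an assumption for illustration (the name `forward_with_deadline`, the default budget and delays, and retrying only on `OSError`); it is not how Karapace forwards requests.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def forward_with_deadline(
    send: Callable[[], T],
    *,
    deadline_s: float = 5.0,     # assumed total retry budget
    base_delay_s: float = 0.1,   # assumed first backoff delay
    max_delay_s: float = 1.0,    # assumed cap on the backoff delay
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send() until it succeeds or the retry budget is used up."""
    start = clock()
    delay = base_delay_s
    while True:
        try:
            return send()
        except OSError:
            # ConnectionError and TimeoutError are both subclasses of OSError.
            remaining = deadline_s - (clock() - start)
            if remaining <= delay:
                # Sleeping again would exceed the budget: give up and let the
                # caller translate the error into a response.
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay_s)
```

The budget only bounds the waiting between attempts; `send` itself would still need its own per-attempt timeout, otherwise a single hung connection could block longer than `deadline_s`.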

juha-aiven, Nov 23 '22 07:11