quickwit
Retry strategy for gRPC clients
Exponential backoff + jitter + round robin on server pool
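To make the proposal concrete, here is a minimal sketch of the two building blocks: a capped exponential delay with "full jitter", and round-robin selection over a server pool. All names (`RetryParams`, `backoff_delay`, `RoundRobinPool`) are hypothetical and not taken from the quickwit codebase. The jitter factor is passed in by the caller so the sketch needs no RNG crate.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::Duration;

/// Hypothetical retry parameters; names and values are illustrative only.
#[derive(Clone, Copy, Debug)]
pub struct RetryParams {
    pub base_delay: Duration,
    pub max_delay: Duration,
}

/// Delay to wait before retry number `attempt` (1-based).
/// The exponential delay `base_delay * 2^(attempt - 1)` is capped at
/// `max_delay`, then scaled by `jitter`, a factor in [0.0, 1.0] that the
/// caller is assumed to draw from a random source.
pub fn backoff_delay(params: &RetryParams, attempt: u32, jitter: f64) -> Duration {
    let exponent = attempt.saturating_sub(1).min(31);
    let capped = params
        .base_delay
        .saturating_mul(1u32 << exponent)
        .min(params.max_delay);
    // A NaN factor ends up as a zero delay because `as u64` maps NaN to 0.
    let nanos = (capped.as_nanos() as f64 * jitter.clamp(0.0, 1.0)) as u64;
    Duration::from_nanos(nanos)
}

/// Round-robin selection over a fixed, non-empty pool of servers.
pub struct RoundRobinPool<T> {
    servers: Vec<T>,
    next: AtomicUsize,
}

impl<T> RoundRobinPool<T> {
    /// Returns `None` when the pool is empty, since there is nothing to pick.
    pub fn new(servers: Vec<T>) -> Option<Self> {
        if servers.is_empty() {
            return None;
        }
        Some(Self { servers, next: AtomicUsize::new(0) })
    }

    /// Returns the next server, wrapping around at the end of the pool.
    pub fn pick(&self) -> &T {
        let index = self.next.fetch_add(1, Ordering::Relaxed) % self.servers.len();
        &self.servers[index]
    }
}
```

Each retry would call `pick()` to move on to the next server and sleep for `backoff_delay(...)` first, so a single unhealthy node is not hit repeatedly.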
Not all endpoints are idempotent, so we need to work on that too.
Ideally, I would like to see the retry logic live at the tonic level. We also need to take into account that some gRPC clients use a tower balance service.
@guilload Can you complete this ticket?
The retry work in tower is not going to land anytime soon, so we need a workaround in the meantime. Let's implement a RetryingMetastore that wraps a metastore and applies the usual retry logic: exponential backoff with jitter. Let's replicate or reuse what we have in quickwit-aws.
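A rough sketch of what such a wrapper could look like, under loud assumptions: the `Metastore` trait below is a simplified, synchronous, single-method stand-in (not the real quickwit trait), `MetastoreError` and its retryable/non-retryable split are hypothetical, and the jitter source and sleep function are injected as closures so the example stays free of external crates. It does not reflect what quickwit-aws or #2335 actually do.

```rust
use std::time::Duration;

/// Hypothetical error type: only `Unavailable` is treated as transient.
#[derive(Debug, PartialEq)]
pub enum MetastoreError {
    Unavailable(String),
    NotFound(String),
}

impl MetastoreError {
    fn is_retryable(&self) -> bool {
        matches!(self, MetastoreError::Unavailable(_))
    }
}

/// Simplified, synchronous stand-in for a metastore client (assumption).
pub trait Metastore {
    fn list_indexes(&mut self) -> Result<Vec<String>, MetastoreError>;
}

/// Wraps a metastore and retries transient failures with capped
/// exponential backoff and jitter.
pub struct RetryingMetastore<M, J, S> {
    pub inner: M,
    pub max_attempts: u32,
    pub base_delay: Duration,
    pub max_delay: Duration,
    /// Returns a jitter factor, expected in [0.0, 1.0].
    pub jitter: J,
    /// Blocks for the given duration (for example `std::thread::sleep`).
    pub sleep: S,
}

impl<M, J, S> Metastore for RetryingMetastore<M, J, S>
where
    M: Metastore,
    J: FnMut() -> f64,
    S: FnMut(Duration),
{
    fn list_indexes(&mut self) -> Result<Vec<String>, MetastoreError> {
        let mut attempt = 1u32;
        loop {
            match self.inner.list_indexes() {
                Ok(indexes) => return Ok(indexes),
                // Retry only transient errors, and only while attempts remain.
                Err(error) if error.is_retryable() && attempt < self.max_attempts => {
                    let exponent = (attempt - 1).min(31);
                    let capped = self
                        .base_delay
                        .saturating_mul(1u32 << exponent)
                        .min(self.max_delay);
                    let factor = (self.jitter)().clamp(0.0, 1.0);
                    let nanos = (capped.as_nanos() as f64 * factor) as u64;
                    (self.sleep)(Duration::from_nanos(nanos));
                    attempt += 1;
                }
                // Permanent error, or retry budget exhausted.
                Err(error) => return Err(error),
            }
        }
    }
}

/// Test double that fails a fixed number of times before succeeding.
pub struct Flaky {
    pub failures_left: u32,
}

impl Metastore for Flaky {
    fn list_indexes(&mut self) -> Result<Vec<String>, MetastoreError> {
        if self.failures_left > 0 {
            self.failures_left -= 1;
            return Err(MetastoreError::Unavailable("connection refused".to_string()));
        }
        Ok(vec!["my-index".to_string()])
    }
}
```

Because not all endpoints are idempotent, a real wrapper would presumably retry only calls that are safe to repeat, or only errors that guarantee the request was never applied. The sketch models this with `is_retryable`.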
Closed via #2335.