mimir
mimir copied to clipboard
Query-frontend graceful shutdown
Is your feature request related to a problem? Please describe.
The query-frontend ought to shut down gracefully, e.g. flushing in-flight queries first.
Describe the solution you'd like
For the query-frontend to react to SIGTERM, by first marking itself as not ready and then proceeding with its shutdown procedure (flushing in-flight queries, what else @pracucci?).
Additional context
Not shutting down gracefully is currently a problem wrt. restarting instances during e.g. rollouts, but it's becoming more relevant as instances will be terminated more frequently as we implement query-frontend auto-scaling
by first marking itself as not ready
This should be already covered by -shutdown-delay:
How long to wait between SIGTERM and shutdown. After receiving SIGTERM, Mimir will report not-ready status via /ready endpoint.
The idea is that we should configure -shutdown-delay to a reasonable value for query-frontend. When configured, the query-frontend will start failing the /ready endpoint right after the SIGTERM, but will wait the <delay> time before actually shutting down internal components. This <delay> should give time to K8S service to update the set of addresses of healthy query-frontends and clients (e.g. gateway) to discover the updated set of addresses.
flushing in-flight queries, what else @pracucci?
Just to be on the same page, we shouldn't flush in-flight queries but wait until they're done. This in addition to -shutdown-delay should be enough (but as usual we should test it and make sure we're not missing anything).
in-flight queries but wait until they're done
I checked the code and, if I'm not missing anything, I agree with you that we currently don't wait until in-flight queries are completed before terminating the query-frontend.
I looked at query-frontend v2, which is the one used with query-scheduler (what we run at Grafana Labs). Frontend.stopping() stop scheduler workers and waits until it terminated:
https://github.com/grafana/mimir/blob/5b529d27c4a0bfb0a4e4108b4376df2a855c483c/pkg/frontend/v2/frontend.go#L178-L180
The frontendSchedulerWorkers.stopping() iterates over each worker and call their stop():
https://github.com/grafana/mimir/blob/5b529d27c4a0bfb0a4e4108b4376df2a855c483c/pkg/frontend/v2/frontend_scheduler_worker.go#L102-L104
Then frontendSchedulerWorker.stop() cancels the worker context and wait until the worker has done:
https://github.com/grafana/mimir/blob/5b529d27c4a0bfb0a4e4108b4376df2a855c483c/pkg/frontend/v2/frontend_scheduler_worker.go#L260-L262
However, cancelling the worker context should mean the inflight query gets canceled too and not waited until completed. I've address it in the querier worker and addressing it was pretty complicated. Luckily, due to how the query-frontend works, I think we don't have to handle it here but what we could easily do is to wait until Frontend.enqueuedRequests is empty in Frontend.stopping() (before stopping the scheduler workers).
Describe the solution you'd like
For the query-frontend to react to SIGTERM, by first marking itself as not ready and then proceeding with its shutdown procedure (flushing in-flight queries, what else @pracucci?).
The query-frontend already does this via the -shutdown-delay flag (which is set by default in our internal jsonnet and should be promoted to the OSS jsonnet and Helm). The flag marks the query-frontend as not ready, disables keep-alive, and continues to handle requests for N seconds.
@pracucci I agree with your analysis on how the query-frontend stops, and doesn't wait for in-flight queries to finish after the shutdown delay elapses.
Though, I have the question of whether the query-frontend should wait even longer than -shutdown-delay, i.e. until all in-flight queries have finished? Doesn't this go against the description of -shutdown-delay ("How long to wait between SIGTERM and shutdown"), since query-frontend will potentially wait even longer (depending on how long queries take to finish)? I interpret "How long to wait between SIGTERM and shutdown" as meaning the service will actually shut down once -shutdown-delay has elapsed, although I acknowledge it might not be the intention behind?
Though, I have the question of whether the query-frontend should wait even longer than -shutdown-delay, i.e. until all in-flight queries have finished?
I think so. We don't want a query to fail just because it took 60s and the shutdown delay is set to < 60s.
Doesn't this go against the description of -shutdown-delay ("How long to wait between SIGTERM and shutdown")
I don't think so. The shutdown delay is the delay before the shutdown procedure is started. The shutdown procedure may take some time. For example, in the querier we wait for inflight queries to be completed in the shutdown procedure and this may take a while.
Thanks @pracucci, you confirm my suspicion that "shutdown" might be intended as non-immediate. Good to know it's in fact like this in the querier (waiting on queries to finish).
I can try to make the query-frontend wait on remaining queries to finish.