quickwit
Limit search requests concurrency
Reject search requests with TooManyRequests when too many requests are in flight. This should be done at the root search level.
I think we tend to use the following terminology:
- concurrency limiting: process only n requests at a time, and have the others wait in a queue.
- load shedding: return an error right away if too many clients are already waiting.
and we usually use them in combination.
Here, I think @guilload wants load shedding (we already have concurrency limiting, at least at the leaf search split level).
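To make the distinction concrete, here is a minimal std-only sketch of the two mechanisms used together. It is not Quickwit's actual code: `SearchLimiter`, `Permit`, `max_in_flight`, and `max_waiting` are hypothetical names, and a real root searcher would most likely use an async semaphore rather than a blocking `Condvar`. Only the `TooManyRequests` name comes from the issue description.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Error returned when the wait queue is already full (load shedding).
#[derive(Debug, PartialEq, Eq)]
pub struct TooManyRequests;

#[derive(Default)]
struct State {
    in_flight: usize, // requests currently being processed
    waiting: usize,   // requests queued for a free slot
}

struct Inner {
    max_in_flight: usize, // hypothetical knob: concurrency limit
    max_waiting: usize,   // hypothetical knob: queue depth before shedding
    state: Mutex<State>,
    slot_freed: Condvar,
}

/// Hypothetical limiter combining concurrency limiting and load shedding.
#[derive(Clone)]
pub struct SearchLimiter {
    inner: Arc<Inner>,
}

/// Held for the duration of a request; frees its slot when dropped.
pub struct Permit {
    inner: Arc<Inner>,
}

impl SearchLimiter {
    pub fn new(max_in_flight: usize, max_waiting: usize) -> Self {
        SearchLimiter {
            inner: Arc::new(Inner {
                max_in_flight,
                max_waiting,
                state: Mutex::new(State::default()),
                slot_freed: Condvar::new(),
            }),
        }
    }

    pub fn acquire(&self) -> Result<Permit, TooManyRequests> {
        let mut state = self.inner.state.lock().unwrap();
        if state.in_flight >= self.inner.max_in_flight {
            // Load shedding: fail right away if the queue is full.
            if state.waiting >= self.inner.max_waiting {
                return Err(TooManyRequests);
            }
            // Concurrency limiting: wait in the queue for a free slot.
            state.waiting += 1;
            while state.in_flight >= self.inner.max_in_flight {
                state = self.inner.slot_freed.wait(state).unwrap();
            }
            state.waiting -= 1;
        }
        state.in_flight += 1;
        Ok(Permit { inner: Arc::clone(&self.inner) })
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        let mut state = self.inner.state.lock().unwrap();
        state.in_flight -= 1;
        drop(state); // release the lock before waking a waiter
        self.inner.slot_freed.notify_one();
    }
}
```

With `SearchLimiter::new(1, 0)`, a second `acquire()` issued while the first permit is still held returns `Err(TooManyRequests)` immediately instead of queuing. Note that this sketch is not strictly fair: a newly arriving request can take a freed slot before a queued waiter wakes up.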