Allow retry without aborting previous request + race retries
### Background
When we retry in perron currently, the initial request (and every subsequent retry except the last one) gets aborted. Say `dropRequestAfter` is 450ms and the retry count is 2. If the response takes 500ms on each attempt, we get a total "requesting time" of ~1350ms (450 * 3) and still fail.
Imagine if instead we don't abort on retry and race the attempts. This would lead to:
- Same overall timing in the worst-case scenario
- Better utilisation of connections with keep-alive (we don't destroy sockets in this case)
- The final promise would still resolve with a response at the end of the day
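The racing idea could be sketched roughly like this. This is a minimal illustration, not perron's implementation; `raceRetries`, `makeAttempt`, and the spacing knob are hypothetical names:

```javascript
// Hypothetical sketch: spawn retries without aborting earlier attempts
// and let them race; the first fulfilled attempt wins.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function raceRetries(makeAttempt, { retries, retryAfter }) {
  const TIMER = Symbol("timer");
  const attempts = [];
  for (let i = 0; i <= retries; i++) {
    // Spawn attempt i without aborting the earlier ones.
    attempts.push(makeAttempt(i));
    // First fulfilled attempt wins; if every attempt so far has
    // rejected, retry immediately (or give up on the last round).
    const anyAttempt = Promise.any(attempts).catch((err) => {
      if (i === retries) throw err;
      return TIMER;
    });
    const racers = [anyAttempt];
    // Wait before spawning the next attempt, except after the last one.
    if (i < retries) racers.push(delay(retryAfter).then(() => TIMER));
    const winner = await Promise.race(racers);
    if (winner !== TIMER) return winner;
  }
}
```

For example, if the first attempt takes 120ms but a retry spawned at 50ms answers in 30ms, the promise resolves at ~80ms while the first attempt's socket quietly stays alive.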
### Unclear points
- If we resolve the promise via a race, the consumer might not know about the retries that were ignored during racing (https://github.com/zalando-incubator/perron/issues/72).
### Proposed changes
- [ ] Introduce a `dropAllRequestsAfter` option, which would allow specifying the max overall timing including retries (backwards compatible, we still keep `dropRequestAfter`)
From this point, we can go towards the following:
- [ ] If `requestOptions.dropRequestAfter` and `dropAllRequestsAfter` are provided together, the behaviour of `requestOptions.dropRequestAfter` will differ slightly from the default: no request will be dropped until the global `dropAllRequestsAfter` timer is reached, which means the first request gets more time to complete while retries are in flight, with each successive retry having less time to complete.
+ fewer option params
- `dropRequestAfter` behaviour changes and no longer matches its name
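Under this reading, a single global timer bounds the whole retry sequence and individual attempts lose together when it fires. A minimal sketch, assuming a hypothetical `withGlobalDeadline` helper (not a perron API):

```javascript
// Hypothetical sketch: `dropAllRequestsAfter` as one global timer
// around the whole retry sequence. No attempt is aborted early;
// everything is dropped together when the global timer fires.
function withGlobalDeadline(promise, dropAllRequestsAfter) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`dropped all requests after ${dropAllRequestsAfter}ms`));
    }, dropAllRequestsAfter);
  });
  // Clear the timer once the race settles so the process can exit promptly.
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}
```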
Or
- [ ] Introduce a `retryAfter` option for this dedicated functionality. With `dropAllRequestsAfter` and `retry` (number) you can fine-tune after which time you start retrying a request that took too long for some reason.
+ you can fine-tune your retry behaviour even with `dropRequestAfter` if you want
+ easier to manage complexity, as the `dropRequestAfter` option param stays coupled to the request
- introduces another option param
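The timing arithmetic under this proposal is easy to reason about. A sketch with hypothetical helper names (not perron code): attempt `i` is spawned `i * retryAfter` ms in, and each attempt is individually bounded by `dropRequestAfter`.

```javascript
// Hypothetical illustration of how the two timers interact:
// `retryAfter` decides when the next attempt starts,
// `dropRequestAfter` bounds each individual attempt.
function attemptStartTimes({ retries, retryAfter }) {
  // Attempt i is spawned at i * retryAfter (unless an earlier one already won).
  return Array.from({ length: retries + 1 }, (_, i) => i * retryAfter);
}

function worstCaseTotal({ retries, retryAfter, dropRequestAfter }) {
  // The last attempt starts at retries * retryAfter and may run for
  // the full per-request budget before it is dropped.
  return retries * retryAfter + dropRequestAfter;
}
```

For instance, `retries: 2, retryAfter: 200, dropRequestAfter: 450` gives attempts starting at 0, 200, and 400ms and a worst case of 850ms, instead of the ~1350ms the current abort-and-retry behaviour yields for the same per-request budget.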
Ping @DeTeam and @jeremycolin for questions/concerns
@shuhei @ruiaraujo @grassator feedback is appreciated ^_^
That's a very good idea!
The only concern is that the number of concurrent connections will be multiplied by the retry count when the server doesn't respond. If the HTTP agent doesn't limit the connection count (the default behaviour), it will be fine as long as the machine has enough resources (ports, file descriptors, memory, etc.)? If the HTTP agent limits the connection count, it may worsen head-of-line blocking.
I can't predict what's gonna happen though...
@shuhei the user controls the agent behaviour and therefore the connection limits. It would be nice to include your comment as a caveat for this strategy in the options, IMO.