Request/Response adaptive timeouts

Open rphmeier opened this issue 3 years ago • 2 comments

At the moment, the maximum timeout and response size are set once for the entire request/response protocol. However, the caller often has enough context when making a request to choose a maximum timeout and response size for that specific request. Making these configurable per request would let us write more sophisticated networking protocols.
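To make the idea concrete, here is a minimal sketch of per-request limits layered on protocol-wide defaults. The names `ProtocolLimits`, `RequestOverrides` and `effective` are hypothetical and are not Substrate's actual API. Treating the protocol-wide values as a hard ceiling is also an assumption of this sketch, not something decided in this issue.

```rust
use std::time::Duration;

/// Protocol-wide limits: one value shared by every request (today's behaviour).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ProtocolLimits {
    pub max_timeout: Duration,
    pub max_response_size: u64,
}

/// Hypothetical per-request overrides. `None` means "use the protocol default".
#[derive(Debug, Clone, Copy, Default)]
pub struct RequestOverrides {
    pub timeout: Option<Duration>,
    pub max_response_size: Option<u64>,
}

impl ProtocolLimits {
    /// Resolve the limits that apply to a single request.
    /// Assumption: an override may tighten a limit but never exceed the
    /// protocol-wide value, so it is clamped with `min`.
    pub fn effective(&self, overrides: RequestOverrides) -> ProtocolLimits {
        ProtocolLimits {
            max_timeout: overrides
                .timeout
                .map_or(self.max_timeout, |t| t.min(self.max_timeout)),
            max_response_size: overrides
                .max_response_size
                .map_or(self.max_response_size, |s| s.min(self.max_response_size)),
        }
    }
}
```

A caller that knows it is fetching something small could pass a short `timeout` and a small `max_response_size`, while requests without overrides keep the current behaviour.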

rphmeier avatar Sep 14 '22 00:09 rphmeier

What are the exact goals, as in why is the timeout important/needed? The timeout in Substrate is a hard cap: if it is hit, we cancel the existing download even if it has already reached 99%, wasting all the effort. The timeout can completely cripple a protocol if set too tight (no peer is able to provide the data within the timeout).

What we used in the past, e.g. in availability-recovery, is the notion of a soft timeout: when it is hit, we start additional parallel requests but leave the old ones running. Would that be applicable here?
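For illustration, the soft-timeout pattern could look roughly like the sketch below. It is not the availability-recovery code. Peers are simulated by a fixed latency each, requests run on plain threads, and `fetch_with_soft_timeout` is a hypothetical name. A hard cap is deliberately left out to keep the sketch short.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Ask peers one after another. If the requests in flight have not answered
/// within `soft_timeout`, start a request to the next peer but keep the
/// earlier ones running. Returns the index of the first peer that answers.
pub fn fetch_with_soft_timeout(
    peer_latencies: &[Duration],
    soft_timeout: Duration,
) -> Option<usize> {
    let (tx, rx) = mpsc::channel();
    for (idx, &latency) in peer_latencies.iter().enumerate() {
        let tx = tx.clone();
        thread::spawn(move || {
            thread::sleep(latency); // simulated network round trip
            let _ = tx.send(idx); // the receiver may be gone if another peer won
        });
        // Wait for *any* in-flight request, not only the newest one.
        match rx.recv_timeout(soft_timeout) {
            Ok(winner) => return Some(winner),
            Err(mpsc::RecvTimeoutError::Timeout) => continue, // soft timeout hit
            Err(mpsc::RecvTimeoutError::Disconnected) => return None,
        }
    }
    // Every peer has been asked. Drop our sender so that `recv` ends once
    // all workers have finished, then wait for whoever answers first.
    drop(tx);
    rx.recv().ok()
}
```

With latencies of 500 ms and 10 ms and a soft timeout of 100 ms, the second peer wins while the first request is simply left to finish on its own.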

eskimor avatar Sep 21 '22 09:09 eskimor

Depending on the exact requirements, another option is to play with queue sizes. For good distribution of load, if queue sizes on honest nodes are relatively small, we get an error immediately when the peer is under load and don't have to waste time waiting for the timeout. In this scenario the timeout mostly exists to limit the harm that malicious peers*) can do, hence we should be able to make it relatively generous.

*) and long latency between two particular peers.
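As a rough sketch of the small-queue idea, a bounded channel from the Rust standard library can stand in for a node's incoming request queue. `RequestQueue` and `Rejected` are hypothetical names and this is not the actual Substrate request handling. The point is only that a full queue rejects immediately instead of letting the requester run into its timeout.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Why a request was not accepted.
#[derive(Debug, PartialEq, Eq)]
pub enum Rejected {
    /// The queue is full: the peer is under load, try another peer now.
    Busy,
    /// The receiving side has shut down.
    Closed,
}

/// Hypothetical bounded queue of incoming request ids.
pub struct RequestQueue {
    tx: SyncSender<u32>,
}

impl RequestQueue {
    /// `capacity` should be at least 1; a capacity of 0 makes the standard
    /// library channel a rendezvous channel that only accepts a request
    /// while a receiver is actively waiting.
    pub fn new(capacity: usize) -> (Self, Receiver<u32>) {
        let (tx, rx) = sync_channel(capacity);
        (Self { tx }, rx)
    }

    /// Enqueue without blocking, failing fast when the queue is full.
    pub fn submit(&self, request_id: u32) -> Result<(), Rejected> {
        self.tx.try_send(request_id).map_err(|e| match e {
            TrySendError::Full(_) => Rejected::Busy,
            TrySendError::Disconnected(_) => Rejected::Closed,
        })
    }
}
```

A requester that gets `Rejected::Busy` can move on to another peer straight away, so the timeout only has to cover unresponsive or slow peers.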

eskimor avatar Sep 21 '22 09:09 eskimor

We could have subchunks in availability, complete with deeper Merkle proofs, if parablock size were ever problematic for downloads.

burdges avatar Sep 23 '22 01:09 burdges

Yes, we can build higher-level timeout logic on top. The main goal is to support things like exponential back-off on requests and attempts, and to start with low timeouts with certain peers and move to higher ones.
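As a minimal sketch of "start low and move higher", the timeout for each attempt could be computed like this. `timeout_for_attempt` is a hypothetical helper, and the doubling factor and the cap are assumptions made for illustration.

```rust
use std::time::Duration;

/// Timeout for the given zero-based attempt: `base * 2^attempt`, capped at `max`.
pub fn timeout_for_attempt(base: Duration, max: Duration, attempt: u32) -> Duration {
    // Saturate the factor so a large attempt count cannot overflow.
    let factor = 2u32.saturating_pow(attempt);
    // `checked_mul` returns `None` on overflow; fall back to the cap then.
    base.checked_mul(factor).unwrap_or(max).min(max)
}
```

With a base of 500 ms and a cap of 8 s, successive attempts would use 500 ms, 1 s, 2 s, 4 s and then 8 s for every later attempt. This only works if the low-level code accepts a timeout per request, which is what this issue asks for.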

rphmeier avatar Sep 23 '22 05:09 rphmeier

Ok, I was actually aiming at one level deeper. Why do we want that exponential back-off, starting with low timeouts?

eskimor avatar Sep 23 '22 12:09 eskimor

It may be useful in #5999, for instance. Exponential back-off or other back-offs are useful in general as a tool in networking protocols, so it's good to make sure the low-level code can support such things.

rphmeier avatar Sep 23 '22 21:09 rphmeier