Docs on BlockingClientBuilder::timeout() are misleading
The blocking client currently does not allow specifying a timeout for the entire request (i.e. until the request body is finished transferring), which means an established connection will hang indefinitely if the remote host doesn't stops replying. This leads to a resource leak, which may be used to cause denial of service.
While it's not necessary to set such timeout by default (most other HTTP clients don't), it must be possible to configure it for attohttpc to be usable for e.g. building reliable services that query a remiote API over HTTP, or for defending against DoS attacks where the remote host sends 1 byte per second, keeping the connection open indefinitely.
Consider an image hosting service like imgur.com: users enter URLs of images they want to host, and the HTTP client fetches the images and puts them into storage. If a user enters a URL to a malicious host that replies more often than the read timeout, but keeps replying indefinitely, this means that the connection will be open indefinitely. If an attacker keeps getting the service to open such connections, this will eventually lead to the exhaustion of RAM or network ports on the client.
Other blocking clients (e.g. attohttpc and ureq) allow setting such a timeout.
To clarify, the async client doesn't provide a timeout for the entire request either, but I suspect it should be possible to hand-roll it in the async case.
The blocking client builder has timeout, and blocking requests can override with a per-request timeout, which is applied over the whole request, from sending to receiving.
And yes, when async, it's very easy to wrap any future in something like tokio::time::timeout(future, duration).
Thanks for the quick response!
The timeout method on the blocking client builder is currently documented as follows:
Set a timeout for connect, read and write operations of a
Client. Default is 30 seconds. PassNoneto disable timeout.
I've interpreted it as a read timeout, i.e. that it would be reset every time new data is received from the server. Is that actually a timeout for the entire request? If so, perhaps the documentation should be clarified.
Hm, yea those comments are misleading. If I had to guess, I imagine it was written before async was a thing, and so it was just a setting passed to TcpStream::{connect_timeout, set_read_timeout, set_write_timeout}, and then we eventually repurposed it and never updated the docs.
Ah I see. I've renamed the issue to reflect that it's merely an issue with the documentation.
Is there a way to set the read timeout now? It can be useful e.g. when transferring large files, so a timeout for the entire request is not applicable, but you don't want to wait for 5 minutes if the connection cuts off mid-transfer.
Yes, we recently ran into this too. Something like having separate values for the server response timeout, and the read timeout of the data on the request (.text() or .json())
In our case, we do a join_all(requests), and then go on to read the text data. But if any of the requests time out, we won't be able to read the data on the valid request.
Perhaps this isn't best practice, we will refactor to read the streamed data within the promises we are joining on, but I wonder if this could be something in the repo.
There is now a separate read_timeout, but only on the async client. Adding it the blocking client is welcome.