aws-sdk-rust
aws-sdk-rust copied to clipboard
Make SO_NODELAY configurable (or default to true) as it massively reduces Lambda invocation latency
Describe the feature
The h2 library that backs this SDK seems to put HEADERS and DATA frames into separate TCP packets when payloads are large. That's fine, but in combination with Nagle's algorithm, it means that the second packet containing the DATA frame will not leave the client machine until an ACK is received for the packet containing the HEADERS frame. In my tests this appears to add substantial latency to Lambda invocations. Perhaps on the AWS side there is tcp delayed acknowledgement going on?
This can be resolved by setting SO_NODELAY (this can be set via a method on the hyper HttpConnector). In general this flag reduces latency at the expense of potentially more packets, but in this case it seems to make no difference to the number of packets, but massively reduces latency. In Go this flag defaults to true.
Benchmarks against a hello world Lambda
delays/delay time: [90.039 ms 91.792 ms 93.627 ms]
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) low severe
2 (2.00%) high mild
4 (4.00%) high severe
delays/nodelay time: [33.833 ms 34.452 ms 35.147 ms]
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe
Use Case
Reducing latency for AWS API calls
Proposed Solution
Either a new config variable, or simply default it to true.
Other Information
As a workaround, you can do this:
static HTTPS_NATIVE_ROOTS: Lazy<HttpsConnector<HttpConnector>> = Lazy::new(|| {
let mut http = HttpConnector::new();
// HttpConnector won't enforce scheme, but HttpsConnector will
http.enforce_http(false);
// Set SO_NODELAY, which we have found significantly improves Lambda invocation latency
http.set_nodelay(true);
hyper_rustls::HttpsConnectorBuilder::new()
.with_tls_config(
rustls::ClientConfig::builder()
.with_cipher_suites(&[
// TLS1.3 suites
rustls::cipher_suite::TLS13_AES_256_GCM_SHA384,
rustls::cipher_suite::TLS13_AES_128_GCM_SHA256,
// TLS1.2 suites
rustls::cipher_suite::TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
rustls::cipher_suite::TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
rustls::cipher_suite::TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
rustls::cipher_suite::TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
rustls::cipher_suite::TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256,
])
.with_safe_default_kx_groups()
.with_safe_default_protocol_versions()
.expect("Error with the TLS configuration. Please file a bug report under https://github.com/restatedev/restate/issues.")
.with_native_roots()
.with_no_client_auth()
)
.https_or_http()
.enable_http1()
.enable_http2()
.wrap_connector(http)
});
let mut config = aws_config::defaults(BehaviorVersion::latest());
config = config.http_client(HyperClientBuilder::new().build(HTTPS_NATIVE_ROOTS.clone()));
Acknowledgements
- [X] I may be able to implement this feature request
- [ ] This feature might incur a breaking change
A note for the community
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue, please leave a comment
Hey @jackkleeman, thanks for submitting this issue. We'll add it to our backlog.
I looked at this and came up with a few findings:
- SO_NODELAY and TCP_NODELAY are two names for the same thing.
- We should add a new
nodelay
setting to theHttpConnectorSettings
andHttpConnectorSettingsBuilder
structs. - I'm thinking that this value would be passed to the default hyper connector in
HyperConnectorBuilder::build
.
Currently, adding support won't be simple because (afaict) we don't have easy access to the hyper::client::connect::HttpConnector
used in our TLS connector meaning we can't just call HttpConnector::set_nodelay.
One last thing to note: in hyper v1.0, HttpConnector
got moved to hyper-util
.
Yes, given that the Https connector is currently in a lazy static and then just cloned, the only easy thing to do here is to change the default, which I guess is a big decision
As a workaround, users can produce their own https connector by copying and pasting the defaults used by aws, and provide it at config time. Thats how I have fixed it on my end
if you have a snippet could you paste it on this issue? That will be very helpful for other folks looking for the same behavior
Done @rcoh