aws-sdk-rust

Clients cache open sockets for up to 20 seconds after last request

Open stevepryde opened this issue 3 years ago • 6 comments

Describe the bug

Each instance of aws_sdk_*::Client seems to cache potentially hundreds of open sockets that stay around for up to 20 seconds after use. This becomes a major problem on Lambda where your app has a file handle limit of 1024, especially when using a number of different AWS clients to do lots of parallel operations in the same app (each with their own cache).
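To put rough numbers on how separate per-client caches add up against the 1024 limit, here is a back-of-the-envelope sketch. The workload figures (5 clients, 250 parallel requests each) and the function name are made up for illustration, and the estimate ignores descriptors used for anything other than idle sockets.

```rust
/// File handle limit reported for Lambda in this issue.
const FD_LIMIT: usize = 1024;

/// Worst case under the behaviour described above: every client keeps one
/// idle socket open for each request it recently ran in parallel.
fn worst_case_idle_sockets(clients: usize, parallel_requests_per_client: usize) -> usize {
    clients * parallel_requests_per_client
}

fn main() {
    // Hypothetical workload: 5 different service clients, 250 parallel requests each.
    let idle = worst_case_idle_sockets(5, 250);
    println!("{idle} idle sockets vs limit {FD_LIMIT}: over = {}", idle > FD_LIMIT);
}
```

With these assumed numbers the idle sockets alone (1250) exceed the limit, even though no single client does.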

This can be mitigated slightly by dropping clients and recreating them as needed, but that's not always feasible and is probably also less efficient.

Expected Behavior

Either sockets should always be closed after every request, or the caller should be able to customize this behaviour. Another option would be a way to share the socket pool between multiple clients.

Current Behavior

After performing 50x S3 HeadObject requests concurrently, the S3 Client struct keeps all 50 file handles (sockets) open for up to 20 seconds, even though the requests complete almost instantly. I assume all other Client structs behave similarly.

Reproduction Steps

I can reproduce this by spawning concurrent S3 HeadObject tasks and then polling the number of file handles used. I assume any Client and any operation will behave similarly.

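// Fragment: assumes `config`, `bucket`, and `key` are already in scope inside an
// async context, and that `info!` comes from a logging crate such as `tracing` or `log`.
use std::time::Duration;

let s3 = aws_sdk_s3::Client::new(&config);

// Spawn 50 concurrent HeadObject requests.
let mut futs = Vec::new();
for _ in 0..50 {
    futs.push(tokio::spawn(
        s3.head_object().bucket(bucket).key(key).send(),
    ));
}

// Wait for every request to complete.
for fut in futs {
    fut.await.unwrap().unwrap();
}

// Poll the number of open file descriptors once per second until it drops.
for countdown in 0..30 {
    tokio::time::sleep(Duration::from_secs(1)).await;
    let fd_count = procfs::process::Process::myself()
        .unwrap()
        .fd_count()
        .unwrap();
    info!("fd_count = {fd_count} after {countdown} seconds");
    if fd_count < 20 {
        break;
    }
}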
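The polling loop above depends on the `procfs` crate. On Linux, a similar count can be sketched with the standard library alone by listing `/proc/self/fd`. The helper name `open_fd_count` is hypothetical, the count includes the descriptor that `read_dir` itself holds while listing, and this will not work on macOS, which has no `/proc`.

```rust
use std::fs;
use std::io;

/// Counts the entries in `/proc/self/fd` (Linux only).
/// The directory handle used for the listing is itself included in the count.
fn open_fd_count() -> io::Result<usize> {
    Ok(fs::read_dir("/proc/self/fd")?.count())
}

fn main() -> io::Result<()> {
    let before = open_fd_count()?;

    // Hold a few extra descriptors open to show the count moving.
    let files: Vec<fs::File> = (0..3)
        .map(|_| fs::File::open("/proc/self/status"))
        .collect::<io::Result<_>>()?;

    let after = open_fd_count()?;
    println!("before = {before}, after = {after}, held = {}", files.len());
    Ok(())
}
```

Because both measurements include the listing's own descriptor, the difference between them reflects only the handles opened in between.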

Possible Solution

I haven't taken a deep enough dive into the client yet but something seems to be caching open sockets for up to 20 seconds. These do get reused if using the same client, but it would be good to have a way to avoid caching them at all.

Additional Information/Context

No response

Version

├── aws-config v0.49.0
│   │   ├── aws-http v0.49.0
│   │   │   ├── aws-smithy-http v0.49.0
│   │   │   │   ├── aws-smithy-eventstream v0.49.0
│   │   │   │   │   ├── aws-smithy-types v0.49.0
│   │   │   │   ├── aws-smithy-types v0.49.0 (*)
│   │   │   ├── aws-smithy-types v0.49.0 (*)
│   │   │   ├── aws-types v0.49.0
│   │   │   │   ├── aws-smithy-async v0.49.0
│   │   │   │   ├── aws-smithy-client v0.49.0
│   │   │   │   │   ├── aws-smithy-async v0.49.0 (*)
│   │   │   │   │   ├── aws-smithy-http v0.49.0 (*)
│   │   │   │   │   ├── aws-smithy-http-tower v0.49.0
│   │   │   │   │   │   ├── aws-smithy-http v0.49.0 (*)
│   │   │   │   │   ├── aws-smithy-types v0.49.0 (*)
│   │   │   │   ├── aws-smithy-http v0.49.0 (*)
│   │   │   │   ├── aws-smithy-types v0.49.0 (*)
│   ├── aws-sdk-s3 v0.19.0
│   │   ├── aws-endpoint v0.49.0 (*)
│   │   ├── aws-http v0.49.0 (*)
│   │   ├── aws-sig-auth v0.49.0 (*)
│   │   ├── aws-sigv4 v0.49.0 (*)
│   │   ├── aws-smithy-async v0.49.0 (*)
│   │   ├── aws-smithy-checksums v0.49.0
│   │   │   ├── aws-smithy-http v0.49.0 (*)
│   │   │   ├── aws-smithy-types v0.49.0 (*)
│   │   ├── aws-smithy-client v0.49.0 (*)
│   │   ├── aws-smithy-eventstream v0.49.0 (*)
│   │   ├── aws-smithy-http v0.49.0 (*)
│   │   ├── aws-smithy-http-tower v0.49.0 (*)
│   │   ├── aws-smithy-types v0.49.0 (*)
│   │   ├── aws-smithy-xml v0.49.0 (*)
│   │   ├── aws-types v0.49.0 (*)
│   ├── aws-types v0.49.0 (*)

Environment details (OS name and version, etc.)

AWS Lambda (Amazon Linux 2 x86_64) and also Mac OS 12.6 M1/32G

Logs

No response

stevepryde, Oct 10 '22 05:10

The SDK is using the hyper defaults for connection pooling right now, and it looks like by default, hyper doesn't have an upper bound on idle connections.
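To make the effect of a missing upper bound concrete, here is a toy model of an idle-connection pool. It is not hyper's implementation: `IdlePool`, `max_idle`, and `idle_timeout` are hypothetical names, each entry merely stands in for one open socket, and the 20-second timeout is taken from the behaviour observed in the report rather than from hyper's documented defaults.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Toy model of an idle-connection pool (NOT hyper's real implementation).
/// Each entry stands in for one open socket and records when it went idle.
struct IdlePool {
    max_idle: Option<usize>, // `None` models "no upper bound"
    idle_timeout: Duration,
    idle: VecDeque<Instant>,
}

impl IdlePool {
    fn new(max_idle: Option<usize>, idle_timeout: Duration) -> Self {
        Self { max_idle, idle_timeout, idle: VecDeque::new() }
    }

    /// A request finished at `now`: keep its socket unless the cap is reached.
    /// Returns true if the socket stays open, false if it is closed.
    fn release(&mut self, now: Instant) -> bool {
        match self.max_idle {
            Some(cap) if self.idle.len() >= cap => false,
            _ => {
                self.idle.push_back(now);
                true
            }
        }
    }

    /// Close every socket that has been idle for at least `idle_timeout`.
    fn evict_expired(&mut self, now: Instant) {
        let timeout = self.idle_timeout;
        self.idle.retain(|&since| now.duration_since(since) < timeout);
    }

    fn open_sockets(&self) -> usize {
        self.idle.len()
    }
}

fn main() {
    let start = Instant::now();
    let timeout = Duration::from_secs(20);

    let mut unbounded = IdlePool::new(None, timeout);
    let mut capped = IdlePool::new(Some(10), timeout);

    // A burst of 50 requests that all complete at the same instant.
    for _ in 0..50 {
        unbounded.release(start);
        capped.release(start);
    }
    println!(
        "right after burst: unbounded = {}, capped = {}",
        unbounded.open_sockets(),
        capped.open_sockets()
    );

    // Once the idle timeout has elapsed, both pools drain.
    let later = start + timeout;
    unbounded.evict_expired(later);
    capped.evict_expired(later);
    println!(
        "after timeout: unbounded = {}, capped = {}",
        unbounded.open_sockets(),
        capped.open_sockets()
    );
}
```

In this model the unbounded pool holds all 50 sockets until the timeout passes, while the capped pool never holds more than 10, which is the difference a per-host idle limit is meant to make.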

I'm not sure what the correct fix is, but it definitely seems like something the SDK needs to solve.

It is possible to work around this by replacing the default connector so that you can change hyper's defaults, but this is definitely not easy, and we don't have a good example for it currently.

jdisanti, Oct 10 '22 21:10

Adding this to our Production Readiness tracking issue.

jdisanti, Oct 10 '22 21:10

It looks like the AWS SDK for Java V1 defaults to 50 idle connections with a max idle time of 60 seconds, and V2 is consistent.

jdisanti, Oct 10 '22 22:10

I think you can work around this issue with the following connector customization for now:

aws-config = "0.49.0"
aws-sdk-dynamodb = "0.19.0"
aws-sdk-s3 = "0.19.0"
aws-smithy-client = "0.49.0"
hyper = { version = "0.14.20", features = ["full"] }
tokio = { version = "1.21.2", features = ["full"] }

use aws_sdk_dynamodb as dynamodb;
use aws_sdk_s3 as s3;
use aws_smithy_client::erase::DynConnector;
use aws_smithy_client::hyper_ext::Adapter as HyperAdapter;

fn create_smithy_conn() -> DynConnector {
    // The `DynConnector` results in an allocation and dynamic dispatch for all client calls,
    // so you may desire to type out the full type name for the return value instead.
    DynConnector::new(
        // The `HyperAdapter` converts a hyper connector into a Smithy connector that can be used with the SDK
        HyperAdapter::builder()
            .hyper_builder({
                // Tell the `HyperAdapter` to set max idle connections on the underlying hyper client
                let mut hyper_builder = hyper::Client::builder();
                hyper_builder.pool_max_idle_per_host(50);
                hyper_builder
            })
            // Use the default/built-in HTTPS connector
            .build(aws_smithy_client::conns::https()),
    )
}

#[tokio::main]
async fn main() {
    let sdk_config = aws_config::load_from_env().await;

    // Construct clients with the customized connectors
    let s3_client = s3::Client::from_conf_conn((&sdk_config).into(), create_smithy_conn());
    let dynamodb_client =
        dynamodb::Client::from_conf_conn((&sdk_config).into(), create_smithy_conn());

    println!("{:?}", s3_client.list_buckets().send().await);
    println!("{:?}", dynamodb_client.list_tables().send().await);
}

If you try that, let me know if it improves things.

jdisanti, Oct 11 '22 18:10

That works nicely. Thank you 😄

stevepryde, Oct 11 '22 23:10

A fix for this will go out in the next release.

jdisanti, Dec 08 '22 01:12

This was included in release-2022-12-14.

jdisanti, Dec 21 '22 22:12
