SecretManager | List secrets on `ap-northeast-1`: connect ETIMEDOUT/ENETUNREACH

Open loganmzz opened this issue 1 year ago • 8 comments

Checkboxes for prior research

Describe the bug

I have written a tool that tries to resolve existing Terraform resources.

Everything works fine across several services (Secrets Manager, Security Groups, EventBridge) and two regions (eu-central-1 and ap-northeast-1), except for listing secrets in ap-northeast-1.

import { SecretsManager } from '@aws-sdk/client-secrets-manager';

const sm = new SecretsManager({region: 'ap-northeast-1'});
const listSecrets = await sm.listSecrets();

It results in:

AggregateError
        at internalConnectMultiple (node:net:1114:18)
        at internalConnectMultiple (node:net:1177:5)
        at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
        at listOnTimeout (node:internal/timers:575:11)
        at process.processTimers (node:internal/timers:514:7) {
      code: 'ETIMEDOUT',
      name: 'TimeoutError',
      '$metadata': { attempts: 3, totalRetryDelay: 209 },
      [errors]: [
        Error: connect ETIMEDOUT 18.179.209.155:443
            at createConnectionError (node:net:1634:14)
            at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
            at listOnTimeout (node:internal/timers:575:11)
            at process.processTimers (node:internal/timers:514:7) {
          errno: -110,
          code: 'ETIMEDOUT',
          syscall: 'connect',
          address: '18.179.209.155',
          port: 443
        },
        Error: connect ENETUNREACH 2406:da14:afa:8a00:e3de:c82b:7b57:b9a4:443 - Local (:::0)
            at internalConnectMultiple (node:net:1176:40)
            at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
            at listOnTimeout (node:internal/timers:575:11)
            at process.processTimers (node:internal/timers:514:7) {
          errno: -101,
          code: 'ENETUNREACH',
          syscall: 'connect',
          address: '2406:da14:afa:8a00:e3de:c82b:7b57:b9a4',
          port: 443
        },
        Error: connect ETIMEDOUT 18.179.211.244:443
            at createConnectionError (node:net:1634:14)
            at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
            at listOnTimeout (node:internal/timers:575:11)
            at process.processTimers (node:internal/timers:514:7) {
          errno: -110,
          code: 'ETIMEDOUT',
          syscall: 'connect',
          address: '18.179.211.244',
          port: 443
        },
        Error: connect ENETUNREACH 2406:da14:afa:8a01:bfa8:ff5e:a40a:74af:443 - Local (:::0)
            at internalConnectMultiple (node:net:1176:40)
            at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
            at listOnTimeout (node:internal/timers:575:11)
            at process.processTimers (node:internal/timers:514:7) {
          errno: -101,
          code: 'ENETUNREACH',
          syscall: 'connect',
          address: '2406:da14:afa:8a01:bfa8:ff5e:a40a:74af',
          port: 443
        },
        Error: connect ETIMEDOUT 52.198.52.244:443
            at createConnectionError (node:net:1634:14)
            at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
            at listOnTimeout (node:internal/timers:575:11)
            at process.processTimers (node:internal/timers:514:7) {
          errno: -110,
          code: 'ETIMEDOUT',
          syscall: 'connect',
          address: '52.198.52.244',
          port: 443
        },
        Error: connect ENETUNREACH 2406:da14:afa:8a02:4b65:ecb5:dfe8:2f13:443 - Local (:::0)
            at internalConnectMultiple (node:net:1176:40)
            at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
            at listOnTimeout (node:internal/timers:575:11)
            at process.processTimers (node:internal/timers:514:7) {
          errno: -101,
          code: 'ENETUNREACH',
          syscall: 'connect',
          address: '2406:da14:afa:8a02:4b65:ecb5:dfe8:2f13',
          port: 443
        }
      ]
    }

With the AWS CLI, I have no issue:

AWS_REGION=ap-northeast-1 aws secretsmanager list-secrets --filters 'Key=name,Values=...'

Regression Issue

  • [ ] Select this option if this issue appears to be a regression.

SDK version number

@aws-sdk/[email protected]

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

v20.10.0

Reproduction Steps

import { SecretsManager } from '@aws-sdk/client-secrets-manager';

const sm = new SecretsManager({region: 'ap-northeast-1'});
const listSecrets = await sm.listSecrets();

Observed Behavior

Error:

AggregateError
    at internalConnectMultiple (node:net:1114:18)
    at internalConnectMultiple (node:net:1177:5)
    at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
    at listOnTimeout (node:internal/timers:575:11)
    at process.processTimers (node:internal/timers:514:7) {
  code: 'ETIMEDOUT',
  name: 'TimeoutError',
  '$metadata': { attempts: 3, totalRetryDelay: 157 },
  [errors]: [
    Error: connect ETIMEDOUT 18.179.209.155:443
        at createConnectionError (node:net:1634:14)
        at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
        at listOnTimeout (node:internal/timers:575:11)
        at process.processTimers (node:internal/timers:514:7) {
      errno: -110,
      code: 'ETIMEDOUT',
      syscall: 'connect',
      address: '18.179.209.155',
      port: 443
    },
    Error: connect ENETUNREACH 2406:da14:afa:8a00:e3de:c82b:7b57:b9a4:443 - Local (:::0)
        at internalConnectMultiple (node:net:1176:40)
        at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
        at listOnTimeout (node:internal/timers:575:11)
        at process.processTimers (node:internal/timers:514:7) {
      errno: -101,
      code: 'ENETUNREACH',
      syscall: 'connect',
      address: '2406:da14:afa:8a00:e3de:c82b:7b57:b9a4',
      port: 443
    },
    Error: connect ETIMEDOUT 18.179.211.244:443
        at createConnectionError (node:net:1634:14)
        at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
        at listOnTimeout (node:internal/timers:575:11)
        at process.processTimers (node:internal/timers:514:7) {
      errno: -110,
      code: 'ETIMEDOUT',
      syscall: 'connect',
      address: '18.179.211.244',
      port: 443
    },
    Error: connect ENETUNREACH 2406:da14:afa:8a02:4b65:ecb5:dfe8:2f13:443 - Local (:::0)
        at internalConnectMultiple (node:net:1176:40)
        at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
        at listOnTimeout (node:internal/timers:575:11)
        at process.processTimers (node:internal/timers:514:7) {
      errno: -101,
      code: 'ENETUNREACH',
      syscall: 'connect',
      address: '2406:da14:afa:8a02:4b65:ecb5:dfe8:2f13',
      port: 443
    },
    Error: connect ETIMEDOUT 52.198.52.244:443
        at createConnectionError (node:net:1634:14)
        at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
        at listOnTimeout (node:internal/timers:575:11)
        at process.processTimers (node:internal/timers:514:7) {
      errno: -110,
      code: 'ETIMEDOUT',
      syscall: 'connect',
      address: '52.198.52.244',
      port: 443
    },
    Error: connect ENETUNREACH 2406:da14:afa:8a01:bfa8:ff5e:a40a:74af:443 - Local (:::0)
        at internalConnectMultiple (node:net:1176:40)
        at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
        at listOnTimeout (node:internal/timers:575:11)
        at process.processTimers (node:internal/timers:514:7) {
      errno: -101,
      code: 'ENETUNREACH',
      syscall: 'connect',
      address: '2406:da14:afa:8a01:bfa8:ff5e:a40a:74af',
      port: 443
    }
  ]
}

Expected Behavior

Valid HTTP response

Possible Solution

N/A

Additional Information/Context

No response

loganmzz avatar Dec 02 '24 18:12 loganmzz
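
For readers skimming the dump above: the top-level error is an AggregateError whose `errors` array holds one entry per address Node.js tried. Below is a minimal sketch for condensing such an error into one line per address. The `summarizeConnectErrors` helper and the sample object are hypothetical; the `code`, `address` and `port` fields are the ones visible in the trace.

```javascript
// Hypothetical helper: condense an AggregateError from a failed connect
// into one line per attempted address.
function summarizeConnectErrors(err) {
  // Fall back to the error itself when there is no `errors` array.
  const attempts = Array.isArray(err.errors) ? err.errors : [err];
  return attempts.map((e) => {
    // IPv6 literals contain ':', so bracket them to keep the port readable.
    const host = String(e.address).includes(':') ? `[${e.address}]` : e.address;
    return `${e.code} ${host}:${e.port}`;
  });
}

// Sample shaped like the trace in this issue (two of the six entries).
const sample = new AggregateError([
  Object.assign(new Error('connect ETIMEDOUT 18.179.209.155:443'), {
    code: 'ETIMEDOUT',
    address: '18.179.209.155',
    port: 443,
  }),
  Object.assign(
    new Error('connect ENETUNREACH 2406:da14:afa:8a00:e3de:c82b:7b57:b9a4:443'),
    {
      code: 'ENETUNREACH',
      address: '2406:da14:afa:8a00:e3de:c82b:7b57:b9a4',
      port: 443,
    }
  ),
]);

console.log(summarizeConnectErrors(sample).join('\n'));
```

Read this way, the trace shows every IPv4 address failing with ETIMEDOUT and every IPv6 address failing with ENETUNREACH, which suggests the host has no IPv6 route while the IPv4 attempts ran out of time.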

Hey @loganmzz ,

Thanks for the feedback! However, I can't reproduce this issue.

This is the code I have -

import { SecretsManager } from '@aws-sdk/client-secrets-manager';

const client = new SecretsManager({region: 'ap-northeast-1'});

const response = await client.listSecrets();
console.log(response)

And the result is 200.

{
  '$metadata': {
    httpStatusCode: 200,
    requestId: 'f40c7c36-63d2-459b-a6e3-XXX',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  SecretList: [
    {
      ARN: 'arn:aws:secretsmanager:ap-northeast-1:XXX:secret:prod/test/secret-XXX',
      CreatedDate: 2024-12-04T23:07:09.893Z,
      LastChangedDate: 2024-12-04T23:07:10.145Z,
      Name: 'prod/test/secret',
      SecretVersionsToStages: [Object],
      Tags: []
    }
  ]
}

This error indicates a network timeout when trying to connect to AWS Secrets Manager. You can manually set up timeouts and retry attempts -


const client = new SecretsManager({
    region: 'ap-northeast-1',
    logger: console,
    maxAttempts: 3,
    retryMode: 'standard',
    requestTimeout: 3000,
    connectTimeout: 3000
});

Or if you think I missed anything, please add additional info.

Thanks!

zshzbh avatar Dec 04 '24 23:12 zshzbh

Looks like the timeouts are ignored. I increased them to 50000 and I am still facing the same issue after only 3 s.

I ran it several times (~10), and it passed once... Maybe I just need to "really" increase the timeout :( Is there any other way to increase them?

loganmzz avatar Dec 05 '24 13:12 loganmzz
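
A possible reason the timeouts looked ignored: to my knowledge, SDK v3 reads Node.js HTTP timeouts from the request handler (`connectionTimeout`, `requestTimeout`), not from top-level `requestTimeout`/`connectTimeout` client options. Below is a minimal sketch, assuming a recent v3 release that accepts a plain options object for `requestHandler` (older releases would need a `NodeHttpHandler` instance from `@smithy/node-http-handler`). `clientConfigWithTimeouts` is a hypothetical helper name.

```javascript
// Hypothetical helper: build a client config where the timeouts sit on the
// request handler, which is where (as far as I know) SDK v3 reads them.
function clientConfigWithTimeouts(region, connectMs, requestMs) {
  return {
    region,
    maxAttempts: 3,
    requestHandler: {
      connectionTimeout: connectMs, // time allowed to establish the socket
      requestTimeout: requestMs,    // time allowed while waiting on the request
    },
  };
}

const timeoutConfig = clientConfigWithTimeouts('ap-northeast-1', 10000, 10000);
console.log(timeoutConfig);

// Usage, assuming @aws-sdk/client-secrets-manager is installed:
//   const { SecretsManager } = require('@aws-sdk/client-secrets-manager');
//   const sm = new SecretsManager(timeoutConfig);
//   const listSecrets = await sm.listSecrets();
```

Even with these set, the failures in this thread surface in Node's `internalConnectMultiple`, so a connect limit inside Node.js itself may still cut each attempt short.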

Just to be sure the information is shared: I have no issue with the CLI (Python?).

loganmzz avatar Dec 05 '24 13:12 loganmzz

@zshzbh As stated previously, your setup doesn't seem to affect the timeout. Any other suggestion on how to set up custom timeouts?

loganmzz avatar Dec 18 '24 08:12 loganmzz

There are some steps I recommend -

  1. Checking secrets in ap-northeast-1: I don't actually have access to check AWS secrets or configurations. You'll need to verify this in your own AWS account using the AWS CLI or console.

  2. Terraform and network connectivity: You're right that Terraform is a third-party tool. When using it with AWS, it's a good practice to check network connectivity. You can do this by:

    • Ensuring your local machine can reach AWS endpoints
    • Verifying your AWS credentials are correctly configured
    • Running a simple AWS CLI command to test connectivity
  3. Adding middleware to the code: The middleware you provided looks good. It will log the context, input, and output of AWS SDK calls. Here's how you can add it to your AWS SDK client:

import { S3Client } from "@aws-sdk/client-s3";

const client = new S3Client({ region: "ap-northeast-1" });

client.middlewareStack.add(
  (next, context) => async (args) => {
    console.log("AWS SDK context", context.clientName, context.commandName);
    console.log("AWS SDK request input", args.input);
    const result = await next(args);
    console.log("AWS SDK request output:", result.output);
    return result;
  },
  {
    name: "MyMiddleware",
    step: "build",
    override: true,
  }
);

  4. Setting up higher max attempts: You can increase the maximum number of retry attempts by configuring the retry strategy. Here's an example:

import { S3Client } from "@aws-sdk/client-s3";
import { NodeHttpHandler } from "@smithy/node-http-handler";
import { ConfiguredRetryStrategy } from "@smithy/util-retry";

const client = new S3Client({
  region: "ap-northeast-1",
  requestHandler: new NodeHttpHandler({
    connectionTimeout: 5000,  // 5 seconds to establish the connection
    requestTimeout: 5000,  // 5 seconds (replaces the deprecated socketTimeout)
  }),
  // Up to 5 attempts (increase as needed), 3 seconds delay between retries
  retryStrategy: new ConfiguredRetryStrategy(5, () => 3000),
});

  5. Sharing reproducible code: If the above steps don't resolve the issue, please provide:
    • The complete Terraform configuration file
    • Any relevant AWS SDK code you're using
    • Full error messages and stack traces
    • The exact CLI commands you're running
    • Any environment variables or AWS credentials setup

zshzbh avatar Dec 18 '24 23:12 zshzbh

There are some steps I recommend -

  1. Checking secrets in ap-northeast-1: I don't actually have access to check AWS secrets or configurations. You'll need to verify this in your own AWS account using the AWS CLI or console.

No issue here. Everything is working fine with other clients (CLI, Terraform) and other services.

  2. Terraform and network connectivity: You're right that Terraform is a third-party tool. When using it with AWS, it's a good practice to check network connectivity. You can do this by:

    • Ensuring your local machine can reach AWS endpoints
    • Verifying your AWS credentials are correctly configured
    • Running a simple AWS CLI command to test connectivity

As stated it was already validated.

  3. Adding middleware to the code: The middleware you provided looks good. It will log the context, input, and output of AWS SDK calls. Here's how you can add it to your AWS SDK client:

import { S3Client } from "@aws-sdk/client-s3";

const client = new S3Client({ region: "ap-northeast-1" });

client.middlewareStack.add(
  (next, context) => async (args) => {
    console.log("AWS SDK context", context.clientName, context.commandName);
    console.log("AWS SDK request input", args.input);
    const result = await next(args);
    console.log("AWS SDK request output:", result.output);
    return result;
  },
  {
    name: "MyMiddleware",
    step: "build",
    override: true,
  }
);

Strange... As long as I ran the tests without the middleware, I had the issue. After adding it, I no longer had any timeout issue... After removing the middleware it still works, and after resetting the network connection it still works. I will try again after rebooting later on.

  4. Setting up higher max attempts: You can increase the maximum number of retry attempts by configuring the retry strategy. Here's an example:

import { S3Client } from "@aws-sdk/client-s3";
import { NodeHttpHandler } from "@smithy/node-http-handler";
import { ConfiguredRetryStrategy } from "@smithy/util-retry";

const client = new S3Client({
  region: "ap-northeast-1",
  requestHandler: new NodeHttpHandler({
    connectionTimeout: 5000,  // 5 seconds to establish the connection
    requestTimeout: 5000,  // 5 seconds (replaces the deprecated socketTimeout)
  }),
  // Up to 5 attempts (increase as needed), 3 seconds delay between retries
  retryStrategy: new ConfiguredRetryStrategy(5, () => 3000),
});

I will give it a try if I can reproduce the error.

  5. Sharing reproducible code: If the above steps don't resolve the issue, please provide:

    • The complete Terraform configuration file
    • Any relevant AWS SDK code you're using
    • Full error messages and stack traces
    • The exact CLI commands you're running
    • Any environment variables or AWS credentials setup

Here is my test script:

const client = require('@aws-sdk/client-secrets-manager');

async function main() {
  const sm = new client.SecretsManager({
    region: 'ap-northeast-1',
    /*
    logger: console,
    maxAttempts: 3,
    retryMode: 'standard',
    requestTimeout: 100000,
    connectTimeout: 100000,
    */
  });

  // sm.middlewareStack.add(
  //   (next, context) => async (args) => {
  //     console.log("AWS SDK context", context.clientName, context.commandName);
  //     console.log("AWS SDK request input", args.input);
  //     const result = await next(args);
  //     console.log("AWS SDK request output:", result.output);
  //     return result;
  //   },
  //   {
  //     name: "MyMiddleware",
  //     step: "build",
  //     override: true,
  //   }
  // );

  const listSecrets = await sm.listSecrets();
  console.log(JSON.stringify(listSecrets.SecretList, undefined, 2));
}

main();

loganmzz avatar Dec 23 '24 09:12 loganmzz

I'm having the exact same issue. I can't replicate it reliably; it is very random. Perhaps this is a DNS issue? Is there something in the SDK that could vary from the CLI tool?

garygreyling avatar Jun 03 '25 14:06 garygreyling

I managed to resolve the issue by disabling ipv6 system-wide. This issue was happening on a dev box, which exhibited issues when resolving IPV6 addresses generally.

garygreyling avatar Jun 03 '25 18:06 garygreyling
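
Disabling IPv6 system-wide is a broad change. A narrower sketch, assuming the SDK client accepts an `httpsAgent` in its `requestHandler` options as recent v3 releases do: pass an agent whose sockets are restricted to IPv4 through the standard `family` connect option. `ipv4OnlyConfig` is a hypothetical helper name, and whether this avoids the failure on a given machine is untested here.

```javascript
const https = require('node:https');

// Agent whose connections only look up and use IPv4 addresses.
const ipv4Agent = new https.Agent({ family: 4, keepAlive: true });

// Hypothetical helper: client config that routes requests through the agent.
function ipv4OnlyConfig(region) {
  return {
    region,
    requestHandler: { httpsAgent: ipv4Agent },
  };
}

const ipv4Config = ipv4OnlyConfig('ap-northeast-1');
console.log(ipv4Config.requestHandler.httpsAgent.options.family);

// Usage, assuming @aws-sdk/client-secrets-manager is installed:
//   const { SecretsManager } = require('@aws-sdk/client-secrets-manager');
//   const sm = new SecretsManager(ipv4Config);
```

This keeps the change scoped to one client instead of the whole machine, at the cost of never using IPv6 for that client.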

@loganmzz could you please try disabling IPv6 system-wide and see if that works? As I can't reproduce this issue, it is most likely not caused by the SDK.

zshzbh avatar Jul 10 '25 20:07 zshzbh

@zshzbh I was unable to reproduce :( But if you look at the logs from the original message, you will see failures for both IPv4 (ETIMEDOUT) and IPv6 (ENETUNREACH) addresses.

loganmzz avatar Jul 15 '25 11:07 loganmzz
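
One explanation that would fit both symptoms, offered as an assumption and not something confirmed in this thread: the `internalConnectMultiple` frames belong to Node.js's address-family autoselection, which is on by default in Node.js 20 and gives each address a short connect window (250 ms by default) before moving to the next. A distant region could plausibly exceed that window over IPv4 while IPv6 is unreachable. Below is a sketch of raising the window with the built-in `node:net` API, to be run before any client is created. The 1000 ms value is arbitrary.

```javascript
const net = require('node:net');

// Remember the current per-address window (250 ms unless changed).
const previousTimeoutMs = net.getDefaultAutoSelectFamilyAttemptTimeout();

// Give each address up to 1 s before Node.js tries the next one.
// The value is an arbitrary example, not a recommendation.
net.setDefaultAutoSelectFamilyAttemptTimeout(1000);

console.log(
  `attempt timeout: ${previousTimeoutMs} ms -> ` +
  `${net.getDefaultAutoSelectFamilyAttemptTimeout()} ms`
);

// Alternative: turn autoselection off for the whole process.
// net.setDefaultAutoSelectFamily(false);
```

Autoselection can also be switched off without code changes by starting the script with `node --no-network-family-autoselection`.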

@loganmzz, I am facing the same issue.

I have one observation: if I run the same code manually inside the pod, it seems to work fine. But when I deploy the pods, I get the same error.

Environment:

  • Secret Manager Version: 3.716.0
  • Node.js Version: 20
  • AWS Region: ap-south-1

Did you find any possible fix?

RohanDoshi21 avatar Jul 22 '25 09:07 RohanDoshi21

@RohanDoshi21 Sorry, I was unable to make it work when I had the issue. As suggested above, it sounds like disabling the IPv6 stack fixes the issue. As I'm not sure Docker enables IPv6 by default, it sounds like a good assumption.

loganmzz avatar Jul 22 '25 11:07 loganmzz

Hey all - checking in from SDK team. Reviewing previous comments, I noted that the problem couldn't be replicated and someone suggested that turning off IPv6 fixed the issue. It seems the root cause isn't related to the SDK, but please reach out if you have evidence suggesting otherwise. I'm adding closing-soon label to this.

aBurmeseDev avatar Aug 06 '25 23:08 aBurmeseDev

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

github-actions[bot] avatar Aug 25 '25 00:08 github-actions[bot]