
Client retry strategy documentation

Open farski opened this issue 3 years ago • 1 comments

Describe the issue with documentation

I have what I would think is a fairly unremarkable bit of code that sets some custom retry values for a v2 client:

const cloudwatch = new AWS.CloudWatch({
  apiVersion: '2010-08-01',
  maxRetries: 5,
  retryDelayOptions: { base: 1100 }
});

I have spent more time than I care to admit trying to create the equivalent of that in the v3 SDK. I see that the client configs implement a RetryInputConfig, which expects two values: maxAttempts and retryStrategy. maxAttempts seems pretty straightforward, though I am not 100% sure whether it behaves differently from maxRetries. My current assumption is that maxAttempts = 1 + maxRetries, but who knows.
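A minimal sketch of that assumption, for anyone porting v2 config values. The helper name maxAttemptsFromMaxRetries is made up for illustration and is not part of either SDK; it only encodes the idea that v3 counts the initial request as an attempt while v2 counts only the retries.

```javascript
// Hypothetical helper, not an SDK function.
// Assumes v3's maxAttempts includes the first request, so it is one more
// than v2's maxRetries.
function maxAttemptsFromMaxRetries(maxRetries) {
  if (!Number.isInteger(maxRetries) || maxRetries < 0) {
    throw new RangeError('maxRetries must be a non-negative integer');
  }
  return maxRetries + 1;
}

// The v2 snippet above uses maxRetries: 5.
console.log(maxAttemptsFromMaxRetries(5));
```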

retryStrategy takes a RetryStrategy, which takes two options: mode and retry. I haven't been able to find any documentation about what the possible options for mode are, so I haven't even tried to figure out what retry should be.

I did find this one Stack Overflow post that mentions StandardRetryStrategy, so then I started down that path.

The StandardRetryStrategy constructor takes a maxAttemptsProvider (how is that different than maxAttempts? Do I need both?), and some deciders. Based on the Stack Overflow post, I'm probably looking to use a delayDecider to replicate the original code. Is there any documentation that explains how a DelayDecider works? Not that I've found. Where does delayBase come from when that decider is being evaluated?

So I finally ended up with something like this:

const MAXIMUM_ATTEMPTS = 6;
const MAXIMUM_RETRY_DELAY = 10000;
const customRetryStrategy = new StandardRetryStrategy(
  async () => MAXIMUM_ATTEMPTS,
  {
    delayDecider: (_, attempts) =>
      Math.floor(
        Math.min(MAXIMUM_RETRY_DELAY, Math.random() * 2 ** attempts * 1100),
      ),
  },
);

const cloudwatchClient = new CloudWatchClient({
  apiVersion: '2010-08-01',
  maxAttempts: MAXIMUM_ATTEMPTS,
  retryStrategy: customRetryStrategy,
});

Does that preserve the original behavior? Is it even close? Who knows.
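One way to get a rough answer is to compare the delay ceilings side by side. The sketch below is standalone and does not touch either SDK. v2DelayMs assumes the v2 default backoff is random * 2^retryCount * base, which is my reading of v2 and not something confirmed in this thread. v3DelayMs is the delayDecider from the snippet above, with the random source made injectable. Whether v2's retryCount and v3's attempts start from the same number is also an assumption here.

```javascript
// Sketch only. BASE_MS and MAXIMUM_RETRY_DELAY come from the snippets above.
const BASE_MS = 1100;
const MAXIMUM_RETRY_DELAY = 10000;

// Assumed v2 default: jittered exponential backoff with no cap.
function v2DelayMs(retryCount, random = Math.random) {
  return random() * 2 ** retryCount * BASE_MS;
}

// The custom delayDecider from above: same shape, floored and capped.
function v3DelayMs(attempts, random = Math.random) {
  return Math.floor(
    Math.min(MAXIMUM_RETRY_DELAY, random() * 2 ** attempts * BASE_MS),
  );
}

// Worst-case delay per retry index, forcing the random factor to 1.
const ceilings = [0, 1, 2, 3, 4].map((n) => ({
  n,
  v2: v2DelayMs(n, () => 1),
  v3: v3DelayMs(n, () => 1),
}));
console.log(JSON.stringify(ceilings));
```

Under those assumptions the two only diverge once the 10000 ms cap kicks in. If v3 starts counting attempts one higher than v2 counts retries, every v3 delay would be doubled relative to v2.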

Setting basic configuration on clients shouldn't be this much work, or require reading 10 different source code files. This is a fairly common experience when upgrading code from v2 to v3. If what you're trying to do doesn't exist in the very minimal documentation, clear your calendar.

farski avatar May 16 '22 14:05 farski

Hello @farski ! After some iteration and code inspection I managed to add my own custom Retry strategy. It's very close to what you are proposing. I needed to override the strategy implementation mode value (this seems like a bug). Here is what worked:

import { StandardRetryStrategy } from '@aws-sdk/middleware-retry';

export const MAXIMUM_ATTEMPTS = 10;
const DELAY_RATIO = 1000;

export const standardRetryStrategy = new StandardRetryStrategy(
  () => Promise.resolve(MAXIMUM_ATTEMPTS),
  {
    delayDecider: (_delayBase, attempts) => {
      return DELAY_RATIO * attempts;
    },
  },
);

standardRetryStrategy.mode = 'STANDARD';

And on the client I wanted to override:

import {
  MAXIMUM_ATTEMPTS,
  standardRetryStrategy,
} from './standardRetryStrategy';
import {
  LambdaClient,
} from '@aws-sdk/client-lambda';

...
const client = new LambdaClient({
  maxAttempts: MAXIMUM_ATTEMPTS,
  retryStrategy: standardRetryStrategy,
});

Hope this helps :+1:

adelego avatar Aug 12 '22 14:08 adelego

Many months later and I also have similar concerns. On the old API we used to have maxRetries: 3; this no longer exists on the new API.

I can see we now have DEFAULT_MAX_ATTEMPTS = 3, which can be configured with maxAttempts when instantiating the client, and also DEFAULT_MAX_RETRIES = 0. However, it isn't clear where the latter can be configured or how it's used.

DEFAULT_MAX_ATTEMPTS reads:

The default value for how many HTTP requests an SDK should make for a single SDK operation invocation before giving up

Does this mean it will only retry successful invocations of the lambda? What happens if the target container failed to instantiate? Do those failures also get retried?

vieirai avatar Dec 19 '22 15:12 vieirai

Partially relevant to the topic: the existing "latest" documentation at https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Lambda.html#constructor-property refers to v2, so it has no retryStrategy at all.

The v3 docs are at https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/clients/client-lambda/classes/lambda.html#constructor.

PIoneer2 avatar Mar 29 '23 20:03 PIoneer2

Docs on how to customize attempts and delay (backoff) computation are here: https://github.com/awslabs/smithy-typescript/blob/main/packages/util-retry/README.md

One note: I have a PR open to fix a typo on that page, it should say @aws-sdk/client-s3 in the examples, not @smithy/client-s3.

kuhe avatar Jun 29 '23 19:06 kuhe

I think ConfiguredRetryStrategy can be used to replicate the old behavior:

// v2 api
import AWS from 'aws-sdk'
const cloudwatch = new AWS.CloudWatch({
  apiVersion: '2010-08-01',
  maxRetries: 5,
  retryDelayOptions: { base: 1100 }
});

// v3 api
import { CloudWatch } from '@aws-sdk/client-cloudwatch'
import { ConfiguredRetryStrategy } from '@aws-sdk/util-retry'
const cloudwatch = new CloudWatch({
  apiVersion: '2010-08-01',
  retryStrategy: new ConfiguredRetryStrategy(6, 1100)
});
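A hedged note on the second argument: as far as I can tell from the util-retry README linked above, ConfiguredRetryStrategy also accepts a function of the attempt number, and a plain number may be treated as a fixed delay instead of a backoff base. If exponential backoff with jitter is what you want, a sketch like the following could build that function. makeJitteredBackoff is a hypothetical helper, not an SDK export, and the 20000 ms cap is an arbitrary choice for illustration.

```javascript
// Hypothetical helper, not part of the SDK.
// Returns a function (attempt) => delay in ms, using exponential growth
// with full jitter, capped at maxDelayMs.
function makeJitteredBackoff(baseMs, maxDelayMs, random = Math.random) {
  return (attempt) =>
    Math.floor(Math.min(maxDelayMs, random() * 2 ** attempt * baseMs));
}

// Base of 1100 ms to match the v2 retryDelayOptions above.
const backoff = makeJitteredBackoff(1100, 20000);
console.log(backoff(1), backoff(2), backoff(3));
```

Assuming the constructor does accept a function, this would be passed in place of the number: new ConfiguredRetryStrategy(6, backoff).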

cdignam-segment avatar Aug 02 '23 19:08 cdignam-segment

@kuhe I think there's still a sizable gap between the v2 docs and v3 docs here.

For example, in the v2 docs, if you visit basically any service page, the Constructor Details will have all the information you would need to figure out how to configure the retry logic for a client, and it's in the same place as nearly all the other configuration details, so it's a very obvious and likely place for someone, particularly a new SDK user, to look. It's very low friction.

Compare that to the v3 docs. The landing point for most people is still going to be a service page (e.g., from Googling "how do I retry S3 uploads with javascript"), but those pages don't make any obvious mention of this sort of configuration. There's no general information about the service's client or usage; it's just a long (long) list of operations, unless you happen to scroll all the way to the bottom, where you finally come to a bit of S3Client Configuration information. This information, though, is not practically useful, and the issue I submitted originally is still true:

image

This tells you that there is some way of controlling retries, but it is entirely unhelpful about how to actually accomplish that.

I don't think it's ideal to assume that someone who's trying to understand this for the first time is going to navigate their way to the Smithy Packages section of the SDK docs to find the util-retry docs. Lots of SDK users probably don't even know what Smithy is. The two most obvious places people are going to look for information about making their clients retry are the client's page and the v2 migration page. The migration page does not point to any of the new docs, so it is still as unhelpful as it was over a year ago. And the client pages, as mentioned, are still a dead end of overly complicated, unhelpful technical details.

I think until a novice user could realistically find this information, this issue should remain open. The v3 SDK has a more complex solution to this problem than v2 had, but currently has less documentation. It's fine for the solution to be more complex, as it allows for more control and power, but the docs need to match or they become a huge barrier to new users.

farski avatar Aug 07 '23 12:08 farski

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

github-actions[bot] avatar Aug 22 '23 00:08 github-actions[bot]