embulk-output-s3

Support a retry mechanism with an algorithm that uses exponential backoff.

giwa opened this issue 2 years ago • 0 comments

If there are many upload requests to S3, a 503 Slow Down error occurs. This happens because the number of upload requests exceeds what S3's auto-scaling can absorb. We need to retry these requests.

Amazon S3 can return one of the following 5xx status errors:

AmazonS3Exception: Internal Error (Service: Amazon S3; Status Code: 500; Error Code: 500 Internal Error; Request ID: A4DBBEXAMPLE2C4D)
AmazonS3Exception: Slow Down (Service: Amazon S3; Status Code: 503; Error Code: 503 Slow Down; Request ID: A4DBBEXAMPLE2C4D)

The error code 500 Internal Error indicates that Amazon S3 can't handle the request at that time. The error code 503 Slow Down typically indicates that the number of requests to your S3 bucket is very high. For example, you can send 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in an S3 bucket. However, in some cases, Amazon S3 can return a 503 Slow Down response if your requests exceed the amount of bandwidth available for cross-Region copying.

Resolution: use a retry mechanism in the application making requests. Because of the distributed nature of Amazon S3, requests that return 500 or 503 errors can be retried. It's a best practice to build retry logic into applications that make requests to Amazon S3.

All AWS SDKs have a built-in retry mechanism with an algorithm that uses exponential backoff. This algorithm implements increasingly longer wait times between retries for consecutive error responses. Most exponential backoff algorithms use jitter (randomized delay) to prevent successive collisions. For more information, see Error retries and exponential backoff in AWS.
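For reference, here is a minimal sketch in Java of what such a retry loop could look like, using "full jitter" (each delay is a random value up to the exponentially growing cap). The class and method names are illustrative only and are not part of this plugin or the AWS SDK; in the plugin, the catch clause would match only AmazonS3Exception with status 500/503 rather than any RuntimeException.

```java
import java.util.Random;
import java.util.concurrent.Callable;

// Illustrative sketch of retry with full-jitter exponential backoff.
public class S3RetrySketch {
    static final Random RANDOM = new Random();

    // Delay before the given retry attempt (0-based): a random duration in
    // [0, base * 2^attempt], with the cap itself limited to maxDelayMillis.
    static long backoffDelayMillis(int attempt, long baseMillis, long maxDelayMillis) {
        long cap = Math.min(maxDelayMillis, baseMillis * (1L << attempt));
        return (long) (RANDOM.nextDouble() * cap);
    }

    // Runs the task, retrying up to maxRetries times on failure and
    // sleeping with a jittered, exponentially growing delay in between.
    static <T> T callWithRetry(Callable<T> task, int maxRetries) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return task.call();
            } catch (RuntimeException e) {
                // In the plugin this should only catch retryable S3 errors
                // (status codes 500 and 503), not every exception.
                if (attempt >= maxRetries) {
                    throw e;
                }
                Thread.sleep(backoffDelayMillis(attempt, 100, 30_000));
            }
        }
    }
}
```

The jitter is what prevents many clients that failed at the same moment from all retrying at the same moment again; without it, synchronized retries can re-trigger the same 503 Slow Down response.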

Ref: https://repost.aws/knowledge-center/http-5xx-errors-s3

giwa avatar Jan 06 '23 00:01 giwa