Simultaneous DescribeEnvironments API calls lead to throttling during Elastic Beanstalk deployment

Open ian-a-anderson opened this issue 2 years ago • 2 comments

Describe the bug

When deploying to AWS Elastic Beanstalk environments, we see failures caused by throttling of the DescribeEnvironments API call. The AWS Deploy to Elastic Beanstalk Azure DevOps task is the only thing we use to run the deployment, and we have seen this issue even when no other deployments are running. The API call logs in CloudTrail show the task firing multiple simultaneous calls to this endpoint during the release (see screenshots for examples).

If this happens during the initial call that validates the environment at the start of the release task, the deployment fails because the environment can't be validated. We have also seen multiple overlapping calls during status polling while the task is executing. These later errors don't cause the task to fail the deployment, but they are still concerning, since status polling should not trigger overlapping calls.

To reproduce

  1. Set up an Azure DevOps Release using the Deploy to Elastic Beanstalk Task
  2. Run a deployment using this task (in our case, an "Existing deployment bundle in S3" deployment of a large PHP application)
  3. Check the CloudTrail logs for DescribeEnvironments API calls, looking for calls executed at the exact same moment (which as a result hit a ThrottlingException: Rate Exceeded error), e.g. https://us-east-1.console.aws.amazon.com/cloudtrail/home?region=us-east-1#/events?EventName=DescribeEnvironments
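
For anyone checking step 3 against exported CloudTrail events, here is a minimal sketch of how simultaneous calls could be spotted programmatically. It assumes the events are already loaded as a list of dicts using the standard CloudTrail record fields `eventName`, `eventTime`, and `errorCode`. The function name and the sample records are hypothetical and are not taken from the logs in this issue.

```python
from collections import defaultdict


def find_simultaneous_calls(events, event_name="DescribeEnvironments"):
    """Return {eventTime: [records]} for timestamps with more than one matching call.

    CloudTrail timestamps have one-second resolution, so "simultaneous"
    here means "recorded within the same second".
    """
    by_time = defaultdict(list)
    for record in events:
        if record.get("eventName") == event_name:
            by_time[record["eventTime"]].append(record)
    return {t: recs for t, recs in by_time.items() if len(recs) > 1}


# Hypothetical sample records for illustration only.
sample = [
    {"eventName": "DescribeEnvironments", "eventTime": "2021-11-12T14:00:00Z"},
    {"eventName": "DescribeEnvironments", "eventTime": "2021-11-12T14:00:00Z",
     "errorCode": "ThrottlingException"},
    {"eventName": "DescribeEnvironments", "eventTime": "2021-11-12T14:00:30Z"},
]

overlaps = find_simultaneous_calls(sample)
for timestamp, records in overlaps.items():
    # Count how many of the overlapping calls were rejected as throttled.
    throttled = sum(1 for r in records if r.get("errorCode") == "ThrottlingException")
    print(timestamp, len(records), "calls,", throttled, "throttled")
```

Grouping by the raw timestamp string keeps the check simple. Calls that straddle a second boundary would not be flagged by this sketch.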

Expected behavior

DescribeEnvironments API calls should not occur at the exact same moment within a single deployment. AWS Support has confirmed an account limit of 2 calls per second to this endpoint, so any time multiple calls happen at the same instant, a ThrottlingException occurs. We understand that this can happen when multiple independent Azure DevOps (or other) deployments run at the same time, because the throttling applies across the entire AWS account. However, we see it within a single deployment, with no other deployments run for several minutes before or after the throttled one.
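
To make the expected behavior concrete, here is a rough sketch of a client-side limiter that serializes calls and enforces a minimum gap between them. It is not how the task is implemented. `MinIntervalLimiter` and `describe_environments_stub` are hypothetical names, and the 0.5-second interval is only derived from the 2 calls per second limit quoted above. A limiter like this only helps if every call goes through the same instance in the same process.

```python
import threading
import time


class MinIntervalLimiter:
    """Run calls one at a time, at least `min_interval` seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock    # injectable so the timing can be tested
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = float("-inf")  # the first call never waits

    def call(self, fn, *args, **kwargs):
        # Holding the lock for the whole call guarantees no two calls overlap.
        with self._lock:
            wait = self._next_allowed - self._clock()
            if wait > 0:
                self._sleep(wait)
            try:
                return fn(*args, **kwargs)
            finally:
                self._next_allowed = self._clock() + self._min_interval


def describe_environments_stub():
    """Hypothetical stand-in for the real DescribeEnvironments request."""
    return {"Environments": []}


# 2 calls per second allows at most one call every 0.5 seconds.
limiter = MinIntervalLimiter(min_interval=0.5)
first = limiter.call(describe_environments_stub)
second = limiter.call(describe_environments_stub)  # waits about 0.5 s
```

The clock and sleep functions are passed in so the spacing logic can be exercised without real delays.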

Screenshots

  1. The DescribeEnvironments CloudTrail log for a failed deployment, showing all DescribeEnvironments API calls across a 20-minute span. AckerCLI is the IAM user that executes the Deploy to Elastic Beanstalk task, so its events are the ones generated by the task during our deployment. Note the simultaneous calls, which resulted in ThrottlingException: Rate Limit Exceeded errors and caused the deployment to fail. (Screenshot: 1-Overlapping-Calls-Failed-Deployment)

  2. Throttling exception error detail from the first AckerCLI event in the above log. (Screenshot: 2-Throttling-Failure-Detail)

  3. Azure DevOps Task log showing the task execution failure that results when this throttling happens during the initial validate-environment check for the deployment. (Screenshot: 3-DevOps-Task-Log-Failure)

  4. DescribeEnvironments log for a successful deployment, showing throttling errors generated by simultaneous calls to the DescribeEnvironments API endpoint during task status polling. Most calls are spaced 30 seconds apart, as per our Event Polling delay configuration, but sporadically multiple simultaneous calls occur and are throttled. (Screenshot: 4-Overlapping-Calls-During-Successful-Deployment-Status-Polling)

  5. Azure DevOps Task log for the deployment shown in screenshot 4, showing the task reporting throttling and how the status API polling attempts to adjust to work around this issue. (Screenshot: 5-DevOps-Task-Log-Success)
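
The failure in screenshot 3 comes from a single throttled call during the initial validation. One common mitigation for that kind of failure is to retry with exponential backoff and jitter. The sketch below illustrates the general technique only. It is not the task's actual retry logic, and `ThrottledError`, `call_with_backoff`, and the delay values are all hypothetical.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a ThrottlingException from the service."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying on ThrottledError with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttling error
            # The cap doubles per attempt (1 s, 2 s, 4 s, ...) up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random fraction of the cap so that
            # concurrent callers do not all retry at the same instant.
            sleep(rng() * cap)


def make_flaky(failures):
    """Build a hypothetical call that is throttled `failures` times, then succeeds."""
    remaining = [failures]

    def flaky():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise ThrottledError("Rate exceeded")
        return "environment validated"

    return flaky


# The no-op sleep keeps this demonstration instant.
result = call_with_backoff(make_flaky(2), sleep=lambda seconds: None)
```

The jitter matters here because the calls that collide are already simultaneous. Retrying them on identical fixed delays would make them collide again.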

Your Environment

  • Azure DevOps (Cloud-based) using Hosted Azure Agents (windows-2022, also seen on vs2017-win2016) to run the Release Tasks
  • Azure DevOps version: Version Dev18.M194.1 (AzureDevOps_M194_20211105.1)
  • AWS Toolkit for Azure DevOps version: 1.11.0 (latest version)
  • Task: Deploy to Elastic Beanstalk, S3 bucket deployment (PHP App), 30-second polling delay

ian-a-anderson, Nov 12 '21 14:11

AWS Support has increased our account limit to 5 calls per second and we are still hitting this issue. Whatever is causing the multiple calls seems able to saturate this limit as well. Again, this happens when only one deployment is running.

ian-a-anderson, Dec 03 '21 14:12

Not sure if it is related to #338, but linking here just for reference.

ian-a-anderson, Dec 03 '21 15:12