ecs-deploy

Deployment takes a ton of time to complete now

Open damey2011 opened this issue 3 years ago • 2 comments

Deployments used to take 3-4 minutes to complete, including the container count checks, but they now take around 15 minutes because some containers do not terminate quickly. The deployment command used here is:

ecs-deploy -v -r $AWS_DEFAULT_REGION -c $ECS_CLUSTER_NAME -n $ECS_SERVICE_NAME -i ${ECR_REPO_URL}:${CI_COMMIT_SHORT_SHA} --max-definitions 10 -t 900 --use-latest-task-def --run-task --wait-for-success -p ${CI_ENVIRONMENT_NAME}

Running the deployment in verbose mode, I figured out that ecs-deploy uses the container count to determine whether the deployment completed successfully before it returns a success message; otherwise it rolls back. Our timeout used to be 300 seconds, but we now have to use 900, since 5 minutes is no longer sufficient to complete a deployment. I understand that this may come from AWS, as the way old containers are shut down may have changed and the new way takes longer, but is it possible to review the strategy the library uses to check for deployment success?

A typical output at intervals of checking the deployment state is:

++ jq '[.services[].deployments[]] | length'
+ NUM_DEPLOYMENTS=2
+ '[' 2 -eq 1 ']'
+ sleep 2
+ i=126
+ '[' 126 -lt 900 ']'

The old container no longer gets killed as fast as it used to.
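For anyone reading the trace, the loop it implies can be sketched roughly as below. This is a reconstruction from the verbose output only, not the actual ecs-deploy source: the function name `wait_for_single_deployment`, its arguments, and the caller-supplied counter function are all hypothetical, and the assumption that the counter `i` advances by the sleep interval is a guess based on the `i=126` / `-lt 900` lines.

```shell
# Sketch (assumption, not ecs-deploy's real code): poll until exactly one
# deployment remains on the service, or give up once the timeout is reached.
#
# Arguments (all hypothetical names):
#   $1  name of a function/command that prints the current deployment count
#   $2  timeout in seconds (e.g. 900, as passed with -t above)
#   $3  polling interval in seconds (the trace shows "sleep 2")
wait_for_single_deployment() {
  wfsd_count_fn=$1
  wfsd_timeout=$2
  wfsd_interval=$3
  wfsd_i=0

  while [ "$wfsd_i" -lt "$wfsd_timeout" ]; do
    # In a real pipeline the counter would wrap something like:
    #   aws ecs describe-services --cluster "$ECS_CLUSTER_NAME" \
    #     --services "$ECS_SERVICE_NAME" \
    #     | jq '[.services[].deployments[]] | length'
    wfsd_num=$("$wfsd_count_fn")

    # One deployment left means the old one has been fully drained.
    if [ "$wfsd_num" -eq 1 ]; then
      return 0
    fi

    sleep "$wfsd_interval"
    wfsd_i=$((wfsd_i + wfsd_interval))
  done

  # Timed out while more than one deployment was still listed.
  return 1
}
```

Under this reading, the wait is bounded by how long the old deployment stays listed, so a slow shutdown of old tasks directly lengthens the whole run even if the new tasks are already healthy.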

damey2011 avatar May 25 '21 13:05 damey2011

Hi @damey2011,

Unfortunately, I'm facing the same problem in my pipelines. Has anyone made any progress so far, perhaps through an alternative way of getting a reliable callback from the deploy?

Thanks in advance

algarves avatar May 19 '23 19:05 algarves

@algarves Unfortunately, we don't even use this anymore so I can't say. We use EKS now.

damey2011 avatar Jul 18 '23 12:07 damey2011