Add command to stop tasks
Ideally, when troubleshooting/working with services, having the ability to interact with tasks deployed under a service would be super helpful (operational type request).
Some examples of what I mean:
- There may be a case where I want to stop a task (or all tasks) under a service. It would be nice to have a flag to stop, or stop all
copilot svc stop-task --task-id 1234 [or --all]
- When reviewing logs, I may find that one task is not operating as expected. From a troubleshooting perspective, I don't want to see the entire log group for all tasks, just the one that is having problems. I can grep the output of the command, but I'd rather stay in copilot than work around it.
copilot svc logs --task-id 1234
Hopefully this makes sense! :-)
Hello @adamjkeller, as for svc logs for a specific task, we've implemented that in https://github.com/aws/copilot-cli/pull/1334, and it should be included in the release we plan to do next week. As for stop-task: for an ECS service, even if you stop a certain number of tasks, as long as the desired count doesn't change, ECS will spin up new tasks until it gets back to the desired number. So could you elaborate on the use case where you want to stop a specific task of a service?
👍 To add to @iamhopaul123's comment: although there is currently no command to stop all tasks, one way would be to set count: 0 in the manifest and do a copilot deploy :)
Absolutely, I apologize that my request didn't have enough details.
Of course you can scale to zero, deploy, scale back up, and deploy again, but that is not a great UX. When I'm running a service with 100 tasks, I don't want to stop all 100 if there is a problem with one (or many). I want to stop the one bad task and allow the scheduler to bring up a replacement. The idea here is that the scheduler is there to keep my task count at the desired state within the service, but this is for the scenario where I want to kill one or many tasks under a service that aren't working as expected. Ideally, my health checks are perfect and a task is killed automatically when it fails, but that isn't always the case.
Does this make sense?
There may also be times where a stop all makes sense, and rather than having to change count, deploy, change count, deploy (this equals minutes on the command line or minutes waiting for pipeline executions to complete), I just want to --stop-all and let the scheduler replace the tasks.
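For reference, the behavior requested above (stop one misbehaving task, or all of them, and let the ECS service scheduler bring up replacements) can be approximated outside Copilot with the ECS API. The following is a minimal sketch, not Copilot's implementation. It assumes a boto3-style ECS client is passed in as `ecs`; the function name `stop_service_tasks` and the default reason string are hypothetical.

```python
from typing import Iterable, List, Optional


def stop_service_tasks(ecs, cluster: str, service: str,
                       task_ids: Optional[Iterable[str]] = None,
                       reason: str = "stopped manually for troubleshooting") -> List[str]:
    """Stop selected tasks of a service, or all running tasks if task_ids is None.

    `ecs` is assumed to be a boto3 ECS client, e.g. boto3.client("ecs").
    The service's desired count is not changed, so the scheduler is
    expected to launch replacements for whatever gets stopped here.
    Returns the ARNs of the tasks for which a stop was requested.
    """
    # Collect the ARNs of the service's running tasks, following pagination.
    arns: List[str] = []
    kwargs = {"cluster": cluster, "serviceName": service, "desiredStatus": "RUNNING"}
    while True:
        page = ecs.list_tasks(**kwargs)
        arns.extend(page.get("taskArns", []))
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token

    if task_ids is not None:
        wanted = set(task_ids)
        # A task ARN ends with "/<task-id>", so match on the last path segment.
        arns = [arn for arn in arns if arn.rsplit("/", 1)[-1] in wanted]

    for arn in arns:
        ecs.stop_task(cluster=cluster, task=arn, reason=reason)
    return arns
```

With a real client this could be called as `stop_service_tasks(boto3.client("ecs"), "my-cluster", "my-service", task_ids=["1234"])`; leaving out `task_ids` stops every running task in the service. The cluster and service names are placeholders, since Copilot generates its own names for these resources.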
Gotcha! that totally makes sense, thanks for the clarification :D
Hello, When is this feature slated to be released?
It's such a waste of time to wait while failed tasks keep trying to deploy endlessly.
If a task fails to deploy, we would like to configure the number of deployment attempts and then roll back to the previous known state.
Hello @vredcloud 👋🏼 ! Thank you for following up! Do you think these ⬇️ three issues are more related to your ask?
- #3061
- #2672
- #2608
This particular proposal is less concerned with stopping a failed deployment than with stopping a few specific tasks in a running service. From your description, it seems to me that you'd like to configure circuit breaker retries. Please correct me if I'm mistaken!
If my understanding ⬆️ is correct: unfortunately, ECS does not currently provide a way to configure circuit breaker retries/timeout. If you don't mind, please give https://github.com/aws/containers-roadmap/issues/1247 a thumbs-up!
Is there a way to scale down everything from the CLI to save costs overnight/between dev sessions? This issue seems like the closest match, but as other commenters have said, manually changing count in the manifest isn't a very user-friendly UX.
A suspend command or something that did this as a convenience would be helpful. Even better would be a teardown command to be able to destroy all resources in an env and be able to relaunch them instead of having to keep deleting and initialising everything like I'm doing at the moment.
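Until something like a suspend command exists, the overnight scale-down described above can be approximated by changing the service's desired count directly through the ECS API. This is a minimal sketch under stated assumptions: `ecs` is a boto3-style ECS client, and `suspend_service` and `resume_service` are hypothetical helper names. It is an out-of-band change that Copilot does not know about, so a later deployment may set the count back to whatever the manifest says, and any auto scaling configured on the service may also change it again.

```python
def suspend_service(ecs, cluster: str, service: str) -> int:
    """Scale a service to zero tasks and return its previous desired count.

    `ecs` is assumed to be a boto3 ECS client, e.g. boto3.client("ecs").
    """
    described = ecs.describe_services(cluster=cluster, services=[service])
    services = described.get("services", [])
    if not services:
        raise ValueError(f"service {service!r} not found in cluster {cluster!r}")
    previous = services[0]["desiredCount"]
    ecs.update_service(cluster=cluster, service=service, desiredCount=0)
    return previous


def resume_service(ecs, cluster: str, service: str, desired_count: int) -> None:
    """Restore a service to the desired count saved by suspend_service."""
    ecs.update_service(cluster=cluster, service=service, desiredCount=desired_count)
```

The caller has to keep the value returned by `suspend_service` somewhere (a file, a tag, a shell variable) to pass to `resume_service` later. This only stops the tasks; other resources in the environment keep running.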
Hi @boosh, Unfortunately, currently we do not have any other way than the ones mentioned above to stop the running tasks. But your use case and suggestions are really helpful for us to prioritize this feature. Thank you for your input; we will keep you posted when we prioritize this one.
Hello! We're going to be using Copilot extensively within our department, which will mean many, many developers waiting needlessly for retries to complete. I notice that this issue is still unassigned. Is anyone from the AWS Copilot team going to be assigned to this issue? We really need this to be pushed to the top of the priority list. Thanks!