Sharing ECS "backend worker" task definition across two services with different entry commands

Open Offlein opened this issue 8 months ago • 7 comments

I have a "backend-service" type Copilot service that I use for my Laravel "queue worker" processes. I need to add in something now to handle regularly-scheduled tasks. Currently I'm configuring it to have a daily check for something that will enqueue long-running tasks for the queue worker[s] to take.

However, these both (and also my actual web-accessible application, but that's a much different workflow) use my identical Laravel app.

I'm thinking it's appropriate to have a "scheduled-tasks" Copilot service that is completely identical to my existing "worker" Copilot service, except that my worker will have a manifest line like this:

command: ["su", "webuser", "-c", "php artisan queue:listen --timeout=1200 --queue=default"]

And the scheduled tasks service would have one with a slightly different PHP command:

command: ["su", "webuser", "-c", "php artisan schedule:work"]

I could simply create a new Copilot service and replicate the manifest. I think it'd also be possible to use the same manifest file but override the command... But in both of those cases I think it will generate a completely new ECS "Task Definition" -- assuming I got the terminology right -- when it actually is maybe possible to share the same Task Definition, and just run two with different entry commands.

Am I way off on this? Thanks so much!

Offlein avatar Dec 04 '23 22:12 Offlein

hey @Offlein!

But in both of those cases I think it will generate a completely new ECS "Task Definition"

I think you are mostly right about this - the task definition would be almost identical for the worker and the scheduled task. The only exceptions that I can think of:

  1. If your worker service's publish field is configured, then its task role will need additional permissions to publish to topics, which means you necessarily need two different task definitions, one pointing to the worker service's task role (with the publish permission), and the other to the scheduled task's task role (without the publish permission). Otherwise, you could have the same task role for both workloads, and thus the same task definition.
  2. For the worker service, likely you have exec: true; whereas you don't have the option to configure exec: true in your scheduled job manifest. This would also result in a different task definition.

In general, I think you are right that they could share the same task definition if you neither publish to topics nor need to exec into your worker service.

The ECS task definitions are free of charge though, i.e. even if you have two identical task definitions, you won't pay extra. You pay for the memory/CPU usage of the actual tasks running. Can you help me understand why you'd like to share the task definitions? Or, rather than task definitions, are you looking to share the manifest between services (i.e. the scheduled job and the worker service would use the same manifest.yml)?

Lou1415926 avatar Dec 06 '23 20:12 Lou1415926

@Lou1415926 Thanks so much for the reply.

If your worker service's publish field is configured, ...

Nope, luckily not here!

For the worker service, likely you have exec: true; whereas you don't have the option to configure exec: true in your scheduled job manifest. This would also result in a different task definition.

Sorry, I actually may not understand this. I DO have exec: true in my worker manifest... And I was planning to have that true in the scheduled-job manifest (by virtue of the two being identical except for the command: line), but maybe that's a mistake? IIRC "exec" is a thing that lets me access a shell within the service; is that right? I have not actually used it but it feels as though the use-case would be similar enough that I'd want it [or not want it] for both.

Can you help me understand why you'd like to share the task definitions? Or, rather than task definitions, are you looking to share the manifest between services (i.e. the scheduled job and the worker service would use the same manifest.yml)?

This is definitely part of it, yes!

I am not sure if it matters so much to me, in all honesty. One thing that feels "wrong" about it is actually how deploys would work.

In my GitHub Action it takes 7-10 minutes to deploy our "worker" service (as well as the App Runner web service), although they do run in parallel. It just feels "wasteful" to run a full deploy that does a whole bunch of processing a second time for something that is already 98% of the way there. There's a monetary cost (I assume?) with our CI process for that (although it's not a major concern).

Moreso, now that I think about it maybe, I believe Copilot will create a ton of boilerplate things within our AWS account that are specific to the new "scheduled-task" service which, really, will be identical to the "worker" service, right? Like user roles, CloudWatch logs, things like that.

Not the end of the world, I guess. But it feels like it'd be preferable to have the command: line parameterized so that, when the copilot deploy occurs, it would create one running task with Parameter_A and another with Parameter_B, indicating which initial command to run.
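One way to approximate this today, outside of Copilot itself, might be to generate the near-identical manifests from a single template. The sketch below is a hypothetical helper, not a Copilot feature: the names `MANIFEST_TEMPLATE`, `SERVICES`, and `render_manifest`, as well as the Dockerfile path and the cpu/memory/count values, are assumptions for illustration. Only the two command: values come from this thread. Note that it still yields two Copilot services (and two task definitions); it only removes the hand-copied duplication.

```python
import json
from string import Template

# Shared manifest body; only $name and $command vary per service.
# The image, cpu, memory and count values are illustrative assumptions.
MANIFEST_TEMPLATE = Template("""\
name: $name
type: Backend Service

image:
  build: Dockerfile

command: $command

cpu: 256
memory: 512
count: 1
exec: true
""")

# Service name -> artisan command (both commands are taken from the thread).
SERVICES = {
    "worker": "php artisan queue:listen --timeout=1200 --queue=default",
    "scheduled-tasks": "php artisan schedule:work",
}


def render_manifest(name: str, artisan_command: str) -> str:
    """Render one manifest; everything except name and command is shared."""
    # json.dumps produces a double-quoted list, which is also a valid
    # YAML flow sequence, e.g. ["su", "webuser", "-c", "php artisan ..."].
    command = json.dumps(["su", "webuser", "-c", artisan_command])
    return MANIFEST_TEMPLATE.substitute(name=name, command=command)


if __name__ == "__main__":
    # Print each rendered manifest so the output can be inspected or redirected.
    for service_name, artisan_command in SERVICES.items():
        print(f"# ---- {service_name} ----")
        print(render_manifest(service_name, artisan_command))
```

Each rendered string could then be written to that service's manifest file (assuming the usual copilot/<service>/manifest.yml layout) as a step before the deploy runs.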

Offlein avatar Dec 06 '23 21:12 Offlein

IIRC "exec" is a thing that lets me access a shell within the service; is that right?

That is correct - you can run copilot svc exec --command to run a single command in the service, or copilot svc exec to start a shell session inside the service.

And I was planning to have that true in the scheduled-job manifest (by virtue of the two being identical except for the command: line), but maybe that's a mistake?

The only problem is that exec does not exist in a scheduled job's manifest : ( The scheduled jobs are one-off tasks that go away once they finish the job, so you would have a limited period of time to be able to exec into it anyway 😞

It just feels "wasteful" to run a full deploy that runs a whole bunch of processing to do something a second time that is already 98% of the way there.

Got it! I suspect that the majority of the 7-10 minutes spent on deploying the worker service is on the ECS Service itself 💭

[Screenshot 2023-12-07 at 3 06 11 PM] [Screenshot 2023-12-07 at 3 06 41 PM]

Looking at the leftmost column - these peripheral resources are typically super quick (~10 seconds) to create/update. This means that even if Copilot reuses the same task definition, roles, log groups, etc., I suspect that the marginal benefit would probably be quite small.

There are other resources that do take time, like the custom resource EnvController, which cannot be shared by workloads : (.

Lou1415926 avatar Dec 07 '23 23:12 Lou1415926

@Lou1415926 Thank you again.

One point of confusion is that I'm actually not creating a Scheduled Job's manifest. I'm adding another "backend-service" service. It would run permanently and handle its own orchestration of my tasks throughout the day, as opposed to using the Copilot "scheduled job" service.

But it sounds like either way I still can't get it all the way where I want (right?) so maybe I'll just bite the bullet and copy the manifest!

Offlein avatar Dec 07 '23 23:12 Offlein

ah got it! Do you think https://github.com/aws/copilot-cli/issues/4122 would have addressed your concern?

Lou1415926 avatar Dec 07 '23 23:12 Lou1415926

That would be #2699 actually, right? I think #2699 would have addressed the concern, yes. (I have thumbed it up!) 🙂

Offlein avatar Dec 08 '23 00:12 Offlein

👍🏼 to this and #2699 - more than anything it's just very unDRY to have a bunch of manifests that are almost identical

ssyberg avatar Jan 24 '24 15:01 ssyberg

This issue is stale because it has been open 60 days with no response activity. Remove the stale label, add a comment, or this will be closed in 14 days.

github-actions[bot] avatar Mar 25 '24 00:03 github-actions[bot]

This issue is closed due to inactivity. Feel free to reopen the issue if you have any further questions!

github-actions[bot] avatar Apr 09 '24 00:04 github-actions[bot]

Can we keep this open? It's still a pretty brittle setup, we now have 4 manifests that all need to be maintained identically and it's leading to issues!
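Until something like #2699 exists, a small CI check might at least catch the manifests drifting apart. This is only a sketch under stated assumptions: `shared_lines` and `find_drift` are hypothetical helper names, and the check assumes the manifests are meant to differ only in their top-level name: and command: lines. It compares text line by line rather than parsing YAML, so reordered keys or changed comments would also count as drift.

```python
# Top-level keys that are allowed to differ between the manifests.
VARYING_KEYS = ("name:", "command:")


def shared_lines(manifest_text: str) -> list[str]:
    """Return the non-blank lines that should be identical across services."""
    return [
        line.rstrip()
        for line in manifest_text.splitlines()
        # Only unindented name:/command: lines are skipped, so nested keys
        # with the same names are still compared.
        if line.strip() and not line.startswith(VARYING_KEYS)
    ]


def find_drift(manifests: dict[str, str]) -> list[str]:
    """Return names of manifests whose shared lines differ from the first one."""
    items = list(manifests.items())
    if not items:
        return []
    reference = shared_lines(items[0][1])
    return [name for name, text in items[1:] if shared_lines(text) != reference]
```

In CI this could be fed the contents of each service's manifest file, failing the build whenever `find_drift` returns a non-empty list.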

ssyberg avatar Apr 16 '24 10:04 ssyberg