containers-roadmap
[ECS] [RunTask]: logConfiguration Override
Tell us about your request
Extend the overrides parameter of RunTask and StartTask with logConfiguration.
"overrides": {
  "containerOverrides": [{
    "command": ["string"],
    "cpu": number,
    "environment": [{
      "name": "string",
      "value": "string"
    }],
    "memory": number,
    "memoryReservation": number,
    "name": "string",
    "logConfiguration": {
      "logDriver": "configurable",
      "options": {
        "awslogs-group": "configurable",
        "awslogs-region": "configurable",
        "awslogs-stream-prefix": "configurable"
      }
    }
  }],
  "executionRoleArn": "string",
  "taskRoleArn": "string"
}
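To make the request concrete, here is a sketch of how a caller might assemble the proposed override payload. Note this is the requested shape, not an API that exists today; the container name, log group, and prefix are all hypothetical.

```python
# Sketch only: builds the PROPOSED RunTask overrides shape from this issue.
# ECS RunTask does NOT currently accept logConfiguration here.
def build_proposed_overrides(container, log_group, region, prefix):
    return {
        "containerOverrides": [{
            "name": container,  # hypothetical container name
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": log_group,
                    "awslogs-region": region,
                    "awslogs-stream-prefix": prefix,
                },
            },
        }]
    }

overrides = build_proposed_overrides("app", "/ecs/one-off", "eu-central-1", "cron")
```

With such an override, each one-off RunTask invocation could direct its logs to its own group or prefix without a separate task definition.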
Which service(s) is this request for? ECS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? When running one-off tasks from a copy of the task definition of an existing running service, the logs end up in the same CloudWatch log group as the live service. This makes it hard to look up the logs in CloudWatch.
I'm also using RunTask for CloudWatch cron events which trigger RunTask through a Lambda. Having a separate CloudWatch log group would really make it easier to keep the logs of those cron tasks in a separate group.
Are you currently working around this issue? Searching for keywords in CloudWatch.
Additional context: Great work on the roadmap!
Any update on this? Is this possible today?
👍 for this suggestion
This would be extremely useful
This is partly possible with FireLens.
FireLens takes the options in your logConfiguration and sends them directly to Fluentd/Fluent Bit. You can use the fact that Fluent Bit and Fluentd support using environment variables in configuration to enable overrides.
Here are the key sections of an example task definition:
{
"family": "firelens-overrides",
"containerDefinitions": [
{
"essential": true,
"image": "906394416424.dkr.ecr.ap-south-1.amazonaws.com/aws-for-fluent-bit:latest",
"name": "log_router",
"firelensConfiguration": {
"type": "fluentbit"
},
"environment": [
{ "name": "NAME", "value": "cloudwatch" },
{ "name": "REGION", "value": "ap-south-1" },
{ "name": "LOG_STREAM", "value": "test" },
{ "name": "LOG_GROUP", "value": "env_var_interpolation_example" }
]
},
{
"essential": true,
"image": "1111111111111.dkr.ecr.ap-south-1.amazonaws.com/app-image:latest",
"name": "app",
"logConfiguration": {
"logDriver":"awsfirelens",
"options": {
"Name": "${NAME}",
"region": "${REGION}",
"log_group_name": "${LOG_GROUP}",
"auto_create_group": "true",
"log_stream_name": "${LOG_STREAM}"
}
}
}
]
}
As you can see, the log group, region, and log stream are all set via environment variables on the log router container. Environment variables can be overridden in RunTask, so you can re-use this task definition multiple times and change some of the log parameters each time you run it.
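For illustration, the per-invocation environment override described above could be assembled like this (a sketch; the cluster and task-definition names in the commented-out call are hypothetical, and the boto3 call itself is not executed since it needs real AWS resources):

```python
# Build per-invocation environment overrides for the FireLens log router,
# so each RunTask call can point logs at a different group/stream.
def log_router_env_override(log_group, log_stream):
    return {
        "containerOverrides": [{
            "name": "log_router",  # matches the container name in the task def above
            "environment": [
                {"name": "LOG_GROUP", "value": log_group},
                {"name": "LOG_STREAM", "value": log_stream},
            ],
        }]
    }

overrides = log_router_env_override("one-off-tasks", "batch-42")

# With boto3 (not executed here), the call would be roughly:
# ecs = boto3.client("ecs")
# ecs.run_task(cluster="my-cluster", taskDefinition="firelens-overrides",
#              overrides=overrides)
```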
+1 for this one :)
Thank you @PettitWesley for this solution. It works well with ECS tasks using the EC2 launch type, but no log is created in CloudWatch when launching via Fargate. It seems to be due to the "networkMode": "awsvpc" required by Fargate, because when I launch a task with the EC2 launch type and "networkMode": "awsvpc", it also doesn't work... Do you have any idea? I checked, and it has nothing to do with IAM rights.
The post from @etiennecaldichoury is very helpful (thanks!) and I'm eager to implement this on one of my projects. However, my use case also requires Fargate, and I see that might be a blocker.
Anyone else on this thread have info on whether this is compatible with Fargate or any special steps to enable?
From this AWS FireLens docs page, I compiled the following excerpts:
- Fargate is supported: "FireLens for Amazon ECS is supported for tasks using both the Fargate and EC2 launch types."
- Don't specify TCP forward input: "In your custom configuration file, for tasks using the bridge or awsvpc network mode, you should not set a Fluentd or Fluent Bit forward input over TCP because FireLens will add it to the input configuration."
- Task execution IAM role required if using ECR or Secrets Manager: "If your task uses the Fargate launch type and you are pulling container images from Amazon ECR or referencing sensitive data from AWS Secrets Manager in your log configuration, then you must include the task execution IAM role."
- FireLens config files on S3 not supported with Fargate: "For tasks using the Fargate launch type, the only supported config-file-type value is file."
@etiennecaldichoury - Of these restrictions, the only possibility I can see that might apply to your case would be the third item regarding the need for a task execution role - which governs permissions/access on setting up the containers before the policy specified by TaskRole takes over. Can you confirm and let us know if this might be the problem?
Thanks!
Agree with @aaronsteers - this approach should work on both EC2 and Fargate; nothing about it is EC2-specific AFAIK.
Hi @aaronsteers @PettitWesley
Thanks a lot for your very quick replies! Here is some more information (sorry for the long post :) )
Concerning your points n°2 and n°4: I don't use any configuration file; configuration is done via the task definition below (I have removed/hidden critical information).
{
"containerDefinitions": [
{
"essential": true,
"image": "906394416424.dkr.ecr.eu-central-1.amazonaws.com/aws-for-fluent-bit:latest",
"name": "log_router",
"firelensConfiguration": {
"type": "fluentbit"
},
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "<awslogs-group>",
"awslogs-region": "eu-central-1",
"awslogs-stream-prefix": "log_router"
}
},
"environment": [
{
"name": "LOG_PREFIX",
"value": "ecs"
}
]
},
{
"logConfiguration": {
"logDriver": "awsfirelens",
"options": {
"Name": "cloudwatch",
"region": "eu-central-1",
"log_group_name": "<log_group_name>",
"log_stream_prefix": "${LOG_PREFIX}/",
"log_key": "log"
}
},
"portMappings": [
{
"protocol": "tcp",
"containerPort": 5000
}
],
"image": "<id>.dkr.ecr.eu-central-1.amazonaws.com/<container>:<tag>",
"essential": true,
"name": "application"
}
],
"family": "<family>",
"executionRoleArn": "arn:aws:iam::<id>:role/ecsTaskExecutionRole",
"cpu": "256",
"memory": "1024",
"networkMode": "awsvpc",
"requiresCompatibilities": [
"FARGATE"
]
}
Concerning n°3, IAM rights should normally be fine, because when I switch my container back to the "awslogs" driver (instead of "awsfirelens"), it writes logs correctly to CloudWatch! Moreover, my EC2 and Fargate task definitions share the same task execution IAM role, and one works while the other does not.
It seems the issue comes from the "awsvpc" network mode. Please find the "log_router" container logs below.
With EC2 task ("application" container log is written to CloudWatch):
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
AWS for Fluent Bit Container Image Version 2.2.0
tput: No value for $TERM and no -T specified
Fluent Bit v1.3.9
Copyright (C) Treasure Data
[2020/03/10 07:11:29] [ info] [storage] version=1.0.1, initializing...
[2020/03/10 07:11:29] [ info] [storage] in-memory
[2020/03/10 07:11:29] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/03/10 07:11:29] [ info] [engine] started (pid=1)
[2020/03/10 07:11:29] [ info] [in_fw] listening on unix:///var/run/fluent.sock
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_group = 'pys-flask-dev-log'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_stream_prefix = 'ecs/'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_stream = ''\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter region = 'eu-central-1'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_key = 'log'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter role_arn = ''\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter auto_create_group = 'false'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter endpoint = ''\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter credentials_endpoint = \n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_format = ''\n"
[2020/03/10 07:11:29] [ info] [in_fw] binding 0.0.0.0:24224
[2020/03/10 07:11:29] [ info] [in_tcp] binding 127.0.0.1:8877
[2020/03/10 07:11:29] [ info] [sp] stream processor started
[engine] caught signal (SIGTERM)
[2020/03/10 07:11:48] [ info] [input] pausing forward.0
[2020/03/10 07:11:48] [ info] [input] pausing forward.1
[2020/03/10 07:11:48] [ info] [input] pausing tcp.2
[2020/03/10 07:12:08] [ warn] [engine] service will stop in 5 seconds
[2020/03/10 07:12:12] [ info] [engine] service stopped
[2020/03/10 07:12:12] [ info] [input] pausing forward.0
[2020/03/10 07:12:12] [ info] [input] pausing forward.1
[2020/03/10 07:12:12] [ info] [input] pausing tcp.2
With EC2 task and networkMode "awsvpc" ("application" container log is NOT written to CloudWatch):
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
AWS for Fluent Bit Container Image Version 2.2.0
tput: No value for $TERM and no -T specified
Fluent Bit v1.3.9
Copyright (C) Treasure Data
[2020/03/10 07:11:29] [ info] [storage] version=1.0.1, initializing...
[2020/03/10 07:11:29] [ info] [storage] in-memory
[2020/03/10 07:11:29] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/03/10 07:11:29] [ info] [engine] started (pid=1)
[2020/03/10 07:11:29] [ info] [in_fw] listening on unix:///var/run/fluent.sock
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_group = 'pys-flask-dev-log'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_stream_prefix = 'ecs/'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_stream = ''\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter region = 'eu-central-1'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_key = 'log'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter role_arn = ''\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter auto_create_group = 'false'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter endpoint = ''\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter credentials_endpoint = \n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_format = ''\n"
[2020/03/10 07:11:29] [ info] [in_fw] binding 127.0.0.1:24224
[2020/03/10 07:11:29] [ info] [in_tcp] binding 127.0.0.1:8877
[2020/03/10 07:11:29] [ info] [sp] stream processor started
[engine] caught signal (SIGTERM)
[2020/03/10 07:11:48] [ info] [input] pausing forward.0
[2020/03/10 07:11:48] [ info] [input] pausing forward.1
[2020/03/10 07:11:48] [ info] [input] pausing tcp.2
[2020/03/10 07:12:08] [ warn] [engine] service will stop in 5 seconds
[2020/03/10 07:12:12] [ info] [engine] service stopped
[2020/03/10 07:12:12] [ info] [input] pausing forward.0
[2020/03/10 07:12:12] [ info] [input] pausing forward.1
[2020/03/10 07:12:12] [ info] [input] pausing tcp.2
The only difference seems to be: [in_fw] binding 0.0.0.0:24224 -> [in_fw] binding 127.0.0.1:24224. It seems ECS changes the injected HOST depending on network mode, which looks normal.
With Fargate task ("application" container log is also NOT written to CloudWatch):
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
AWS for Fluent Bit Container Image Version 2.2.0
tput: No value for $TERM and no -T specified
Fluent Bit v1.3.9
Copyright (C) Treasure Data
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter log_group = 'pys-flask-dev-log'\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter log_stream_prefix = 'ecs/'\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter log_stream = ''\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter region = 'eu-central-1'\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter log_key = 'log'\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter role_arn = ''\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter auto_create_group = 'false'\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter endpoint = ''\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter credentials_endpoint = \n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter log_format = ''\n"
[2020/03/10 07:20:37] [ info] [storage] version=1.0.1, initializing...
[2020/03/10 07:20:37] [ info] [storage] in-memory
[2020/03/10 07:20:37] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/03/10 07:20:37] [ info] [engine] started (pid=1)
[2020/03/10 07:20:37] [ info] [in_fw] listening on unix:///var/run/fluent.sock
[2020/03/10 07:20:37] [ info] [in_fw] binding 127.0.0.1:24224
[2020/03/10 07:20:37] [ info] [in_tcp] binding 127.0.0.1:8877
[2020/03/10 07:20:37] [ info] [sp] stream processor started
[engine] caught signal (SIGTERM)
[2020/03/10 07:21:10] [ info] [input] pausing forward.0
[2020/03/10 07:21:10] [ info] [input] pausing forward.1
[2020/03/10 07:21:10] [ info] [input] pausing tcp.2
time="2020-03-10T07:21:11Z" level=error msg="[cloudwatch 0] NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\n"
[2020/03/10 07:21:11] [ warn] [engine] service will stop in 5 seconds
[2020/03/10 07:21:11] [ warn] [engine] failed to flush chunk '1-1583824870.364332508.flb', retry in 11 seconds: task_id=0, input=forward.0 > output=cloudwatch.1
[2020/03/10 07:21:15] [ info] [engine] service stopped
[2020/03/10 07:21:15] [ info] [input] pausing forward.0
[2020/03/10 07:21:15] [ info] [input] pausing forward.1
[2020/03/10 07:21:15] [ info] [input] pausing tcp.2
Got this error message:
time="2020-03-10T07:21:11Z" level=error msg="[cloudwatch 0] NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\n"
I really don’t understand what is happening.
@etiennecaldichoury The old awslogs driver uses the Task Execution Role (which is for stuff managed by us), while FireLens uses the Task Role (which is for the containers in your task, and the FireLens sidecar is one of those containers).
I don't see a Task Role in your Task Definition; I suspect that's the issue.
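To make the distinction concrete, a minimal sketch of the two role fields in a task definition (the role names and account ID are placeholders, not from the thread):

```python
# The two roles do different jobs:
#   executionRoleArn - assumed by ECS itself (pulling images, the awslogs driver)
#   taskRoleArn      - assumed by YOUR containers, including the FireLens sidecar,
#                      so it needs CloudWatch Logs permissions for awsfirelens
task_definition_fields = {
    "executionRoleArn": "arn:aws:iam::<id>:role/ecsTaskExecutionRole",
    "taskRoleArn": "arn:aws:iam::<id>:role/ecsTaskRole",  # placeholder role name
}
```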
@PettitWesley thanks a lot I'll try today
@PettitWesley working perfectly...! Now I know "Task Execution" and "Task" roles are different things :D Thanks a lot again
Closing this issue with the FireLens solution provided. Let us know if that solution is not valid by reopening the issue
@srrengar Even though the FireLens solution works, it would be more convenient to override logConfiguration variables at the task level. It would make it much easier to have Scheduled Tasks route log traffic to a separate log group or prefix in CloudWatch Logs if necessary. The main reason I didn't like the FireLens solution is that the log message in CloudWatch Logs is not clean, simple log output; it contains a full JSON string for every stdout message.
{
"container_id": "xxxxxx",
"container_name": "/ecs-xxxx-1-xxxx-d2d2f1a6b5ca8abd1900",
"ec2_instance_id": "i-xxxx",
"ecs_cluster": "xxxx",
"ecs_task_arn": "arn:aws:ecs:us-east-1:xxxx:task/xxxx",
"ecs_task_definition": "xxxx:1",
"log": "________Log message here________",
"source": "stdout"
}
@lafraia You can make FireLens give you the simple log output. Add one extra option in your logConfiguration: log_key with value log. Then it will just send the value of the log key.
https://github.com/aws/amazon-cloudwatch-logs-for-fluent-bit#plugin-options
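For example, the awsfirelens options from the earlier task definition would gain one entry (a sketch; the group/region values are placeholders):

```python
# awsfirelens options with log_key added: the CloudWatch output plugin then
# sends only the value of the "log" key instead of the full JSON record.
firelens_options = {
    "Name": "cloudwatch",
    "region": "us-east-1",        # placeholder
    "log_group_name": "my-log-group",  # placeholder
    "log_stream_prefix": "ecs/",
    "log_key": "log",             # the one extra option
}
```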
I don't like this solution and would much rather see a way to override the configuration on the existing task. I don't want to introduce yet another AWS service into the mix just to include a "batch id" as part of the logging group for jobs so I can easily identify them.
Could this be reconsidered? An implementation of #360 would work too!
Me too. I would like to be able to set logConfiguration on containerOverrides.
Please reopen. This is not fixed; the FireLens thing is a workaround.
In my case, I've got a single task definition that is used by multiple scheduled tasks, with different args. It sure would be nice to be able to tell those apart.
Please reopen!!! This is not fixed; the FireLens thing is a workaround.
Lazy AWS team closing issues because there is a workaround
As mentioned by previous visitors to this topic, the provided workaround is not a solution. Having to install and configure an entirely different logging engine to get deterministic/customisable-on-invoke logging settings should not be an expectation.
@nathanpeck @tabern @maishsk Apologies for the random ping, but I noticed that the original closer of this topic does not appear to be active on GitHub. Would you consider reopening this?
Please re-open. The ability to override the container command and cpu/memory but not the logging just turns this into a maintenance nightmare, with some configuration set at RunTask and some set in the task definition.
Could you please reopen this proposed feature? It would greatly improve workflows using tools like AWS Step Functions, since you could dynamically decide the log group name and stream when running a task multiple times against the same container / task definition. Right now you need to create multiple task definitions for this reason alone, which makes the rest of the RunTask parametrization useless. Thank you very much!
I would also like to see this re-opened. I use ECS for batch processing and would like to include the batch ID in the log path for easier identification. The FireLens solution is overly complicated and shouldn't be necessary.
+1, this would be very useful for my use case of debugging individual Fargate tasks that normally have CloudWatch Logs disabled in the task definition. We would enable logs on a per-task basis when an administrator starts tasks at runtime and needs to check the log output.
Adding myself to the chorus of people who want this reopened, I keep having to make identical tasks that only differ in the log configuration.
+1 for this. When running different tasks from the same image, the prefix is pointless without being able to override.
+1 can we please add this logConfiguration Override, it's important for us
Please re-open +1