ECS Step Template - Unable to specify FirelensConfiguration in Container Definition

Open FinnianDempsey opened this issue 2 years ago • 8 comments

Team

  • [X] I've assigned a team label to this issue

Severity

Blocking multiple customers - Unable to use firelens log driver for ecs

Version

2022.4.1372-hotfix.3716

Latest Version

I could reproduce the problem in the latest build

What happened?

When configuring the awsfirelens log driver for the container definition using the Deploy Amazon ECS Service template, CloudFormation will fail the deployment with an error: When awsfirelens log driver is specified in log configuration, a firelens configuration object must be configured in one of the containers

There isn't any way to configure the required FirelensConfiguration object in the container definition in order to use that log driver.

Reproduction

  • Configure a Deploy Amazon ECS Service step template.
  • Under the Container Definition, set the Log Driver to awsfirelens.
  • Deploy a release and see the error in CloudFormation.

Error and Stacktrace

Resource handler returned message: "Invalid request provided: Create TaskDefinition: When awsfirelens log driver is specified in log configuration, a firelens configuration object must be configured in one of the containers. (Service: AmazonECS; Status Code: 400; Error Code: ClientException; Request ID: XXXXXXX; Proxy: null)" (RequestToken: XXXXXXX, HandlerErrorCode: InvalidRequest)

More Information

  • Internal Link - Discourse
  • Internal Link - Zendesk
  • Internal Link - Slack
  • Internal Link - Uservoice

Workaround

Currently the only way to configure FirelensConfiguration is to use the CloudFormation Step Template with the CloudFormation template being sourced from a package.

See this forum post for more info

FinnianDempsey avatar Oct 05 '22 05:10 FinnianDempsey

Hi @FinnianDempsey, do you have any news regarding this issue? Thanks in advance!

esaporski avatar Oct 24 '22 03:10 esaporski

Hey @esaporski, no worries!

It looks like a solution is definitely being worked on; however, I can't give an exact timeframe for when it will be released. There are a few things to iron out before it can be implemented, but since it will be pushed as an update to the Step Template and won't require a new build of Octopus, it shouldn't be too far away!

FinnianDempsey avatar Oct 24 '22 05:10 FinnianDempsey

Hi @FinnianDempsey, sorry to bother you again, but do you think you have a more precise ETA on that issue? My manager is asking me to implement Graylog with Fluentd/Fluentbit on various ECS Services, and that would be a lot of work if I need to implement it using the CloudFormation template. Thanks again!

esaporski avatar Nov 09 '22 18:11 esaporski

Hi @esaporski, I'm the product manager for the team responsible for the maintenance of the ECS steps.

We've had a look into this and are considering a small change to enable this support. The changes we are considering would mean:

  • Users can configure the firelensConfiguration section for a container, allowing them to configure a log_router container manually.
  • The user will be responsible for ensuring the container has an appropriate image configured but will have full control over the container.

This would be a reasonably flexible approach but may introduce new opportunities for misconfiguration, some of which we're not sure can be detected with our current validation mechanisms. i.e. we might not be able to warn the user if they configure any of these scenarios:

  • Configuring a container to use awsfirelens and not configuring a corresponding log_router container with a firelensConfiguration.
  • Configuring multiple containers with firelensConfiguration (This might be valid, but it wouldn't surprise me if it weren't).
  • Configuring a container with firelensConfiguration but not providing an appropriate fluentd or fluentbit container image.
  • Any other container configuration that might be incompatible with the firelensConfiguration
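The first three scenarios above could in principle be caught by a small pre-deployment check. The following is a minimal Python sketch only, not the validation Octopus actually performs. The function name `find_firelens_problems` and the image-name heuristic are hypothetical, and the input is assumed to be a list of container definitions in the ECS API's camelCase JSON shape. The fourth scenario is too open-ended to check this way.

```python
def find_firelens_problems(containers):
    """Return warnings for a list of ECS container definitions (camelCase keys assumed)."""
    problems = []

    # Containers that act as a log router, and containers that log through FireLens.
    routers = [c for c in containers if "firelensConfiguration" in c]
    users = [
        c for c in containers
        if (c.get("logConfiguration") or {}).get("logDriver") == "awsfirelens"
    ]

    # Scenario 1: awsfirelens is used, but nothing provides a firelensConfiguration.
    if users and not routers:
        problems.append("awsfirelens is used but no container has a firelensConfiguration")

    # Scenario 2: reported as a warning only, since the thread does not settle
    # whether multiple router containers are valid.
    if len(routers) > 1:
        problems.append("more than one container has a firelensConfiguration")

    for router in routers:
        name = router.get("name", "<unnamed>")
        kind = router["firelensConfiguration"].get("type")
        if kind not in ("fluentd", "fluentbit"):
            problems.append(f"{name}: unexpected firelensConfiguration type {kind!r}")
        # Scenario 3 (hypothetical heuristic): guess from the image name alone.
        if "fluent" not in router.get("image", "").lower():
            problems.append(f"{name}: image does not look like a fluentd/fluentbit image")

    return problems
```

An empty list means none of these checks fired; it does not mean the task definition is valid.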

We could certainly add plenty of note text throughout the step in appropriate locations to help point the user in the appropriate direction. Still, the user would need to set up the extra container themselves.

We are unlikely to be able to prioritise a complete solution, but this should unblock the scenario.

Please let me know if this proposal is sufficient to meet your needs. This change is dependent on your feedback. Thanks so much for your feedback so far!

rhysparry avatar Nov 09 '22 23:11 rhysparry

Hi @rhysparry, thanks for the detailed response :) The only thing I'm not sure about is whether you can configure multiple containers with firelensConfiguration. I know that if you need to send logs to multiple locations, you can create a single log_router container to do that with fluentd or fluentbit like this.

But yeah, that proposal is sufficient to meet our needs :+1: The only parameter that we cannot change currently using the Step Template is the firelensConfiguration block. And that's what we need to set to create the log_router container:

{
   "containerDefinitions":[
      {
         "essential":true,
         "image":"aws_account_id.dkr.ecr.region.amazonaws.com/custom-fluent-bit:latest",
         "name":"log_router",
         "firelensConfiguration":{
            "type":"fluentbit",
            "options":{
               "config-file-type":"file",
               "config-file-value":"/logDestinations.conf"
            }
         }
      }
   ]
}
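
For context, the log_router container only becomes useful once another container in the same task definition selects the awsfirelens log driver. A minimal Python sketch of that pairing follows. The `app` container, its image, and the absence of driver options are assumptions for illustration; the options an output needs depend on the Fluent Bit configuration file in use.

```python
import json

# The router container from the snippet above.
log_router = {
    "essential": True,
    "image": "aws_account_id.dkr.ecr.region.amazonaws.com/custom-fluent-bit:latest",
    "name": "log_router",
    "firelensConfiguration": {
        "type": "fluentbit",
        "options": {
            "config-file-type": "file",
            "config-file-value": "/logDestinations.conf",
        },
    },
}

# Hypothetical application container that sends its logs through the router.
app = {
    "essential": True,
    "image": "example/app:latest",  # placeholder image, not from the thread
    "name": "app",
    "logConfiguration": {"logDriver": "awsfirelens"},
}

task_definition = {"containerDefinitions": [log_router, app]}
print(json.dumps(task_definition, indent=3))
```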

Link to the AWS documentation on FirelensConfiguration: https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_FirelensConfiguration.html

Thanks for your attention!

esaporski avatar Nov 10 '22 13:11 esaporski

Hi @esaporski, thanks for your response. I'll work with the team to try to slot this in early in our upcoming cycle, which starts the week after next.

rhysparry avatar Nov 11 '22 05:11 rhysparry

Hi @rhysparry, sorry for bothering again, but did you have time to work on that fix? Is there an ETA for when it will be ready? Thanks again!

esaporski avatar Nov 22 '22 20:11 esaporski

Hi @esaporski, we're just starting the current work cycle, and this change is part of that plan. Our team's cycle lasts five weeks, so it should be completed in that time. Also, because the ECS step is developed on our new step package framework, we will be able to get it out faster when we do make the change.

Sorry for the delay so far. I'll be advocating to get this sorted out as soon as possible.

rhysparry avatar Nov 22 '22 21:11 rhysparry

Hi @rhysparry, any news regarding that issue? Thanks in advance!

esaporski avatar Dec 14 '22 17:12 esaporski

Hi @esaporski, I have some good news. I just finished a call with our engineer who is working on this change, and he's starting on it now. It shouldn't be long now.

rhysparry avatar Dec 15 '22 05:12 rhysparry

Hi @rhysparry, I am part of the team who is working with this ECS Step Template. I was wondering if you guys have any updates on this issue. Sorry for insisting so much but we have some teams waiting to see their logs on our centralized logging platform so we need to decide soon if we are going to wait for this issue to be solved or think about a workaround.

Thanks in advance!

nspencerh avatar Dec 28 '22 13:12 nspencerh

Hi @nspencerh, thanks for your message. We were shut down between Christmas and New Year and have just started our work for the year. Unfortunately, the engineer working on the fix fell ill during the break but expects to return sometime this week. I'll be able to give a better update on the progress when he returns.

A workaround you may wish to consider is to export the CloudFormation template, manually add the FirelensConfiguration, and then use the modified template in a "Deploy an AWS CloudFormation template" step. You can then disable the ECS step until the fix is available.
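
The manual change in this workaround could also be scripted. Below is a rough Python sketch, assuming the exported template is JSON that has already been loaded into a dict. The helper name `add_log_router` and its parameters are hypothetical; the property names follow the CloudFormation `AWS::ECS::TaskDefinition` resource (PascalCase, unlike the ECS API JSON).

```python
import copy


def add_log_router(template, image, config_file="/logDestinations.conf"):
    """Return a copy of `template` with a log_router container added to each
    AWS::ECS::TaskDefinition that does not already define a FirelensConfiguration."""
    patched = copy.deepcopy(template)  # leave the caller's template untouched

    for resource in patched.get("Resources", {}).values():
        if resource.get("Type") != "AWS::ECS::TaskDefinition":
            continue

        properties = resource.setdefault("Properties", {})
        containers = properties.setdefault("ContainerDefinitions", [])

        # Skip task definitions that already have a router container.
        if any("FirelensConfiguration" in c for c in containers):
            continue

        containers.append({
            "Name": "log_router",
            "Image": image,  # must be a fluentd or fluentbit image
            "Essential": True,
            "FirelensConfiguration": {
                "Type": "fluentbit",
                "Options": {
                    "config-file-type": "file",
                    "config-file-value": config_file,
                },
            },
        })

    return patched
```

Running the helper twice on the same template leaves it unchanged the second time, so it is safe to apply on every export.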

I will provide another update soon. In the meantime, thanks for your patience.

rhysparry avatar Jan 02 '23 23:01 rhysparry

Hi @esaporski and @nspencerh. I just wanted to provide a quick update on this issue. The engineer has developed a fix, and it is proceeding through our internal review process now. Hopefully, not too much longer.

rhysparry avatar Jan 11 '23 05:01 rhysparry

Hi @rhysparry, thank you for the update!

We'll be anxiously waiting for this fix to be deployed. Let us know when it is ready for testing.

Thank you very much!

nspencerh avatar Jan 11 '23 14:01 nspencerh

Hi @nspencerh @esaporski, thanks for your patience whilst we make this fix. Just wanted to let you know that the fix for this issue is now available in a new v2 of the Deploy Amazon ECS Service step.

This step is built using our new step package framework, so the new version of the step will automatically be downloaded to your Octopus instance without requiring a Server upgrade if you have the Step Template Updates feature enabled. You can check whether it is enabled by going to Configuration -> Features -> Step Template Updates. The feature is enabled by default, so it is likely that you will receive the new step version within the next few hours (updates are checked every hour or so). If the feature is not enabled, the new v2 step will be bundled inside a new Octopus Server version at a later time. I'll update here with the version that includes it once that is known.

As this fix is being released in a new v2 of the step, you will need to upgrade the version of any Deploy Amazon ECS Service steps you are using within your deployment processes in order to be able to enable FireLens configuration for a container.

The process for this is easy. When you go to the step within your deployment process, a banner will appear at the top of the step configuration prompting you to perform the upgrade:

image

This upgrade is generally quick, after which you can save your deployment process to complete the upgrade to v2:

image

Once the step is upgraded to v2 you will be able to enable FireLens configuration on a container as required, for example:

image

Please let us know if you encounter any issues with this new version of the step.

geofflamrock avatar Jan 18 '23 01:01 geofflamrock

I just got the chance to test it and it works :+1: Thank you!

esaporski avatar Jan 19 '23 13:01 esaporski

Great to hear @esaporski 🎉 I'll close this issue now, but do please let us know if you encounter any further issues.

geofflamrock avatar Jan 19 '23 21:01 geofflamrock

It's worth clarifying the above comment about the release of this fix.

The version of the Deploy Amazon ECS Service step that fixes this issue is 2.0.1. This step version is already available to any Octopus Server instance from 2022.1 onwards that has Step Template Updates enabled, because it is a newer step built on our new framework, which supports releasing new step versions independently of an Octopus Server version. Our documentation on Automatic Step Template Updates has more information on this.

This version of the step is also bundled into Octopus Server 2023.1.6994, so that it is available to any instances that don't have automatic updates enabled.

geofflamrock avatar Jan 20 '23 00:01 geofflamrock

:tada: The fix for this issue has been released in:

Release stream    Release
2023.1            2023.1.6994
2023.2+           all releases

Octobob avatar Feb 17 '23 06:02 Octobob

🎉🎉🎉🎉

nspencerh avatar Feb 17 '23 14:02 nspencerh