Feature Request: Support AWS Firelens with App Mesh

Open kiranmeduri opened this issue 5 years ago • 16 comments

Tell us about your request
What do you want us to build?

As a user, I want to enable access-logs in Envoy via App Mesh and have those logs be published to sinks supported by Fluentd and Fluent Bit.

AWS recently announced FireLens, which can be used to achieve this, but there is no documentation or recipe for setting it up.

Which integration(s) is this request for?

Any

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Stream Envoy access-logs to Fluentd-supported destinations.

kiranmeduri avatar Oct 18 '19 20:10 kiranmeduri

After discussing this in our weekly triage, we've come up with an initial set of action items:

Support FireLens

  • [x] Root cause and remediate bugs when launching an ECS task with FireLens and App Mesh
  • [ ] Make traffic from FireLens not transit through the Envoy proxy
  • [ ] Provide a reference example showing how to configure FireLens with App Mesh

Feature Improvements to make FireLens w/ App Mesh better

  • [ ] Allow customers to more dynamically configure access logs on a Virtual Node (e.g. JSON or templates)
  • [ ] xDS interaction logs

dastbe avatar Oct 23 '19 20:10 dastbe

@PettitWesley can you provide an update on this issue? Thanks.

kiranmeduri avatar Jan 29 '20 15:01 kiranmeduri

@kiranmeduri Current testing shows that the CloudWatch Fluent Bit plugin now works with App Mesh.

"Root cause and remediate bugs when launching an ECS task with Firelens and App Mesh":

The bugs all seem to be remediated.

"Provide a reference example showing how to configure FireLens with App Mesh":

@CarmenAPuccio has this piece.

PettitWesley avatar Feb 06 '20 19:02 PettitWesley

@kiranmeduri - The blog went live yesterday and we have the walkthroughs for EKS and ECS on Fargate/FireLens.

I can add links to those repos and the blog in aws/aws-app-mesh-examples. Would you just want a folder under /examples called fluent-bit?

CarmenAPuccio avatar Feb 27 '20 16:02 CarmenAPuccio

There is one open question here: is FireLens traffic flowing through Envoy? If so, it should not, because FireLens is monitoring Envoy itself. I would like to see whether Fluent Bit traffic can bypass Envoy. Today that is done by setting User:1337 on the container, but AFAIK that is not allowed with the FireLens container. Please confirm, @PettitWesley.

kiranmeduri avatar Mar 26 '20 19:03 kiranmeduri

@kiranmeduri Yeah, with how things work right now, the UID for the FireLens container has to be 0.

PettitWesley avatar Mar 26 '20 19:03 PettitWesley

Yep, just confirmed. If you try to set the user field on the FireLens log router, you get this:

An error occurred (ClientException) when calling the RegisterTaskDefinition operation: If 'user' field is specified on firelens container, then 'UID' has to be '0'.
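
The rule in that error message can be mirrored in a small local pre-check before calling RegisterTaskDefinition. This is only a sketch: the function name is hypothetical, and it assumes the 'user' value uses the numeric "uid" or "uid:gid" form. User names are not resolved and are conservatively reported as not accepted.

```python
def firelens_user_is_accepted(user):
    """Return True if 'user' would satisfy the UID-0 rule from the error above.

    Assumption: 'user' is None/empty (field not specified) or a numeric
    "uid" / "uid:gid" string. Named users are not resolved by this sketch.
    """
    if not user:
        # The rule only applies when the 'user' field is specified.
        return True
    uid = user.split(":", 1)[0]
    # Only a numeric UID equal to 0 passes; the GID part is not checked.
    return uid.isdigit() and int(uid) == 0


if __name__ == "__main__":
    for candidate in (None, "0", "0:1337", "1337", "1337:1337"):
        print(candidate, firelens_user_is_accepted(candidate))
```

Note that the check deliberately ignores the group part of the value, since the error message only constrains the UID.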

CarmenAPuccio avatar Mar 26 '20 20:03 CarmenAPuccio

I think the GID can be anything, though. Is there a way you can set that to bypass Envoy?

PettitWesley avatar Mar 26 '20 20:03 PettitWesley

We had issues with FireLens + App Mesh when using an output other than CloudWatch Logs (in our case Elasticsearch).

We tried a couple of things, but the only thing that worked was the following (thanks @PettitWesley):

  • set IgnoredGID to 1337 in the proxyConfiguration of the task-definition
  • run envoy container with user 1337:1337
  • run logrouter/fluentbit container with user 0:1337
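
The three settings above can be sketched as a fragment of an ECS task definition, built here as a Python dict. This is a minimal, untested sketch rather than a complete definition: the function name, container names, image arguments, the AppPorts value, and the proxy port values are placeholder assumptions. Only IgnoredGID and the two user values come from this thread.

```python
import json

BYPASS_GID = "1337"  # group whose traffic the proxy is told to ignore


def build_task_fragment(envoy_image, log_router_image):
    """Build the parts of a task definition that the workaround touches."""
    return {
        "proxyConfiguration": {
            "type": "APPMESH",
            "containerName": "envoy",  # assumed container name
            "properties": [
                # Only IgnoredGID is set; IgnoredUID is intentionally absent.
                {"name": "IgnoredGID", "value": BYPASS_GID},
                {"name": "AppPorts", "value": "8080"},  # placeholder
                {"name": "ProxyIngressPort", "value": "15000"},  # assumed
                {"name": "ProxyEgressPort", "value": "15001"},  # assumed
            ],
        },
        "containerDefinitions": [
            {
                "name": "envoy",
                "image": envoy_image,
                "user": "1337:1337",  # Envoy runs as uid 1337, gid 1337
                "essential": True,
            },
            {
                "name": "log_router",  # assumed container name
                "image": log_router_image,
                "user": "0:1337",  # uid 0 for FireLens, gid shared with Envoy
                "essential": True,
                "firelensConfiguration": {"type": "fluentbit"},
            },
        ],
    }


if __name__ == "__main__":
    fragment = build_task_fragment("<envoy-image>", "<fluent-bit-image>")
    print(json.dumps(fragment, indent=2))
```

The application container, task roles, and log configuration are left out; they would need to be added before the fragment could be registered.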

It would be great to know why this was necessary, though. It almost seems like App Mesh was interfering with the traffic between the Fargate host and the log router?

lifeofguenter avatar Dec 10 '20 21:12 lifeofguenter

@lifeofguenter We also hit this issue. The solution mentioned above works for us only on Fargate platform version 1.3.0. Once we switch to 1.4.0, logging breaks for no obvious reason: log_router/Fluent Bit does not log anything after the boot process.

thisismana avatar Jan 20 '21 21:01 thisismana

@thisismana Our solution works for us with 1.4.0.

lifeofguenter avatar Jan 26 '21 06:01 lifeofguenter

I confirm that the solution works. @thisismana make sure that you have only IgnoredGID in the proxy configuration. It was not working for me at first because I had both IgnoredUID and IgnoredGID. Thank you @lifeofguenter for this solution.

kamilhristov avatar Jan 26 '21 07:01 kamilhristov

@kamilhristov nicely spotted. We set both IgnoredGID and IgnoredUID and it did not work (failing silently). But setting only IgnoredGID: 1337 with uid:gid for envoy as 1337:1337 and fluentbit as 0:1337 did the trick. I'm so grateful ❤️
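
Since the failure was reported as silent, a small check on the task definition can flag the combination before deployment. This is a sketch based only on the reports in this thread, not on a documented restriction; the function name is hypothetical, and the input is assumed to have the shape of an ECS proxyConfiguration with a list of name/value properties.

```python
def has_both_ignore_properties(proxy_configuration):
    """Return True if both IgnoredUID and IgnoredGID carry a non-empty value.

    Assumption: 'proxy_configuration' is a dict with a "properties" list of
    {"name": ..., "value": ...} entries, as in an ECS task definition.
    """
    set_names = {
        prop.get("name")
        for prop in proxy_configuration.get("properties", [])
        if prop.get("value")  # ignore properties with an empty value
    }
    return {"IgnoredUID", "IgnoredGID"} <= set_names


if __name__ == "__main__":
    config = {
        "type": "APPMESH",
        "properties": [
            {"name": "IgnoredUID", "value": "1337"},
            {"name": "IgnoredGID", "value": "1337"},
        ],
    }
    if has_both_ignore_properties(config):
        print("Both IgnoredUID and IgnoredGID are set; the thread reports this fails silently.")
```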

thisismana avatar Jan 27 '21 16:01 thisismana

@thisismana @kamilhristov @lifeofguenter What endpoints is FireLens sending data to? AWS endpoints? VPC endpoints? Public endpoints (e.g. Datadog)?

I'm trying to figure out if setting IgnoredUID is always required with FireLens or if it depends on what endpoint FireLens needs to talk to.

PettitWesley avatar May 14 '21 20:05 PettitWesley

@PettitWesley In our case we were forwarding logs to an internal ALB.

lifeofguenter avatar May 14 '21 21:05 lifeofguenter

We are forwarding to an AWS endpoint: Kinesis Firehose.

kamilhristov avatar May 17 '21 06:05 kamilhristov