amazon-cloudwatch-logs-for-fluent-bit
enhance fluentbit process logging for plugin
When templating fails, it's very difficult to identify the source. The logs are flooded with messages like this:
time="2022-05-13T14:33:22Z" level=error msg="[cloudwatch 0] parsing log_group_name template '/eks/eks/$(kubernetes['namespace_name'])/$(kubernetes['labels']['k8s-app'])' (using value of default_log_group_name instead): k8s-app: sub-tag name not found"
but there is no way to identify where it is coming from.
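For context, an output section along these lines produces that message (a minimal sketch, not taken from the issue; the Match pattern, region, and group names are placeholders). When a record lacks the k8s-app label, the plugin logs the error and writes the record to default_log_group_name, but the message itself never says which pod produced the record:

[OUTPUT]
    Name                    cloudwatch
    Match                   kube.*
    region                  us-east-1
    log_group_name          /eks/eks/$(kubernetes['namespace_name'])/$(kubernetes['labels']['k8s-app'])
    default_log_group_name  fluent-bit-fallback
    log_stream_prefix       from-fluent-bit-
    auto_create_group       true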
@nwsparks could you please explain a bit more about what you would like to improve in the logging?
@zhonghui12 if the error log I posted (emitted when templating fails) contained the name of the pod that caused it, that would help in identifying the source of the errors.
These error logs generate a TON of volume, and in a system with many deployments it is very difficult to track down the source.
Edited the subject to make it clearer that this is about the fluentbit process logs.
Facing a similar issue. In one hour it generated 53 million records with the same "sub-tag name not found" error:

By the way, the day in the screenshot was the day we deployed the component to the cluster. We have since disabled it due to the high CloudWatch ingestion costs.
Cluster details:
EKS version: 1.22
Helm chart repo: https://aws.github.io/eks-charts
Helm chart release_name: aws-for-fluent-bit
Helm chart version: 0.1.18
Getting the same issue. I checked the logs that were forwarded to the default log group in CloudWatch, and saw that kubernetes['labels'] clearly contains the sub-tag.
Did you ever fix this?
We used to run https://github.com/DNXLabs/terraform-aws-eks-cloudwatch-logs
That uses the following config:
set {
  name  = "cloudWatch.logGroupName"
  value = "/aws/eks/${var.cluster_name}/$(kubernetes['labels']['app'])"
}
But that repo is no longer maintained, so I just installed the Helm chart https://artifacthub.io/packages/helm/aws/aws-for-fluent-bit and set the config value listed above. But now I get the following errors:
time="2023-03-21T09:49:24Z" level=error msg="[cloudwatch 0] parsing log_group_name template '/aws/eks/staging/$(kubernetes['labels']['app'])' (using value of default_log_group_name instead): app: sub-tag name not found"
@Mattie112 If you're still using the DNXLabs module, please update it to version 0.1.5. One of the changes is that instead of using the label "app", which may not exist, I've set "app.kubernetes.io/name", as it's one of the default k8s labels.
According to the plugin's documentation, a new version of the CloudWatch plugin has been released which brings better performance and other improvements.
The main issue we've seen with the old cloudwatch plugin is that it couldn't handle logs if a label did not exist in the pod definition. The new cloudwatchlogs plugin uses a logGroupTemplate rather than a fixed logGroupName; if that template cannot be resolved, it falls back to the default log group "/aws/eks/fluentbit-cloudwatch/logs".
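For the Terraform setup above, something along these lines should map onto the newer plugin via the aws-for-fluent-bit chart. This is only a sketch: the cloudWatchLogs.* value names are assumed from the chart's values file, and the template syntax differs between plugins (the old Go cloudwatch plugin uses $(...), the newer core plugin uses record-accessor syntax), so check both against the chart version you deploy.

set {
  name  = "cloudWatchLogs.enabled"
  value = "true"
}
set {
  # templated group; shown here in the newer plugin's record-accessor form (assumed)
  name  = "cloudWatchLogs.logGroupTemplate"
  value = "/aws/eks/${var.cluster_name}/$kubernetes['labels']['app.kubernetes.io/name']"
}
set {
  # fallback group used when the template cannot be resolved
  name  = "cloudWatchLogs.logGroupName"
  value = "/aws/eks/fluentbit-cloudwatch/logs"
}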
Thanks
Thanks! I have switched to https://github.com/aws/aws-for-fluent-bit, as I prefer to have something that is maintained :) I don't really see the added benefit of this chart other than saving a few lines of code.
And indeed, I am now using namespace_name instead of the app label. I would still prefer the label, but hey, the namespace is fine for 99% of our cases :)
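In case it helps anyone else, the namespace-based variant in the same Terraform style would look roughly like this (a sketch; the cloudWatchLogs.logGroupTemplate key and the template syntax are assumed, as in the comment above):

set {
  name  = "cloudWatchLogs.logGroupTemplate"
  value = "/aws/eks/${var.cluster_name}/$kubernetes['namespace_name']"
}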