yet-another-cloudwatch-exporter
Regression in 0.26.3: ALB metrics with TargetGroup dimension are not available in Prometheus
With this simple config and yace 0.25.0-alpha, the aws_alb_tg_healthy_host_count_average and aws_alb_tg_un_healthy_host_count_average metrics are available in Prometheus:
discovery:
  exportedTagsOnMetrics:
    alb:
      - Name
  jobs:
    - type: alb
      regions:
        - eu-west-1
      period: 300
      length: 300
      awsDimensions:
        - TargetGroup
      metrics:
        - name: HealthyHostCount
          statistics:
            - Average
        - name: UnHealthyHostCount
          statistics:
            - Average
But with yace version 0.26.3-alpha, those metrics are not available in Prometheus, even though the log says that they were scraped.
Here is the log from version 0.25.0-alpha: yace-0.25.0.log
And here is the log from version 0.26.3-alpha: yace-0.26.3.log
The same behavior applies to other ALB CloudWatch metrics with the TargetGroup dimension, such as HealthyHostCount, HTTPCode_Target_2XX_Count, TargetResponseTime, ...
Interestingly, metrics with the LoadBalancer dimension work in both versions, 0.25.0-alpha and 0.26.3-alpha.
We're also seeing the same problem when upgrading to 0.27. Reverting to 0.25 appears to bring the ALB metrics back again.
Happy to provide any other info if it will help with troubleshooting.
It seems these series...
aws_alb_tg_target_response_time_average
are under this name...
aws_alb_target_response_time_average
CloudWatch - Per AppELB: dimension_AvailabilityZone="" dimension_LoadBalancer="myalb" dimension_TargetGroup=""
CloudWatch - Per AppELB, per TG: dimension_AvailabilityZone="" dimension_LoadBalancer="myalb" dimension_TargetGroup="mytargetgroup"
CloudWatch - Per AppELB, per AZ: dimension_AvailabilityZone="ap-southeast-2a" dimension_LoadBalancer="myalb" dimension_TargetGroup=""
CloudWatch - Per AppELB, per AZ, per TG: dimension_AvailabilityZone="ap-southeast-2a" dimension_LoadBalancer="myalb" dimension_TargetGroup="mytargetgroup"
Depending on what combination of dimensions you want, you can filter. However, I have noticed that the tag_Labels are almost always taken from the ALB and not the TargetGroup. Might be related to https://github.com/ivx/yet-another-cloudwatch-exporter/issues/379.
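To make the filtering idea concrete, here is a minimal Python sketch that picks one dimension combination out of the four label sets listed above. The function name select_series and its boolean parameters are hypothetical, introduced only for illustration; only the label names and values come from this thread.

```python
from typing import Dict, List

Labels = Dict[str, str]


def select_series(series: List[Labels], per_az: bool, per_tg: bool) -> List[Labels]:
    """Keep only series whose AZ / TargetGroup labels are set (or empty) as requested.

    Hypothetical helper: an empty or missing label value is treated as "not set".
    """
    def matches(labels: Labels) -> bool:
        has_az = labels.get("dimension_AvailabilityZone", "") != ""
        has_tg = labels.get("dimension_TargetGroup", "") != ""
        return has_az == per_az and has_tg == per_tg

    return [labels for labels in series if matches(labels)]


# The four label combinations reported in this thread.
series = [
    {"dimension_AvailabilityZone": "", "dimension_LoadBalancer": "myalb",
     "dimension_TargetGroup": ""},
    {"dimension_AvailabilityZone": "", "dimension_LoadBalancer": "myalb",
     "dimension_TargetGroup": "mytargetgroup"},
    {"dimension_AvailabilityZone": "ap-southeast-2a", "dimension_LoadBalancer": "myalb",
     "dimension_TargetGroup": ""},
    {"dimension_AvailabilityZone": "ap-southeast-2a", "dimension_LoadBalancer": "myalb",
     "dimension_TargetGroup": "mytargetgroup"},
]

# "Per AppELB, per TG": TargetGroup set, AvailabilityZone empty.
per_tg_only = select_series(series, per_az=False, per_tg=True)
```

In a Prometheus query the same selection can be expressed with label matchers, for example `dimension_TargetGroup!=""` together with `dimension_AvailabilityZone=""`.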
I noticed that in version 0.26, they also removed the option for providing awsDimensions in the config: diff
I removed that part in the config and merged all the metrics under the same job and it seems that it is fetching them.
discovery:
  exportedTagsOnMetrics:
    alb:
      - YYY
  jobs:
    - regions:
        - XXX
      type: alb
      length: 60
      period: 60
      delay: 120
      metrics:
        - name: HealthyHostCount
          statistics:
            - Average
        - name: UnHealthyHostCount
          statistics:
            - Average
As kyleplant pointed out, metrics are no longer prefixed with aws_alb_tg but with aws_alb only, so, for example, aws_alb_tg_healthy_host_count_average is now exposed as aws_alb_healthy_host_count_average. Tag labels seem broken.
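If you have dashboards or alert rules that still reference the old names, a small rewrite step may help. The sketch below assumes the only change is the aws_alb_tg_ to aws_alb_ prefix described in this thread; the helper names migrate_metric_name and migrate_expression are hypothetical and not part of yace.

```python
OLD_PREFIX = "aws_alb_tg_"  # prefix used up to 0.25, per this thread
NEW_PREFIX = "aws_alb_"     # prefix observed from 0.26 onwards, per this thread


def migrate_metric_name(name: str) -> str:
    """Map an old target-group metric name to its new name; leave other names alone."""
    if name.startswith(OLD_PREFIX):
        return NEW_PREFIX + name[len(OLD_PREFIX):]
    return name


def migrate_expression(expr: str) -> str:
    """Rewrite every old-prefix occurrence inside a query string (plain text replace)."""
    return expr.replace(OLD_PREFIX, NEW_PREFIX)


new_name = migrate_metric_name("aws_alb_tg_healthy_host_count_average")
new_expr = migrate_expression(
    'avg(aws_alb_tg_un_healthy_host_count_average{dimension_TargetGroup!=""})'
)
```

Because the renamed series share a name with the per-load-balancer series, queries probably also need a matcher on dimension_TargetGroup to keep only the target-group series.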
I plan to release this week.
Super stuffed but already merged some PRs and will try to fix this as well.
Hello @thomaspeitz,
Do you know if this was merged? I still seem to have the same issue with 0.33.0-alpha, and this issue is still open, so I wondered :)
Thanks
Hi @thomaspeitz,
Can you please give an update on this issue? This bug basically broke the target group metrics, making them unusable for almost a year now.
We are unable to upgrade YACE because of this bug.
Wow super annoying. I am sorry for this. I am open to review PRs and release to get this fixed!
I currently have no time for unpaid work myself. Sorry.
Hi. Is this issue resolved, @thomaspeitz? If so, in which version?
I tested today with the latest version and the dimensions are back!
After looking through the history a bit, it looks like version 0.37.0-alpha fixed it with this PR: https://github.com/nerdswords/yet-another-cloudwatch-exporter/pull/571
The config has changed quite a bit since version 0.25.0-alpha, but I think I get all the equivalent metrics by using it as follows:
apiVersion: v1alpha1
discovery:
  jobs:
    - type: AWS/ApplicationELB
      regions:
        - XXX
      dimensionNameRequirements:
        - LoadBalancer
      includeContextOnInfoMetrics: true
      metrics:
        - name: ActiveConnectionCount
          statistics:
            - Sum
        ...
    - type: AWS/ApplicationELB
      regions:
        - XXX
      dimensionNameRequirements:
        - LoadBalancer
        - TargetGroup
      includeContextOnInfoMetrics: true
      metrics:
        - name: HealthyHostCount
          statistics:
            - Average
        ...
Hope this helps everyone who was still stuck on the old version.