aws-otel-test-framework

containerinsight_eks_prometheus test is failing

Open straussb opened this issue 3 years ago • 2 comments

https://github.com/aws-observability/aws-otel-collector/runs/3416656988?check_suite_focus=true#logs

For some reason the nginx_ingress_controller_nginx_process_connections_total metric is not being picked up by the Collector, even though all the other nginx metrics are.

validator_1  | com.amazon.aoc.exception.BaseException: [ContainerInsight] metric nginx_ingress_controller_nginx_process_connections_total not found with dimension [ClusterName: aws-otel-testing-framework-eks, Namespace: nginx-349dacac375861a3, Service: nginx-349dacac375861a3-ingress-nginx-controller-metrics]

I reproduced the issue and scraped the nginx server's Prometheus endpoint myself, and did see the nginx_ingress_controller_nginx_process_connections_total metric reported:

# HELP nginx_ingress_controller_nginx_process_connections_total total number of connections with state {accepted, handled}
# TYPE nginx_ingress_controller_nginx_process_connections_total counter
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="nginx-aacb4125dbe93c21",controller_pod="nginx-aacb4125dbe93c21-ingress-nginx-controller-64fcfb4cc8zkd4l",state="accepted"} 27333
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="nginx-aacb4125dbe93c21",controller_pod="nginx-aacb4125dbe93c21-ingress-nginx-controller-64fcfb4cc8zkd4l",state="handled"} 27333
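The manual check above can be sketched in a few lines of Python. This is only an illustration, assuming the scraped endpoint output has already been saved into a string; `find_samples` and the two regular expressions are hypothetical helpers written for this example, not part of the test framework or the Collector, and they handle only the simple subset of the Prometheus text format seen in the output above.

```python
import re

# Endpoint output copied from the scrape above.
SAMPLE = '''# HELP nginx_ingress_controller_nginx_process_connections_total total number of connections with state {accepted, handled}
# TYPE nginx_ingress_controller_nginx_process_connections_total counter
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="nginx-aacb4125dbe93c21",controller_pod="nginx-aacb4125dbe93c21-ingress-nginx-controller-64fcfb4cc8zkd4l",state="accepted"} 27333
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="nginx-aacb4125dbe93c21",controller_pod="nginx-aacb4125dbe93c21-ingress-nginx-controller-64fcfb4cc8zkd4l",state="handled"} 27333
'''

METRIC = "nginx_ingress_controller_nginx_process_connections_total"

# Simplified patterns (an assumption of this sketch): a metric name,
# an optional {label="value",...} block, then a numeric value.
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def find_samples(text, metric_name):
    """Return a list of (labels, value) pairs for every sample of metric_name."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the # HELP / # TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None or match.group("name") != metric_name:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((labels, float(match.group("value"))))
    return samples


if __name__ == "__main__":
    for labels, value in find_samples(SAMPLE, METRIC):
        print(labels["state"], value)
```

An empty list from `find_samples` would mean the endpoint itself is not exposing the metric; a non-empty list, as here, points the investigation at the scrape or the receiver instead.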

I added some logging in the Prometheus receiver and saw that the metric was already missing at that point, although the other nginx metrics were present.

For now, we will comment that metric out of the verification.

straussb, Aug 27 '21 19:08

Closing this issue because the tests are currently passing. Please reopen it if you have any questions or concerns.

vasireddy99, Mar 20 '22 05:03

That's because this test case is still commented out: https://github.com/aws-observability/aws-otel-test-framework/blob/terraform/validator/src/main/resources/expected-data-template/container-insight/eks/prometheus/nginx_metrics.mustache#L46

Please leave the issue open until the test is uncommented and passing.

straussb, Mar 21 '22 13:03