
[EKS] Limit the predefined Container Insights metrics that need to be ingested in CloudWatch

Open abhishek181 opened this issue 4 years ago • 17 comments

Which service(s) is this request for? EKS

Tell us about your request CloudWatch Container Insights for EKS should have an option to exclude metrics that are not required, in order to reduce custom-metric costs.

Configurations Ability to select which metrics to exclude such as pod_network_rx_bytes, pod_network_tx_bytes, etc.

abhishek181 avatar Dec 30 '19 11:12 abhishek181

I am glad this was written; I was about to file the same suggestion. Running Container Insights is almost 40% of the cost of some of my clusters. A cost-optimized implementation or a cost-savings option would make this service much more affordable. The majority of the cost is custom metrics x number of nodes x number of Kubernetes resources running (pods, namespaces, services). Selecting which metrics to turn off is a great idea; possibly even better would be limiting which resource types you want insights for, such as turning off namespace or service metrics to focus on just the pod and node metrics.

Or simply give a bigger volume discount so we get to keep it all. The last CloudWatch price decrease was on Nov 21, 2016.

How it is currently priced:

There is a predefined number of metrics reported for every cluster, node, pod, and service. Every cluster reports 24 metrics; every node reports 8 metrics; every pod reports 9 metrics; and every service reports 6 metrics. CloudWatch metrics are aggregated by pod, service, and namespace using their name. Increasing the count of running instances will not impact the count of CloudWatch metrics generated. All CloudWatch metrics are prorated on an hourly basis. This example assumes that data points are reported for the entire month.

Monthly number of CloudWatch metrics per cluster = 24 cluster metrics + (10 nodes or EC2 instances * 8 node metrics) + (20 unique pod names * 9 pod metrics * 1 namespace) + (5 unique service names * 6 service metrics * 1 namespace) + (1 unique namespace * 6 namespace metrics) = 24 + (10 * 8) + (20 * 9 * 1) + (5 * 6 * 1) + (1 * 6) = 320 CloudWatch metrics
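The arithmetic above can be reproduced with a short script. The per-resource metric counts come from the pricing example itself; the $0.30/metric/month figure is the first-tier custom-metric price and is an assumption here — check the current CloudWatch pricing page:

```python
# Reproduce the Container Insights metric count from the pricing example.
# Per-resource counts (cluster=24, node=8, pod=9, service=6, namespace=6)
# are taken from the example above.

def container_insights_metric_count(nodes, pods, services, namespaces):
    return (24                              # cluster metrics
            + nodes * 8                     # node metrics
            + pods * 9 * namespaces         # pod metrics, per namespace
            + services * 6 * namespaces     # service metrics, per namespace
            + namespaces * 6)               # namespace metrics

metrics = container_insights_metric_count(nodes=10, pods=20, services=5, namespaces=1)
print(metrics)  # 320

# Assumed first-tier custom-metric price; verify against current pricing.
PRICE_PER_METRIC_MONTH = 0.30
print(f"~${metrics * PRICE_PER_METRIC_MONTH:.2f}/month")  # ~$96.00/month
```

Note how the bill scales with unique pod names and namespaces, not with instance count, which is why large multi-namespace clusters get expensive quickly.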

https://aws.amazon.com/cloudwatch/pricing/

maguire007 avatar Apr 19 '20 11:04 maguire007

+1

norbertdalecker avatar Jun 05 '20 09:06 norbertdalecker

Rather than limiting the number of metrics collected, I wish AWS Container Insights metrics would become AWS-provided metrics, free of charge. The current pricing is very harsh and pushes people to seek alternative monitoring solutions in order to save on recurring costs.

hhamalai avatar Sep 25 '20 10:09 hhamalai

Any update on this? Just being able to specify which metrics to push in the CloudWatch agent ConfigMap would be perfect.

aittam avatar Jan 29 '21 16:01 aittam

This is a critical feature for my org, where we have 2K+ pods across dev and prod. We don't want to track metrics for all namespaces/pods, and at the same time not all metric tracking is needed. Having this feature would help with cost optimisation. Hoping to see it soon.

GowthamSenAurea avatar Mar 25 '21 10:03 GowthamSenAurea

There are a related issue and PRs in the cloudwatch-agent repo:

  • https://github.com/aws/amazon-cloudwatch-agent/issues/103 — we may allow turning off custom metrics and generating only structured logs; the logs are in EMF format, so metrics can always be extracted from them.
  • https://github.com/aws/amazon-cloudwatch-agent/pull/163 — we merged a PR that lets you use an annotation to disable sending logs for annotated pods; the metrics come from the EMF logs, so both logs and metrics will be gone.

Filtering which metrics to send is a bit more complex; I am not sure if we will implement it and include it in the next release.

pingleig avatar Apr 30 '21 17:04 pingleig

As these custom metrics are created per AWS documentation AND utilized by the AWS Container Insights dashboard, they should not fall under "custom metrics" but under AWS metrics, and they should not generate costs for customers in this indirect manner.

hhamalai avatar Jun 01 '21 07:06 hhamalai

Just like @hhamalai stated above. One chooses EKS and CW because of the mantra AWS tends to repeat "Let AWS handle the heavy lifting". In this case it bites you badly. These costs are unforeseen and cause unnecessary wtf-moments when checking billing.

ristkari avatar Nov 18 '21 14:11 ristkari

Pasting from https://github.com/aws/amazon-cloudwatch-agent/issues/103#issuecomment-972362366: it's possible to do this with the OpenTelemetry Collector https://aws-otel.github.io/docs/getting-started/container-insights/eks-infra#advanced-usage

With the default configuration, the ADOT Collector collects the complete set of metrics as defined in this AWS public document. To reduce the AWS cost for the CloudWatch metrics and embedded metric format logs generated by Container Insights, you can customize the ADOT Collector using the following two methods.

https://aws-otel.github.io/docs/getting-started/container-insights/eks-infra#configure-metrics-sent-by-cloudwatch-embedded-metric-format-exporter describes how to remove the pod network metrics, among others.
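As a rough illustration of that approach (a sketch only — the processor name filter/drop-pod-network is made up here, and the exact filter-processor syntax depends on your collector version), dropping the pod network metrics could look like:

```yaml
processors:
  filter/drop-pod-network:        # hypothetical processor name
    metrics:
      exclude:
        match_type: regexp
        metric_names:
          - pod_network_.*        # e.g. pod_network_rx_bytes, pod_network_tx_bytes
```

The processor then has to be added to the collector's metrics pipeline for the filter to take effect.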

pingleig avatar Nov 18 '21 19:11 pingleig

There is a new blog on using ADOT to customize metrics https://aws.amazon.com/blogs/containers/cost-savings-by-customizing-metrics-sent-by-container-insights-in-amazon-eks/

pingleig avatar Dec 17 '21 23:12 pingleig

I installed ADOT and it worked. I spent quite a lot of time finding a way to filter by namespace, a method that was not covered in the AWS blog. Finally I found that it can be done using resource_attributes conditions as described in this document, and this document lists all the resource attributes for the AWS Container Insights metrics.

The filter processor snippet I used to filter by namespace:

    filter/include:
      metrics:
        include:
          match_type: regexp
          metric_names:
            - ^pod_.*
            - ^service_.*
            - namespace_number_of_running_pods
          resource_attributes:
            - key: Namespace
              value: (include-namespace-1|include-namespace-2)

    filter/exclude:
      metrics:
        exclude:
          match_type: regexp
          resource_attributes:
            - key: Namespace
              value: (exclude-namespace-1|exclude-namespace-2)
And if you add metric_names in filter/exclude, the namespace condition seems not to work: all metrics in metric_names across all namespaces get excluded, whether or not the namespace condition is present. So in filter/exclude, set only the namespace condition, and all metrics in those namespaces will be excluded.

jizg avatar Dec 23 '21 08:12 jizg

Agreeing with previous comments, the current pricing for monitoring EKS with CloudWatch is sky high. Even with a small cluster (~30 pods), it costs more than the EC2 instances we launched.

Any progress on supporting this feature, or at least lowering the pricing? We are actively looking for alternatives these days.

imZack avatar Mar 15 '22 02:03 imZack

With amazon-cloudwatch-agent you can exclude pods by adding the Kubernetes annotation aws.amazon.com/cloudwatch-agent-ignore: true, which is handled in plugins/processors/k8sdecorator/stores/podsstore.go. But I agree that amazon-cloudwatch-agent should have a more flexible approach to excluding metrics (see amazon-cloudwatch-agent #401 for a proposal).
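A minimal sketch of that annotation on a pod (the pod name and image are placeholders, not from the original discussion):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: noisy-batch-job                              # placeholder name
  annotations:
    aws.amazon.com/cloudwatch-agent-ignore: "true"   # agent skips this pod
spec:
  containers:
    - name: app
      image: example/app:latest                      # placeholder image
```

For workloads managed by a Deployment or StatefulSet, the annotation would go in the pod template metadata rather than on the individual pods.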

ecerulm avatar Mar 15 '22 07:03 ecerulm

Is there any update on this issue?

GreasyAvocado avatar Jul 12 '23 22:07 GreasyAvocado

High cardinality should not be the default!

@GreasyAvocado I believe the answer is to switch to ADOT, where you have much greater control over the filtering to include/exclude metrics.

Dumping some links

Install Overview: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/deploy-container-insights-EKS.html

Install Guide: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-EKS-otel.html

Install Tutorial - customising the metrics published (it supports logs too, but you don't need them just to check whether your cluster is tanking due to resource utilization): https://aws-otel.github.io/docs/getting-started/container-insights/eks-infra

Helm Chart: https://github.com/aws-observability/aws-otel-helm-charts/tree/main/charts/adot-exporter-for-eks-on-ec2

james-arawhanui avatar Oct 30 '23 21:10 james-arawhanui

That's indeed what I ended up doing.

I actually went with https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/extension/observer/ecsobserver/README.md, instead of ADOT, but it's not very different.

GreasyAvocado avatar Oct 31 '23 14:10 GreasyAvocado

I believe this would still be a very useful feature with Enhanced Observability. I just installed it last week using the suggested method (EKS add-on), and the generated metrics/costs for my 2-node test cluster are already twice as high as the AWS CloudWatch pricing example.

Are there any updates?

tgraupne avatar Apr 08 '24 14:04 tgraupne