do-agent

Unable to gather advanced metrics, ERROR "was collected before with the same name and label values"

Open · yevon opened this issue 4 years ago • 10 comments

Describe the problem

When I install the advanced kube-state-metrics deployment, the dashboard for gathering metrics stops working. If I check the do-agent pod logs, I see errors stating that metrics were collected before with the same name and label values. I followed this guide to enable advanced metrics:

https://www.digitalocean.com/docs/kubernetes/how-to/monitor-advanced/

If I uninstall advanced-metrics or scale its pods to 0, the dashboard starts working again.
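For context, the error text comes from the Prometheus Go client library (client_golang): its registry rejects any metric that is emitted more than once with an identical name and label set during a single Gather(). A minimal, hypothetical sketch (not do-agent's actual code) that reproduces the same error:

```go
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// dupCollector emits the same metric twice per collection cycle,
// which is what happens when two sources report identical series.
type dupCollector struct{ desc *prometheus.Desc }

func (c dupCollector) Describe(ch chan<- *prometheus.Desc) { ch <- c.desc }

func (c dupCollector) Collect(ch chan<- prometheus.Metric) {
	m := prometheus.MustNewConstMetric(c.desc, prometheus.GaugeValue, 2)
	ch <- m
	ch <- m // duplicate: same name, same label values
}

func main() {
	reg := prometheus.NewRegistry()
	reg.MustRegister(dupCollector{
		desc: prometheus.NewDesc("kube_daemonset_status_number_available", "example", nil, nil),
	})
	_, err := reg.Gather()
	// err reports something like:
	//   collected metric "kube_daemonset_status_number_available" { gauge:<value:2 > }
	//   was collected before with the same name and label values
	fmt.Println(err)
}
```

So the log below suggests do-agent ends up gathering the same kube-state-metrics series from more than one place.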

Steps to reproduce

It happens with kube-state-metrics:2.0.0-alpha

Expected behavior

Be able to get advanced pod scheduling metrics.

System Information

DigitalOcean managed Kubernetes 1.18.8

do-agent information:

do-agent-log

2020-09-27T13:03:09.553931294Z ERROR: 2020/09/27 13:03:09 /home/do-agent/cmd/do-agent/run.go:60: failed to gather metrics: 45 error(s) occurred:
2020-09-27T13:03:09.553990229Z * collected metric "kube_daemonset_status_number_available" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554014114Z * collected metric "kube_daemonset_status_number_available" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554019932Z * collected metric "kube_daemonset_status_number_available" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554042459Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554048024Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554052770Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554057690Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554062350Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554067062Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554071742Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554076542Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554081945Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554086634Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554092869Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554097734Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554102457Z * collected metric "kube_daemonset_status_number_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554127517Z * collected metric "kube_daemonset_status_number_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554133235Z * collected metric "kube_daemonset_status_number_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554138043Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554148110Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554153232Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554157955Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554162715Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554167703Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554175570Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554183135Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554219858Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554241193Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554249105Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554256065Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554263653Z * collected metric "kube_daemonset_status_desired_number_scheduled" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554270760Z * collected metric "kube_daemonset_status_desired_number_scheduled" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554305026Z * collected metric "kube_daemonset_status_desired_number_scheduled" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554315521Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554323804Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554331593Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554338594Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554345304Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554362434Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554397076Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554401950Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554406640Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554411351Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554417625Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554422403Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values

yevon avatar Sep 27 '20 13:09 yevon

@yevon I will look into this soon, and see if I can reproduce it on my end! Thanks for the report!

bsnyder788 avatar Sep 29 '20 20:09 bsnyder788

I wasn't able to reproduce this issue @yevon. We had a similar report about disk metrics collection on Ubuntu 20.04 that I just addressed in 3.8.0. Unlike that one, though, this kube metrics issue can't simply be ignored, since these are metrics we actually want to collect. Did you ever dig any deeper on your end? A sketch of what "ignoring" duplicates would look like follows below.
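For illustration only (this is not do-agent's actual code): one way to "ignore" duplicates would be to deduplicate gathered metric families before they are processed, keeping the first occurrence of each name-and-label-values combination. A hypothetical Go sketch using the client_model types:

```go
package dedup

import (
	"sort"
	"strings"

	dto "github.com/prometheus/client_model/go"
)

// labelKey builds a stable, order-independent key from a metric's label pairs.
func labelKey(pairs []*dto.LabelPair) string {
	parts := make([]string, 0, len(pairs))
	for _, p := range pairs {
		parts = append(parts, p.GetName()+"="+p.GetValue())
	}
	sort.Strings(parts)
	return strings.Join(parts, ",")
}

// dedupFamilies drops metrics that repeat an already-seen label-values
// combination within each family, keeping the first occurrence in place.
func dedupFamilies(families []*dto.MetricFamily) []*dto.MetricFamily {
	for _, mf := range families {
		seen := make(map[string]bool)
		kept := mf.Metric[:0] // in-place filter, reusing the backing array
		for _, m := range mf.Metric {
			key := labelKey(m.GetLabel())
			if !seen[key] {
				seen[key] = true
				kept = append(kept, m)
			}
		}
		mf.Metric = kept
	}
	return families
}
```

Whether silently dropping duplicates like this is acceptable depends on why they appear in the first place, which is presumably why it wasn't done here.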

bsnyder788 avatar Nov 02 '20 13:11 bsnyder788

I haven't checked again with the latest version, but I reproduced this issue exactly on another Kubernetes cluster in another zone: just a new cluster, following the DO documentation for installing advanced metrics. It might be fixed by the latest advanced metrics or Kubernetes version; when I have time to check again, I'll let you know.

yevon avatar Nov 02 '20 14:11 yevon

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed if no further activity occurs.

stale[bot] avatar Jan 03 '21 14:01 stale[bot]

still valid

bsnyder788 avatar Jan 04 '21 18:01 bsnyder788

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed if no further activity occurs.

stale[bot] avatar Jun 04 '21 15:06 stale[bot]

still valid

bsnyder788 avatar Jun 04 '21 16:06 bsnyder788

@bsnyder788 if you add the bug label to this issue, the stale bot will stop marking it as stale. I believe that's the correct label, but you can look it up in the stale bot settings for this repo (see the sketch below).
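For reference, assuming this repo uses probot/stale (whose default mark message matches the bot comments above), the exemption would live in .github/stale.yml; the label name and day counts here are illustrative, not this repo's actual settings:

```yaml
# .github/stale.yml — probot/stale configuration (illustrative values)
daysUntilStale: 60   # days of inactivity before an issue is marked stale
daysUntilClose: 7    # days after being marked stale before it is closed
exemptLabels:        # issues with any of these labels are never marked stale
  - bug
staleLabel: stale
markComment: >
  This issue has been automatically marked as stale because it has not had
  any recent activity. It will be closed if no further activity occurs.
```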

blockloop avatar Jul 02 '21 17:07 blockloop