Does metrics-server scrape node metrics too early?

Hello!

I would like to report a problem with node scraping: metrics-server seems to scrape a node before it is in the Ready state.

The logs show scraping errors for a new node for about a minute, after which scraping succeeds. [Screenshot: 2022-09-20 at 17:46:08]

Is there any way to delay node scraping? Or does metrics-server start scraping metrics before the node is in the Ready state?

We are seeing this on all our EKS clusters.

Thank you!

Sep 20 '22 15:09 jeanmercierswile

Hi @jeanmercierswile, thanks for the feedback. I'm sure that metrics-server just lists all the nodes in the cluster from Kubernetes and then scrapes them; it doesn't check the status of the nodes. I wonder, is it necessary for us to filter on node state? /cc @serathius
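
To make the idea of filtering on node state concrete, here is a minimal sketch of what such a filter could look like. It is not metrics-server's actual code: the `Node` and `NodeCondition` types are simplified, hypothetical stand-ins for the Kubernetes API objects, and `isReady` and `readyNodes` are illustrative names. The only thing taken from Kubernetes itself is that a node reports a condition of type `Ready` whose status is `True`, `False`, or `Unknown`.

```go
package main

import "fmt"

// NodeCondition is a hypothetical, trimmed-down stand-in for the
// condition entries reported in a Kubernetes node's status.
type NodeCondition struct {
	Type   string // e.g. "Ready"
	Status string // "True", "False" or "Unknown"
}

// Node is a hypothetical, trimmed-down stand-in for a Kubernetes Node.
type Node struct {
	Name       string
	Conditions []NodeCondition
}

// isReady reports whether the node has a Ready condition with status "True".
// A node with no Ready condition at all is treated as not ready.
func isReady(n Node) bool {
	for _, c := range n.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	return false
}

// readyNodes returns only the nodes that are Ready, preserving order.
func readyNodes(nodes []Node) []Node {
	ready := make([]Node, 0, len(nodes))
	for _, n := range nodes {
		if isReady(n) {
			ready = append(ready, n)
		}
	}
	return ready
}

func main() {
	nodes := []Node{
		{Name: "node-a", Conditions: []NodeCondition{{Type: "Ready", Status: "True"}}},
		{Name: "node-b", Conditions: []NodeCondition{{Type: "Ready", Status: "False"}}},
		{Name: "node-c"}, // freshly registered, no conditions reported yet
	}
	for _, n := range readyNodes(nodes) {
		fmt.Println("would scrape:", n.Name)
	}
}
```

With a filter like this, only `node-a` would be scraped. The trade-off is that scrape failures on nodes that are not Ready would no longer show up in the logs at all.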

Sep 21 '22 08:09 yangjunmyfm192085

Hello and thank you @yangjunmyfm192085! Does this deserve a feature request if these error logs are useless?

Sep 22 '22 07:09 jeanmercierswile

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Dec 21 '22 08:12 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

Jan 20 '23 09:01 k8s-triage-robot

As @yangjunmyfm192085 pointed out, metrics-server simply lists the nodes and gets their usage metrics. However, suppressing transient errors could lead to false positives in other cases.

Jan 30 '23 08:01 rexagod

Closing for now. /close

Jan 30 '23 09:01 rexagod

@rexagod: Closing this issue.

In response to this:

Closing for now. /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Jan 30 '23 09:01 k8s-ci-robot
