kubernetes

kubelet: improve CRI stats for resource metrics and testing

Open dims opened this issue 3 weeks ago • 6 comments

Properly support the resource metrics endpoint when PodAndContainerStatsFromCRI is enabled, and fix the related e2e tests.

Stats Provider:

  • add container-level CPU and memory stats to ListPodCPUAndMemoryStats so the resource metrics endpoint has complete data
  • add aggregatePodSwapStats to compute pod-level swap from container stats (CRI doesn't provide pod-level swap directly)
  • add missing memory stats fields: AvailableBytes, PageFaults, and MajorPageFaults
  • add platform-specific implementations for Linux and Windows
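The pod-level swap aggregation described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the `SwapStats` and `ContainerStats` types, the `aggregatePodSwap` function, and the `u64` helper are simplified, hypothetical stand-ins, and the rule of returning nil when no container reports swap is an assumption.

```go
package main

import "fmt"

// SwapStats is a hypothetical, simplified stand-in for a swap stats record.
type SwapStats struct {
	UsageBytes *uint64
}

// ContainerStats is a hypothetical, trimmed-down container stats record.
type ContainerStats struct {
	Name string
	Swap *SwapStats
}

// aggregatePodSwap sums swap usage over all containers that report it.
// It returns nil when no container reports swap, so callers can tell
// "no data" apart from "zero bytes used" (an assumed convention).
func aggregatePodSwap(containers []ContainerStats) *SwapStats {
	var usage uint64
	found := false
	for _, c := range containers {
		if c.Swap == nil || c.Swap.UsageBytes == nil {
			continue // this container has no swap data
		}
		usage += *c.Swap.UsageBytes
		found = true
	}
	if !found {
		return nil
	}
	return &SwapStats{UsageBytes: &usage}
}

// u64 returns a pointer to v; a small helper for building test data.
func u64(v uint64) *uint64 { return &v }

func main() {
	pod := []ContainerStats{
		{Name: "app", Swap: &SwapStats{UsageBytes: u64(4096)}},
		{Name: "sidecar", Swap: &SwapStats{UsageBytes: u64(1024)}},
		{Name: "init"}, // reports no swap
	}
	if s := aggregatePodSwap(pod); s != nil {
		fmt.Println("pod swap usage bytes:", *s.UsageBytes)
	}
}
```

Using pointer fields keeps a missing value distinct from a reported zero, which matters when only some containers in a pod expose swap data.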

Tests:

  • skip cAdvisor metrics test when PodAndContainerStatsFromCRI is enabled (cAdvisor metrics aren't available in that mode)
  • fix expected metrics in ResourceMetricsAPI test
  • node_swap_usage_bytes appears to be available only with cAdvisor (this still needs to be verified)
  • add dumpResourceMetricsForPods helper to log actual metric values when tests fail, making debugging easier
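A debugging helper of the kind described in the last bullet might be sketched as follows. This is an illustration under stated assumptions, not the PR's dumpResourceMetricsForPods: the `Sample` type and the `formatMetricsForPods` function are hypothetical, the real helper presumably works with the e2e framework's own metric types, and the metric names in `main` are sample data only.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Sample is a hypothetical, simplified metric sample: a value plus labels.
type Sample struct {
	Labels map[string]string
	Value  float64
}

// formatMetricsForPods returns one line per sample whose "pod" label matches
// one of the given pod names. Lines are sorted so the output is stable
// across runs, since Go map iteration order is random.
func formatMetricsForPods(metrics map[string][]Sample, pods []string) []string {
	want := make(map[string]bool, len(pods))
	for _, p := range pods {
		want[p] = true
	}
	var lines []string
	for name, samples := range metrics {
		for _, s := range samples {
			if !want[s.Labels["pod"]] {
				continue // sample belongs to a pod we are not interested in
			}
			lines = append(lines, fmt.Sprintf("%s{pod=%q,container=%q} %g",
				name, s.Labels["pod"], s.Labels["container"], s.Value))
		}
	}
	sort.Strings(lines)
	return lines
}

func main() {
	// Illustrative data only; the names mimic resource metrics output.
	metrics := map[string][]Sample{
		"pod_cpu_usage_seconds_total": {
			{Labels: map[string]string{"pod": "stats-busybox-0"}, Value: 42},
			{Labels: map[string]string{"pod": "unrelated"}, Value: 7},
		},
		"container_memory_working_set_bytes": {
			{Labels: map[string]string{"pod": "stats-busybox-0", "container": "busybox"}, Value: 1024},
		},
	}
	fmt.Println(strings.Join(formatMetricsForPods(metrics, []string{"stats-busybox-0"}), "\n"))
}
```

In a test, the returned lines would be written to the test log only on failure, so a mismatch between expected and actual metrics can be read directly from the CI output.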

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Which issue(s) this PR is related to:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


dims avatar Dec 05 '25 04:12 dims

Please note that we're already in Test Freeze for the release-1.35 branch. This means every merged PR will be automatically fast-forwarded via the periodic ci-fast-forward job to the release branch of the upcoming v1.35.0 release.

Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Fri Dec 5 03:34:55 UTC 2025.

k8s-ci-robot avatar Dec 05 '25 04:12 k8s-ci-robot

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Dec 05 '25 04:12 k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.

Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Dec 05 '25 04:12 k8s-ci-robot

xref: https://github.com/containerd/containerd/pull/12629

dims avatar Dec 05 '25 04:12 dims

(some testing done in https://github.com/containerd/containerd/pull/12620 using branch https://github.com/dims/kubernetes/tree/add-logs-to-kubelet-metrics)

dims avatar Dec 05 '25 04:12 dims

/assign @SergeyKanzhelev @mrunalp

dims avatar Dec 09 '25 22:12 dims