enhancements Add Windows pod sandbox stats information to KEP 2371

Signed-off-by: James Sturtevant [email protected]

One-line PR description:

Adding Windows stats details to the cri pod container stats KEP

Issue link: https://github.com/kubernetes/enhancements/issues/2371

Other comments:

After opening https://github.com/kubernetes/kubernetes/pull/110754 we discovered there was more design work need for Windows and was suggested at sig-node to do the work in KEP.

/sig windows /sig node

Jul 22 '22 03:07 jsturtevant

/cc @haircommander @bobbypage @marosset @dcantah

Jul 22 '22 03:07 jsturtevant

/assign @haircommander @bobbypage @dchen1107

Jul 29 '22 15:07 jsturtevant

one more set of notes, but the approach LGTM

Jul 30 '22 15:07 haircommander

/lgtm

Aug 01 '22 16:08 marosset

all that's left are nits from my perspective, and can be handled as a follow up

/lgtm

Aug 02 '22 13:08 haircommander

/assign @mikebrow @dchen1107

Aug 02 '22 22:08 jsturtevant

/assign @mrunalp maybe? I think Dawn is out-of-office today.

Aug 02 '22 23:08 marosset

The original review comments in the first KEP around linux/windows fields asked for linux/windows specific structs to allow for differences, and that was agreed. Certain duplicate fields across all known platforms (more platforms coming) could be allowed in the base struct and we can comment fields with platform exceptions if needed, but that doesn't mean we have to make the pod stats response generic across platforms at the CRI api and deprecate linux differences that, I believe, will become more evident if and when pod resources currently being managed at the kubelet start to be managed at the lower levels.

My gut says keep with the original plan/path rather than start deprecating right away. If the windows team really wants to use the exact same fields, I don't see how that would be more complex for kubelet.

Sep 05 '22 03:09 mikebrow

The original review comments in the first KEP around linux/windows fields asked for linux/windows specific structs to allow for differences, and that was agreed. Certain duplicate fields across all known platforms (more platforms coming) could be allowed in the base struct and we can comment fields with platform exceptions if needed, but that doesn't mean we have to make the pod stats response generic across platforms at the CRI api and deprecate linux differences that, I believe, will become more evident if and when pod resources currently being managed at the kubelet start to be managed at the lower levels.

My gut says keep with the original plan/path rather than start deprecating right away. If the windows team really wants to use the exact same fields, I don't see how that would be more complex for kubelet.

@haircommander @marosset what are your thoughts?

Sep 06 '22 16:09 jsturtevant

The original review comments in the first KEP around linux/windows fields asked for linux/windows specific structs to allow for differences, and that was agreed. Certain duplicate fields across all known platforms (more platforms coming) could be allowed in the base struct and we can comment fields with platform exceptions if needed, but that doesn't mean we have to make the pod stats response generic across platforms at the CRI api and deprecate linux differences that, I believe, will become more evident if and when pod resources currently being managed at the kubelet start to be managed at the lower levels. My gut says keep with the original plan/path rather than start deprecating right away. If the windows team really wants to use the exact same fields, I don't see how that would be more complex for kubelet.

@haircommander @marosset what are your thoughts?

IIRC when this was discussed on the July 19th meeting we decided there wasn't a known use-case for having separate structs. Someone (I think @dchen1107 and/or @bobbypage ) stated that these metrics were primary used by the kubelet for eviction and metrics-server for scaling purposed. If other non-core components needed more detailed metrics they could get them by other means.

Sep 06 '22 17:09 marosset

IIRC when this was discussed on the July 19th meeting we decided there wasn't a known use-case for having separate structs. Someone (I think @dchen1107 and/or @bobbypage ) stated that these metrics were primary used by the kubelet for eviction and metrics-server for scaling purposed. If other non-core components needed more detailed metrics they could get them by other means.

That was my understanding as well.

At sig-node meeting today, there was an update on the performance for moving the additional cadvisor metrics into the ListStats API: https://docs.google.com/document/d/1Ne57gvidMEWXR70OxxnRkYquAoMpt56o75oZtg-OeBg/edit#heading=h.i2n5jj2i4vso

In the proposal Linux specific fields are added to the stats call. I wasn't aware of this use case, but if this is the direction, then I think we would want to have these fields as separate as was initially proposed.

Sep 06 '22 19:09 jsturtevant

@bobbypage @haircommander @mikebrow I have this on the agenda to discuss tomorrow. I won't be able to make it but @marosset is going to attend. Hopefully we can get some consensus on the direction forward. Thanks!

Sep 12 '22 22:09 jsturtevant

After further discussion at sig-node on Sep 13th, it was decided that we will keep Windows and Linux fields separate.

I've updated the KEP @haircommander @marosset @bobbypage @dcantah

Sep 19 '22 16:09 jsturtevant

/milestone v1.26

Sep 22 '22 16:09 marosset

/lgtm

there's one leading question I added, but I don't think we need to hammer out the details of it yet.

edit: oops, missed https://github.com/kubernetes/enhancements/pull/3439#discussion_r977652394, I'm on standby to re-lgtm :)

Sep 22 '22 17:09 haircommander

/lgtm

Sep 22 '22 18:09 haircommander

After further discussion at sig-node on Sep 13th, it was decided that we will keep Windows and Linux fields separate. The KEP has been updated to reflect this.

/assign @dchen1107 @mrunalp @mikebrow

Sep 26 '22 18:09 jsturtevant

/lgtm

Oct 04 '22 20:10 marosset

/assign @derekwaynecarr

Oct 04 '22 20:10 haircommander

/lgtm

Oct 05 '22 07:10 bobbypage

/cc @mikebrow @derekwaynecarr for 1.26 enhancement freeze

Oct 05 '22 21:10 jsturtevant

/approve /lgtm

Oct 06 '22 22:10 derekwaynecarr

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekwaynecarr, jsturtevant

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~keps/sig-node/OWNERS~~ [derekwaynecarr]

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

Oct 06 '22 22:10 k8s-ci-robot

enhancements
enhancements copied to clipboard

Add Windows pod sandbox stats information to KEP 2371 - cri pod container stats

enhancements enhancements copied to clipboard

Add Windows pod sandbox stats information to KEP 2371 - cri pod container stats

enhancements
enhancements copied to clipboard