enhancements icon indicating copy to clipboard operation
enhancements copied to clipboard

Add Windows pod sandbox stats information to KEP 2371 - cri pod container stats

Open jsturtevant opened this issue 2 years ago • 8 comments

Signed-off-by: James Sturtevant [email protected]

  • One-line PR description:

Adding Windows stats details to the cri pod container stats KEP

  • Issue link: https://github.com/kubernetes/enhancements/issues/2371
  • Other comments:

After opening https://github.com/kubernetes/kubernetes/pull/110754 we discovered there was more design work need for Windows and was suggested at sig-node to do the work in KEP.

/sig windows /sig node

jsturtevant avatar Jul 22 '22 03:07 jsturtevant

/cc @haircommander @bobbypage @marosset @dcantah

jsturtevant avatar Jul 22 '22 03:07 jsturtevant

/assign @haircommander @bobbypage @dchen1107

jsturtevant avatar Jul 29 '22 15:07 jsturtevant

one more set of notes, but the approach LGTM

haircommander avatar Jul 30 '22 15:07 haircommander

/lgtm

marosset avatar Aug 01 '22 16:08 marosset

all that's left are nits from my perspective, and can be handled as a follow up

/lgtm

haircommander avatar Aug 02 '22 13:08 haircommander

/assign @mikebrow @dchen1107

jsturtevant avatar Aug 02 '22 22:08 jsturtevant

/assign @mrunalp maybe? I think Dawn is out-of-office today.

marosset avatar Aug 02 '22 23:08 marosset

The original review comments in the first KEP around linux/windows fields asked for linux/windows specific structs to allow for differences, and that was agreed. Certain duplicate fields across all known platforms (more platforms coming) could be allowed in the base struct and we can comment fields with platform exceptions if needed, but that doesn't mean we have to make the pod stats response generic across platforms at the CRI api and deprecate linux differences that, I believe, will become more evident if and when pod resources currently being managed at the kubelet start to be managed at the lower levels.

My gut says keep with the original plan/path rather than start deprecating right away. If the windows team really wants to use the exact same fields, I don't see how that would be more complex for kubelet.

mikebrow avatar Sep 05 '22 03:09 mikebrow

The original review comments in the first KEP around linux/windows fields asked for linux/windows specific structs to allow for differences, and that was agreed. Certain duplicate fields across all known platforms (more platforms coming) could be allowed in the base struct and we can comment fields with platform exceptions if needed, but that doesn't mean we have to make the pod stats response generic across platforms at the CRI api and deprecate linux differences that, I believe, will become more evident if and when pod resources currently being managed at the kubelet start to be managed at the lower levels.

My gut says keep with the original plan/path rather than start deprecating right away. If the windows team really wants to use the exact same fields, I don't see how that would be more complex for kubelet.

@haircommander @marosset what are your thoughts?

jsturtevant avatar Sep 06 '22 16:09 jsturtevant

The original review comments in the first KEP around linux/windows fields asked for linux/windows specific structs to allow for differences, and that was agreed. Certain duplicate fields across all known platforms (more platforms coming) could be allowed in the base struct and we can comment fields with platform exceptions if needed, but that doesn't mean we have to make the pod stats response generic across platforms at the CRI api and deprecate linux differences that, I believe, will become more evident if and when pod resources currently being managed at the kubelet start to be managed at the lower levels. My gut says keep with the original plan/path rather than start deprecating right away. If the windows team really wants to use the exact same fields, I don't see how that would be more complex for kubelet.

@haircommander @marosset what are your thoughts?

IIRC when this was discussed on the July 19th meeting we decided there wasn't a known use-case for having separate structs. Someone (I think @dchen1107 and/or @bobbypage ) stated that these metrics were primary used by the kubelet for eviction and metrics-server for scaling purposed. If other non-core components needed more detailed metrics they could get them by other means.

marosset avatar Sep 06 '22 17:09 marosset

IIRC when this was discussed on the July 19th meeting we decided there wasn't a known use-case for having separate structs. Someone (I think @dchen1107 and/or @bobbypage ) stated that these metrics were primary used by the kubelet for eviction and metrics-server for scaling purposed. If other non-core components needed more detailed metrics they could get them by other means.

That was my understanding as well.

At sig-node meeting today, there was an update on the performance for moving the additional cadvisor metrics into the ListStats API: https://docs.google.com/document/d/1Ne57gvidMEWXR70OxxnRkYquAoMpt56o75oZtg-OeBg/edit#heading=h.i2n5jj2i4vso

In the proposal Linux specific fields are added to the stats call. I wasn't aware of this use case, but if this is the direction, then I think we would want to have these fields as separate as was initially proposed.

jsturtevant avatar Sep 06 '22 19:09 jsturtevant

@bobbypage @haircommander @mikebrow I have this on the agenda to discuss tomorrow. I won't be able to make it but @marosset is going to attend. Hopefully we can get some consensus on the direction forward. Thanks!

jsturtevant avatar Sep 12 '22 22:09 jsturtevant

After further discussion at sig-node on Sep 13th, it was decided that we will keep Windows and Linux fields separate.

I've updated the KEP @haircommander @marosset @bobbypage @dcantah

jsturtevant avatar Sep 19 '22 16:09 jsturtevant

/milestone v1.26

marosset avatar Sep 22 '22 16:09 marosset

/lgtm

there's one leading question I added, but I don't think we need to hammer out the details of it yet.

edit: oops, missed https://github.com/kubernetes/enhancements/pull/3439#discussion_r977652394, I'm on standby to re-lgtm :)

haircommander avatar Sep 22 '22 17:09 haircommander

/lgtm

haircommander avatar Sep 22 '22 18:09 haircommander

After further discussion at sig-node on Sep 13th, it was decided that we will keep Windows and Linux fields separate. The KEP has been updated to reflect this.

/assign @dchen1107 @mrunalp @mikebrow

jsturtevant avatar Sep 26 '22 18:09 jsturtevant

/lgtm

marosset avatar Oct 04 '22 20:10 marosset

/assign @derekwaynecarr

haircommander avatar Oct 04 '22 20:10 haircommander

/lgtm

bobbypage avatar Oct 05 '22 07:10 bobbypage

/cc @mikebrow @derekwaynecarr for 1.26 enhancement freeze

jsturtevant avatar Oct 05 '22 21:10 jsturtevant

/approve /lgtm

derekwaynecarr avatar Oct 06 '22 22:10 derekwaynecarr

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekwaynecarr, jsturtevant

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Oct 06 '22 22:10 k8s-ci-robot