enhancements
enhancements copied to clipboard
Add Windows pod sandbox stats information to KEP 2371 - cri pod container stats
Signed-off-by: James Sturtevant [email protected]
- One-line PR description:
Adding Windows stats details to the cri pod container stats KEP
- Issue link: https://github.com/kubernetes/enhancements/issues/2371
- Other comments:
After opening https://github.com/kubernetes/kubernetes/pull/110754 we discovered there was more design work need for Windows and was suggested at sig-node to do the work in KEP.
/sig windows /sig node
/cc @haircommander @bobbypage @marosset @dcantah
/assign @haircommander @bobbypage @dchen1107
one more set of notes, but the approach LGTM
/lgtm
all that's left are nits from my perspective, and can be handled as a follow up
/lgtm
/assign @mikebrow @dchen1107
/assign @mrunalp maybe? I think Dawn is out-of-office today.
The original review comments in the first KEP around linux/windows fields asked for linux/windows specific structs to allow for differences, and that was agreed. Certain duplicate fields across all known platforms (more platforms coming) could be allowed in the base struct and we can comment fields with platform exceptions if needed, but that doesn't mean we have to make the pod stats response generic across platforms at the CRI api and deprecate linux differences that, I believe, will become more evident if and when pod resources currently being managed at the kubelet start to be managed at the lower levels.
My gut says keep with the original plan/path rather than start deprecating right away. If the windows team really wants to use the exact same fields, I don't see how that would be more complex for kubelet.
The original review comments in the first KEP around linux/windows fields asked for linux/windows specific structs to allow for differences, and that was agreed. Certain duplicate fields across all known platforms (more platforms coming) could be allowed in the base struct and we can comment fields with platform exceptions if needed, but that doesn't mean we have to make the pod stats response generic across platforms at the CRI api and deprecate linux differences that, I believe, will become more evident if and when pod resources currently being managed at the kubelet start to be managed at the lower levels.
My gut says keep with the original plan/path rather than start deprecating right away. If the windows team really wants to use the exact same fields, I don't see how that would be more complex for kubelet.
@haircommander @marosset what are your thoughts?
The original review comments in the first KEP around linux/windows fields asked for linux/windows specific structs to allow for differences, and that was agreed. Certain duplicate fields across all known platforms (more platforms coming) could be allowed in the base struct and we can comment fields with platform exceptions if needed, but that doesn't mean we have to make the pod stats response generic across platforms at the CRI api and deprecate linux differences that, I believe, will become more evident if and when pod resources currently being managed at the kubelet start to be managed at the lower levels. My gut says keep with the original plan/path rather than start deprecating right away. If the windows team really wants to use the exact same fields, I don't see how that would be more complex for kubelet.
@haircommander @marosset what are your thoughts?
IIRC when this was discussed on the July 19th meeting we decided there wasn't a known use-case for having separate structs. Someone (I think @dchen1107 and/or @bobbypage ) stated that these metrics were primary used by the kubelet for eviction and metrics-server for scaling purposed. If other non-core components needed more detailed metrics they could get them by other means.
IIRC when this was discussed on the July 19th meeting we decided there wasn't a known use-case for having separate structs. Someone (I think @dchen1107 and/or @bobbypage ) stated that these metrics were primary used by the kubelet for eviction and metrics-server for scaling purposed. If other non-core components needed more detailed metrics they could get them by other means.
That was my understanding as well.
At sig-node meeting today, there was an update on the performance for moving the additional cadvisor metrics into the ListStats API: https://docs.google.com/document/d/1Ne57gvidMEWXR70OxxnRkYquAoMpt56o75oZtg-OeBg/edit#heading=h.i2n5jj2i4vso
In the proposal Linux specific fields are added to the stats call. I wasn't aware of this use case, but if this is the direction, then I think we would want to have these fields as separate as was initially proposed.
@bobbypage @haircommander @mikebrow I have this on the agenda to discuss tomorrow. I won't be able to make it but @marosset is going to attend. Hopefully we can get some consensus on the direction forward. Thanks!
After further discussion at sig-node on Sep 13th, it was decided that we will keep Windows and Linux fields separate.
I've updated the KEP @haircommander @marosset @bobbypage @dcantah
/milestone v1.26
/lgtm
there's one leading question I added, but I don't think we need to hammer out the details of it yet.
edit: oops, missed https://github.com/kubernetes/enhancements/pull/3439#discussion_r977652394, I'm on standby to re-lgtm :)
/lgtm
After further discussion at sig-node on Sep 13th, it was decided that we will keep Windows and Linux fields separate. The KEP has been updated to reflect this.
/assign @dchen1107 @mrunalp @mikebrow
/lgtm
/assign @derekwaynecarr
/lgtm
/cc @mikebrow @derekwaynecarr for 1.26 enhancement freeze
/approve /lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: derekwaynecarr, jsturtevant
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~keps/sig-node/OWNERS~~ [derekwaynecarr]
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment