
Create a proposal for Buffers API.

Open jbtk opened this issue 7 months ago • 14 comments

What type of PR is this?

/kind documentation

What this PR does / why we need it:

Adds proposal for Buffers API

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


jbtk avatar May 21 '25 09:05 jbtk

Hi @jbtk. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar May 21 '25 09:05 k8s-ci-robot

The initial discussion about this proposal was in a doc: https://docs.google.com/document/d/1bcct-luMPP51YAeUhVuFV7MIXud5wqHsVBDah9WuKeo?tab=t.0

(there are still unresolved conversations there)

jbtk avatar May 21 '25 09:05 jbtk

Summary of open topics:

  • controller vs library used by autoscalers for translating buffers to pod spec
  • k8s quotas limitations (lacking in the proposal, will research that and fill in the blanks in the proposal)

jbtk avatar May 22 '25 10:05 jbtk

/assign @jackfrancis

@jackfrancis - adding you here because I know this is related to things you've thought about in the past with regards to pre-provisioning capacity.

raywainman avatar May 27 '25 14:05 raywainman

Update:

  1. added k8s resource quota handling to proposal
  2. after some discussions I am keeping a separate controller in the proposal. When implementing, we will reconsider embedding the pod spec (due to the validation that would need to be copied and maintained)

jbtk avatar Jun 02 '25 11:06 jbtk

@ellistarn as FYI

After an offline review with @towca

  • Changed the name Buffer to NodeAutoscalingBuffer
  • Changed the Type field to ProvisioningStrategy and added information that, if not specified, the default behavior will be active capacity (meaning: provision more space in the cluster). The field can still be used for provider-specific buffer types if needed
  • Renamed NodeClassBufferCapacity to ResourceBufferCapacity along with all related fields
  • Moved targetRef into ReplicaBufferCapacity and ResourceBufferCapacity
  • Added documentation for the conditions that will be a starting point for implementation. More conditions can be added in the future if needed. (A rough sketch of the resulting shape follows at the end of this comment.)

I also, for now, removed the recreation strategy. There is still one important use case* that will require some configuration, but in order to unblock implementation we decided to remove it for now, reintroduce it in a follow-up PR to the same AEP, and in the meantime start implementing what we already have. All of these ideas assume that the logic for how much capacity to provision will live in the buffer controller - we still need to get the config options right for the API to be simple and expressive.

We still did not decide whether it is better to embed full pod spec or only a subset of fields related to scheduling. TBD.

  • Use case being: I am a user who has jobs running. I know how much capacity they usually take (or I have a reservation of a certain size that I pay for anyway) and want to keep buffers up to this capacity.
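
(For readers following along, a minimal sketch of the API shape implied by the renames above - the field names and layout here are one reading of this comment, not the final AEP.)

kind: NodeAutoscalingBuffer
spec:
  provisioningStrategy: ActiveCapacity    # assumed default when unset: provision more space
  replicaBufferCapacity:                  # buffer sized relative to a workload
    targetRef:
      kind: Deployment
      name: my-workload
    replicas: 5
  resourceBufferCapacity:                 # buffer sized as a total amount of resources
    targetRef:
      kind: ComputeClass                  # e.g. a provider-specific CRD
      name: my-class
    resources:
      cpu: "16"
      memory: 64Gi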

jbtk avatar Jun 18 '25 12:06 jbtk

Thanks @jbtk, I'll review again soon when I get the chance.

ellistarn avatar Jun 18 '25 19:06 ellistarn

@jbtk, thanks for taking my earlier feedback to heart. I like the direction this is going. I'm not quite grokking the current state of the proposal, as the go structs do not match the yaml specs. Could you make these agree and I'll do another pass? In the meantime, I have a few thoughts:

  1. I like the more specific name that conceptually scopes to "Node". I think we might've gotten a bit too specific, as "NodeBuffer" conveys the same information and is easier to read/write/say. I'd also advocate for NodeHeadroom or NodeReservation ("reserved space in a node, which can be fulfilled by pods").

  2. Have we thought about the permission model of the arbitrary targetRefs? It means that administrators may need to continuously add new permission types to their buffer controller. This gives a slight advantage

  3. I don't understand why it's necessary to include a target inside a resources buffer. Couldn't you let the autoscaler implementation decide which one to choose based off of the scheduling algorithm? For example, in Karpenter, you could use the NodeSelector field in a resources buffer to do:

nodeSelector:
  - key: "karpenter.sh/node-pool"
    operator: In
    values: ["default"]

It strikes me as much more powerful to use nodeSelector here than to force a target. In Karpenter, we would want the default behavior to simply let the normal provision algorithm decide which NodePool to choose based off of cost/scheduling/limits, etc -- users would only constrain things for specific situations.

  4. I'm still not understanding the point of buffer type. As far as I understand this feature only has one value (default), and the use cases are a bit speculative. I'm not confident that the other use cases theorized wouldn't require additional API surface anyways. I'd recommend dropping it and re-adding it once we have more clarity and are directly implementing the feature. It's very common for controllers to implement implementation-specific features like this using annotations, which unblocks providers from experimenting without baking concepts into the API until they're well understood.

ellistarn avatar Jun 20 '25 23:06 ellistarn

@ellistarn responding

  1. It feels to me that NodeBuffer could incorrectly suggest that the buffer consists of empty nodes, and that is not the intention - it is just spare capacity on the nodes (including spare capacity on existing nodes). Therefore NodeBuffer feels misleading and does not represent well what it does. NodeAutoscalingBuffer, however, feels like a better fit, expressing that this is a buffer maintained when autoscaling the nodes. I am happy to come up with a better name (I agree it is not great), but I do not think NodeBuffer is what we need.
  2. What do you mean by "permission model for arbitrary target refs"? If someone wants to support additional target refs, they will need to deploy their own buffer controller(s). Do you mean some mechanism for earlier validation of which target refs are available, so that people learn sooner whether their buffer is incorrect?
  3. I think there are a couple of reasons, none of them particularly strong, so happy to discuss:
    1. node selectors are very expressive - if we want to offer users good observability, it is more convenient for them to reference an object directly; if you decide you want to do this for multiple Karpenter pools, you can use node selectors and leave the single target empty.
    2. if people want to write buffer controllers for their custom CRDs, referencing an object directly will be much easier to handle - not everything related to custom CRDs can be expressed in the form of node selectors.
  4. The use case of "extensibility for different cloud providers to offer custom types" will always be speculative until we add the field and someone starts using it. With defaulting, the user will not need to specify any additional field if they use the default type. If we say that all of this has to be done through annotations, we can end up with users later copy-pasting a buffer between clouds and being surprised that it was misinterpreted because the annotations were ignored. It is possible that certain custom buffer types will need additional fields whose only home is annotations, but it will at least be clear that these types have special handling (and the buffer controller can clearly communicate that a given buffer type is not supported - see the sketch after this list).
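
(To make point 4 concrete, a hedged illustration - the strategy value is hypothetical, and the kind name was still under discussion at this point in the thread.)

kind: NodeAutoscalingBuffer
spec:
  provisioningStrategy: example.com/WarmPool   # hypothetical provider-specific value
  # A controller that does not recognize this value can report the buffer as unsupported
  # instead of silently treating it as active capacity - the failure mode that a
  # copy-pasted annotation-based config would risk.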

jbtk avatar Jun 23 '25 08:06 jbtk

Thanks @jbtk,

For 1, I get your point about the potential confusion. Any thoughts on my suggested name, NodeReservation? This one especially makes sense to me if we decide to support node hibernation with this API. It might also be worth exploring a "Pod" prefix (e.g. PodBuffer, PodReservation) for these names to help clarify the subject.

For 2, apologies -- this thought was half baked and the last sentence incomplete. My thought was that arbitrary targets might require customized RBAC, but the design already assumes that each provider will have a separate implementation w/ separate RBAC, so I think this is solved.


For 3/4, I think these can be summarized as "speculative features that try to address future extensibility". This is philosophical, so I can understand if you feel differently, but I'd recommend that we delay the addition of these additional fields/features until the use case is concrete.

For example, if CAS/GKE/Karpenter need targetRef to support something that cannot be easily supported with NodeSelector, let's add a follow-on that builds the extensibility and maps it directly to the use case.

In my experience, building speculative extensibility carries risk that the feature will not fully meet the future requirements once they're known. This can lead to awkward/unnecessary fields that are confusing to customers. We can easily add to APIs with new features, but removing from APIs is much more costly if not impossible. It's valuable to think of potential extension points (e.g. our discussion on fulfilledBy), even if we don't implement them immediately.

ellistarn avatar Jun 23 '25 20:06 ellistarn

@ellistarn

  • 1: Reservations are being considered at the scheduler level (some docs: an older one, and a newer approach that was rejected but may come back in a similar form: https://github.com/kubernetes/enhancements/pull/5149). Ultimately we do not reserve the space - we only create it and keep it up (there is no guarantee that it will not move between nodes etc., as the scheduler is unaware of this space being taken). As I said - I do not want to create the impression that Buffers operate on a node level, as this is not the intention of this API. I would rather go with ResourceBuffer, but "resource" is again a pretty ambiguous word - though it matches the resources fields in the pod spec, which makes it potentially a match.

  • 3: For the targetRef in resources capacity - this is not something I have a strong feeling about. I think that for any custom object that is a single thing (a GKE custom compute class, or any other custom CRD that users would like to write custom buffers for), it is much more convenient to point to that object directly than to reference it by node selectors (for CCC this happens to be possible, but for a custom CRD not necessarily, which makes it almost the same as pod-based config, with only the added feature of splitting the resources between chunks of the buffer). Also, with a direct ref we could automatically use it for the ownerRef metadata field. Since this is not strictly required - a CCC can be referenced by a node selector (not direct and nice, but working) - I can remove this if others also feel that it is not a good idea.

  • 4: As for the type - this covers very concrete ideas (like stopped VMs or resizability) and has no workaround other than an annotation, which risks a buffer controller not supporting it and acting as if the buffer were for active capacity. This one I would rather leave as is.

Since 3 and 4 are a matter of opinion, it would be great for others to chime in. @towca - are you the right person, or is there someone else who may have an opinion here?

jbtk avatar Jun 24 '25 18:06 jbtk

@jbtk thanks for addressing my comments! I feel like we're very close to alignment, let's finalize this so that we can start implementing something.

IIUC these are the contentious parts that are preventing us from approving the proposal:

  1. The Buffer name.

    • We need to distinguish that it's a Node-autoscaling-related (vs workload-autoscaling) object, so Buffer is too ambiguous.
    • NodeBuffer suggests that the Buffer is "Node-centric" (i.e. "Buffer consisting of Nodes"), while the semantics are actually more "workload-centric" (i.e. "Buffer for a workload"). I agree that this is confusing and would prefer to avoid this option. NodeHeadroom has the same problem.
    • Including Reservation in the name is IMO confusing because nothing will actually be "reserved" for Buffers until we implement the fulfilledBy semantics. Also there's the name collision with scheduler reservations that are being discussed separately, as @jbtk mentioned.
    • So we need a name that is accurate, distinguishes from workload-autoscaling, doesn't suggest that it's "Node-centric", and doesn't have collisions with other k8s concepts. IMO NodeAutoscalingBuffer or ResourceBuffer both satisfy all points (the accuracy part is of course subjective).
    • IMO CapacityBuffer would be slightly better than ResourceBuffer. I'd also like to resurface my early suggestion of node-autoscaling.x-k8s.io/Buffer, which also seems to satisfy all requirements.
  2. Whether to include TargetRef in ResourceBufferCapacity.

    • @jbtk Do I understand correctly that ResourceBufferCapacity is supposed to mean "buffer for a total amount of resources, where the arbitrary CRD just specifies where to create the resources"? If so, I'd name the field something other than TargetRef, since the "target" here seems to be the total resources+chunk config (following the ReplicaBufferCapacity pattern where the "target" is a workload). Maybe NodeSelectorRef/NodeSelectorObject or something similar?
    • @ellistarn Having the reference to the object gives us better observability (we could use it in kubectl etc). It's also easier for the controller to identify the object to translate (imagine if e.g. we have different default chunking strategies depending on the object). NodeSelector can work for this, but it's imprecise and essentially hacking the semantics.
    • Maybe it would be clearer if we put NodeSelector and the Ref in a separate subobject meant to select where to create the capacity (a sketch follows after this list)? @ellistarn could you expand on what your concern is here?
  3. The format of replica count/percentage in ReplicaBufferCapacity.

    • As mentioned in the other comment, for this one I'd defer to prior art in K8s API, or an opinion of a K8s API expert.
    • Maybe we can get Tim to do another review pass?
  4. Whether to include the ProvisioningStrategy field.

    • As @jbtk mentioned, we do have concrete use-cases in mind that we can talk about right now. The alternative is coming back to this discussion in a couple of months. I'd prefer to avoid that, as any iteration seems to have a non-trivial overhead of kicking off and maintaining the discussion.
    • We can focus on the simple use-case that @jbtk mentioned - "stand-by"/"stopped" VMs that can be quickly resumed. Having the ProvisioningStrategy field allows the user to specify that the Buffer should be created using such VMs.
    • @ellistarn I agree that we might need more discussion and design later when the use-cases are more concrete, but it seems safe to assume that we'll need to at least be able to distinguish from the default behavior. Which is all that the field does. Are you concerned that we'd want a completely separate API for this use-case?
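
(A hedged sketch of the "separate subobject" idea from point 2 - the subobject and field names are illustrative, not from the proposal.)

resourceBufferCapacity:
  resources:
    cpu: "64"
    memory: 256Gi
  location:                      # hypothetical subobject: where to create the capacity
    nodeSelector:
      cloud.google.com/compute-class: my-class
    objectRef:                   # optional direct reference, for observability and ownerRef
      apiVersion: example.com/v1
      kind: ComputeClass
      name: my-class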

A lot of this is subjective so it'd be good to hear more opinions. @gjtempleton @jackfrancis @x13n could you weigh in on the points listed above?

towca avatar Jun 25 '25 16:06 towca

@towca @ellistarn

Yesterday I met with @liggitt to discuss embedding the pod spec. Following his advice I decided to get rid of the embedding: in the buffer status we will reference the pod template and save its generation number to ensure the template was not modified by the user. This way we will be able to ensure that we do not over-provision if we reach resource quota limits. The pod template will have an owner reference set, so when the buffer is deleted it will be deleted as well.

I also used the opportunity to consult on the IntOrString type. @liggitt told me that in newer APIs we try to avoid these multipurpose fields, as they make processing harder (validation is harder, and some fields work with one kind of value but not the other). Taking this into account, I would rather leave these fields as they were.
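
(To make the ownerReference/generation mechanics concrete, a hedged sketch - names and values are illustrative, and the kind name was still being debated at this point in the thread.)

apiVersion: v1
kind: PodTemplate
metadata:
  name: my-buffer-template
  ownerReferences:                                # garbage collection deletes the template with the buffer
    - apiVersion: autoscaling.x-k8s.io/v1alpha1   # group/version assumed
      kind: CapacityBuffer
      name: my-buffer
      uid: <uid of the buffer object>
template:
  spec:
    containers:
      - name: shape
        image: registry.k8s.io/pause:3.9
        resources:
          requests:
            cpu: "2"
            memory: 8Gi

# corresponding status on the buffer object
status:
  podTemplateRef:
    name: my-buffer-template
  podTemplateGeneration: 1    # a mismatch with the template's current generation flags user edits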

jbtk avatar Jun 26 '25 07:06 jbtk


Based on the recent comments, there's another point of contention on top of the four listed in my previous comment:

  5. Do we need the controller part for translating a Buffer spec into the virtual Pods that autoscalers inject?
    • The alternative is to have the translation as part of Node autoscaler responsibilities, leveraging a shared library for the implementation.
    • A hard requirement for the translation is extensibility. Regardless of the solution, we need to be able to translate cloud-provider-specific objects on top of the shared part. We can do that both via a controller and a library with similar ease.
    • Using the controller approach we could move the translation logic out of Node autoscalers to reduce their overall scope. We could also have users run controllers for translating their own custom objects. I agree that both these points are pretty speculative, the first iteration wouldn't be utilizing either.
    • When we take quotas into account, the controller approach allows us to similarly not have the quota logic in Node autoscalers. This is out of scope for this first iteration, but it should be a fast follow and probably a requirement for beta - so I wouldn't call that part speculative. The quota logic is also likely to be more involved and have less affinity to Node autoscalers than the translation logic.
    • We've decided on referencing PodTemplate objects instead of inlining them on the advice of k8s API experts. So if we don't go with the controller approach but still want to have the pod spec in the status for observability, the shared library would have to create PodTemplate objects. This is pretty natural for a controller, but seems weird for such a library.
    • IMO it's pretty subjective which solution is "cleaner" or incurs less cognitive burden. I see the argument for "the proposal is less complex when we take the controller part out" (this was my first intuition as well), but on the other hand I definitely see the argument for "the simpler proposal without controller further bloats Node autoscaler scope, resulting in a more complex end-state". The ever-expanding scope is one of the biggest problems we see in CA, so delegating new not-strictly-autoscaling responsibilities to other components (even if it's just separating them under the same binary to start with) seems worthwhile.
    • IMO out of all the above, the quota part seems the most convincing. We'll have to implement quota logic soon, and I'd strongly prefer not to have to add it to CA directly.

towca avatar Jun 30 '25 13:06 towca

On a separate note - IMO both this and previous CA/Karpenter API alignment attempts (e.g. https://github.com/kubernetes/kubernetes/pull/124800) turned out to be surprisingly difficult. We seem to frequently discuss things in circles, which then either makes the proposal die if it's low priority (like safe-to-evict/do-not-disrupt), or makes it hard to judge alignment progress and plan implementation if it's high priority (like this one).

I wonder what we could change to improve the process, especially given that we're likely going to have increasingly more proposals of similar scope in the future. WDYT about doing some sort of retrospective about this proposal after we're done?

towca avatar Jun 30 '25 14:06 towca

@towca

For the format of ReplicaBufferCapacity - based on the discussion with @liggitt it seems that an IntOrString field is not the easiest to maintain over time, so I would prefer to stay with separate fields.

As for the structure: following a suggestion from @x13n I changed the names - percent to value, and the ReplicaBasedCapacity field from replicas to controllerBased - so that we do not have percent.percent or replicas.replicas fields.
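
(If I'm reading the renames right, this avoids the repeated nesting shown below - the exact field shapes are assumed.)

# before: nested names repeat
percent:
  percent: 20
replicas:
  replicas: 5

# after
percent:
  value: 20
controllerBased:
  replicas: 5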

jbtk avatar Jul 01 '25 08:07 jbtk

For the controller discussion there is one more argument: for buffers targeting objects that implement the scale subresource, if we go with a library and no pod template in the status, we would be able to buffer capacity only while at least one pod is running in the cluster. This is a limitation for people who want buffers for jobs that run only part of the day.

@ellistarn @towca @jonathan-innis as FYI

jbtk avatar Jul 09 '25 07:07 jbtk

For the controller discussion there is one more argument: for buffers targeting objects that implement the scale subresource, if we go with a library and no pod template in the status, we would be able to buffer capacity only while at least one pod is running in the cluster. This is a limitation for people who want buffers for jobs that run only part of the day.

I will say this sort of breaks the "I can rebuild status from the current state of the world" tenet that is typically prescribed for status, though I will note that this is not a hard-and-fast rule. Agreed that this might help with the scale-from-0 scenario; it might also be nice to build in higher-order logic and fall back to the status subresource to help with these scale-from-0 scenarios.

jonathan-innis avatar Jul 21 '25 21:07 jonathan-innis

On a separate note - IMO both this and previous CA/Karpenter API alignment attempts (e.g. https://github.com/kubernetes/kubernetes/pull/124800) turned out to be surprisingly difficult

Keeping this thread updated since there were a couple of offline discussions between some CAS maintainers (@towca/@jbtk) and some Karpenter maintainers (@jonathan-innis/@ellistarn). We're still working on alignment. I have hope that we can come to some middle ground here, because I can see this kind of API surface being utilized heavily by external components (like Kueue at some point), and getting the benefit in both autoscalers rather than having to build custom support into each seems far more ideal.

jonathan-innis avatar Jul 21 '25 21:07 jonathan-innis

@ellistarn @jonathan-innis @jackfrancis @towca @x13n

Hi! Today we discussed this with our product team and, based on those discussions, would like to put up a changed design for comments: https://docs.google.com/document/d/1wrPLdWqPRJzRTDPMT7WRYDYYx7L65LNd99qBD6y6U38/edit?tab=t.0

Changes:

  • dropped the resource-based config
  • introduced top-level resource limits that work in two ways: they limit the total size of the buffer (ensuring that if someone changes the pod template, the buffer does not suddenly grow without control), and if replica numbers are not provided they are used to calculate how many chunks (from the pod template, or from the pod spec running within a deployment) fit into this buffer and should be provisioned (see the sketch after this list)
  • since we have top-level limits, we dropped max replicas from the replica-based part and changed the semantics so that replicas is treated as min replicas when defined together with a percent value (getting rid of another level of type nesting)
  • changed names, including renaming Buffer to CapacityBuffer (currently the winning option in the voting).
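
(A hedged illustration of the two limit behaviors described above - field names assumed, values illustrative.)

kind: CapacityBuffer
spec:
  podTemplateRef:
    name: batch-chunk          # defines the shape of one chunk, e.g. 2 CPU / 8Gi
  limits:                      # top-level limits
    cpu: "32"
    memory: 128Gi
  # replicas omitted: the controller computes how many chunks fit under limits (16 here).
  # replicas set: limits still cap the total buffer size if the template later grows.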

The names were discussed between me, @towca and @x13n - we are well aware that there are places where these could be better, but we ran out of ideas. I am happy to take comments (especially ones with better ideas) :)

jbtk avatar Jul 24 '25 12:07 jbtk

@jbtk and I had a conversation this morning -- there's been some more thought put into the API design and use-cases and I'm summarizing some of the decisions that we took:

  1. We settled on CapacityBuffer as the kind naming — we’ll go forward with this in the AEP, it disambiguates what Buffer means
  2. We shifted from having the pod template inlined in either the spec or the status to using the PodTemplate object present in K8s. There's a couple of reasons for this: First, it comes pre-packaged with apiserver validation, rather than having to copy every part of the pod spec into our API (this was also recommended since there's a push against inlining PodSpec in newer APIs). Second, it allows us to show what the resolved spec is for the buffer without muddying the spec too much — so you can get visibility, if you want it, into the pod that we are using for modeling, but don't actually need it in the CapacityBuffer spec
  3. We agreed that ProvisioningStrategy was reasonable to keep since it could have values like Default or WarmPool as ways to handle the capacity in the future when users want to opt for non-active capacity to save money
  4. I proposed two versions of the API that I’m personally comfortable with when it comes to naming, symmetry, and avoiding over-nesting (I’m personally good with either of these options to be the proposed design)

API 1 (separate stanzas for validation for podTemplate and scaleTarget)

kind: CapacityBuffer
spec:
  podTemplate:
    objectRef:
    replicas:
  scaleTarget: # HPA uses same naming
    objectRef:
    percentage:
    replicas:
  provisioningStrategy: <>
  limits:
status:
  conditions:
  podTemplateRef:
    name:
  podTemplateGeneration:
  replicas:

API 2 (flattened API)

kind: CapacityBuffer
spec:
  podTemplateRef:
    name:
  scaleTargetRef:
    name:
  percentage:
  replicas:
  limits:
status:
  conditions:
  podTemplateRef:
    name:
  podTemplateGeneration:
  replicas:
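
(For concreteness, a hedged example of a filled-in buffer using the flattened shape - all names and values are illustrative.)

kind: CapacityBuffer
metadata:
  name: web-headroom
spec:
  scaleTargetRef:
    name: web-frontend             # e.g. a Deployment implementing the scale subresource
  percentage: 20                   # buffer sized as 20% of the target's replica count
status:
  conditions:
    - type: Provisioning           # hypothetical condition type
      status: "True"
  podTemplateRef:
    name: web-headroom-template    # resolved by the controller for observability
  podTemplateGeneration: 1
  replicas: 4                      # 20% of a 20-replica target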

cc: @DerekFrank

jonathan-innis avatar Jul 25 '25 19:07 jonathan-innis

Updated the proposal to reflect the changes. The field names differ slightly from what @jonathan-innis proposed - please review and comment directly on the PR.

Note: while discussing with @jonathan-innis, he noticed that there might be security considerations around referencing across namespaces. For now I changed the pod template reference to a name-only string within the same namespace, and I am discussing options with @liggitt. One open option: adding a reference grant to allow cross-namespace referencing.

jbtk avatar Jul 28 '25 15:07 jbtk

@ellistarn @towca

I changed the names and yaml structure based on the comment from @liggitt (thanks!).

Are there any more things that are blocking this PR from being approved for merge?

jbtk avatar Aug 07 '25 11:08 jbtk

Are there any more things that are blocking this PR from being approved for merge?

Lgtm

ellistarn avatar Aug 07 '25 13:08 ellistarn

The current proposal LGTM. We got an LGTM from Karpenter and Jordan as well. IMO this is good to merge as soon as the remaining nits from me and Jordan are addressed.

/lgtm

towca avatar Aug 11 '25 13:08 towca

I added some additional nits, but overall I'm happy with the direction of the API that represents a consensus between @ellistarn @liggitt @towca (and others). lgtm after incorporation of that last mile feedback

jackfrancis avatar Aug 11 '25 15:08 jackfrancis

Holding for the one remaining discussion: https://github.com/kubernetes/autoscaler/pull/8151#discussion_r2267037122. Feel free to unhold after the discussion is finalized.

/lgtm
/approve
/hold

towca avatar Aug 12 '25 12:08 towca

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jbtk, towca

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Aug 12 '25 12:08 k8s-ci-robot

/unhold

jbtk avatar Aug 12 '25 14:08 jbtk