autoscaler icon indicating copy to clipboard operation
autoscaler copied to clipboard

helm: enable clusterapi namespace autodiscovery

Open jackfrancis opened this issue 2 years ago • 16 comments

What type of PR is this?

What this PR does / why we need it:

This PR enables usage of the clusterapi provider's namespace node autodiscovery configuration in the helm chart, by filtering for Machines by Cluster namespace (Machine and Cluster are Cluster API CRDs that represent a node and a Kubernetes cluster, respectively).

I've reworked the combinatoric helm template foo a bit so hopefully it's a bit easier to wrangle going forward (I think there's no way to avoid it being sort of horrible).

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


jackfrancis avatar May 17 '23 00:05 jackfrancis

/assign @elmiko for a first pass to make sure I'm doing something sane here

jackfrancis avatar May 17 '23 00:05 jackfrancis

thanks @jackfrancis , just starting to take a look

elmiko avatar May 17 '23 14:05 elmiko

On the other hand maybe just passing in clusterNamespace in the auto discovery config for Cluster API doesn't work like I think it does?

On a cluster I have running right now:

--node-group-auto-discovery="[clusterapi:clusterNamespace=capz-e2e-3dyctz]"

I've got a cluster in that namespace, and its MachineDeployment is properly annotated:

$ k get clusters -n capz-e2e-3dyctz
NAME                 PHASE         AGE   VERSION
capz-e2e-3dyctz-ha   Provisioned   22m   
$ k get machinedeployment/capz-e2e-3dyctz-ha-md-0 -n capz-e2e-3dyctz -o yaml | grep 'autoscaler'
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "500"
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "1"

However, the cluster-autoscaler logs suggest that the machines in the above MachineDeployment aren't in a discovered node group:

I0517 18:24:34.147506       1 pre_filtering_processor.go:57] Node capz-e2e-3dyctz-ha-md-0-qrz8p should not be processed by cluster autoscaler (no node group config)
I0517 18:24:34.147541       1 pre_filtering_processor.go:57] Node capz-e2e-3dyctz-ha-md-0-t7qm5 should not be processed by cluster autoscaler (no node group config)

If it's not clear, I was expecting clusterNamespace to be a 1st class auto discovery configuration, like "add all clusters that match this namespace query". Is that incorrect?

Thanks!

jackfrancis avatar May 17 '23 18:05 jackfrancis

Nevermind! I was using the wrong clusterapi key for --node-group-auto-discovery, should be namespace= not clusterNamespace=

jackfrancis avatar May 17 '23 19:05 jackfrancis

not sure why the helm docs CI is complaining (I did make changes there but I think that should be allowed)

jackfrancis avatar May 23 '23 22:05 jackfrancis

Ah, that's something I've been meaning to update the docs on as it keeps on catching people out, the helm-docs action generates the README.md from the README.md.gotmpl, so changes to the text portions need to be made there.

gjtempleton avatar May 23 '23 22:05 gjtempleton

@gjtempleton thanks for the pro tip, done!

jackfrancis avatar May 23 '23 23:05 jackfrancis

@gjtempleton I wonder if this change in my PR that merged yesterday (which passed CI) is throwing a wrench in this PR?

  • https://github.com/kubernetes/autoscaler/pull/5768/files#diff-d0c14bc33196b8f56c06b62d79b3f92cd75f94279dedecb4f3982493eafa9054

jackfrancis avatar May 23 '23 23:05 jackfrancis

Please rebase the PR.

mwielgus avatar Jun 26 '23 12:06 mwielgus

@mwielgus rebased, thx for the nudge

@gjtempleton can't seem to figure out how to make helm-docs happy on this one...

jackfrancis avatar Jun 27 '23 00:06 jackfrancis

@jackfrancis the fact the README.md hasn't been regenerated with new variable in the values file included in it is causing the failing test:

$ helm-docs charts/. && git diff --unified=0 | cat
INFO[2023-06-28T17:44:19+01:00] Found Chart directories [charts/cluster-autoscaler]
INFO[2023-06-28T17:44:19+01:00] Generating README Documentation for chart charts/cluster-autoscaler
diff --git a/charts/cluster-autoscaler/README.md b/charts/cluster-autoscaler/README.md
index cc18f7920..b054affee 100644
--- a/charts/cluster-autoscaler/README.md
+++ b/charts/cluster-autoscaler/README.md
@@ -347,0 +348 @@ vpa:
+| autoDiscovery.namespace | string | `nil` |  |

gjtempleton avatar Jun 28 '23 16:06 gjtempleton

@gjtempleton brew install norwoodj/tap/helm-docs && helm-docs did the trick :)

jackfrancis avatar Jun 28 '23 18:06 jackfrancis

@jackfrancis happy to help get this merged, is a rebase and review all that's needed?

elmiko avatar Nov 16 '23 14:11 elmiko

@jackfrancis, Please rebase the PR so that it will merge.

Shubham82 avatar Jan 16 '24 08:01 Shubham82

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jackfrancis, mwielgus

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Mar 18 '24 11:03 k8s-ci-robot