IG: "kops.k8s.io/instancegroup" property missing under "nodeLabels" for instance groups created via "kops create cluster" command
/kind bug
1. What kops version are you running? The command kops version will display this information.
Tested with Client version: 1.28.4 (git-v1.28.4)
and Client version: 1.27.3 (git-v1.27.3)
2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running, or provide the Kubernetes version specified as a kops flag.
N/A
3. What cloud provider are you using? AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
$ kops create cluster --cloud=aws --dns=private --zones=us-west-2a --name kops.example.com --dry-run -o yaml
# [...]
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: kops.example.com
  name: control-plane-us-west-2a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20240126
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - us-west-2a
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: kops.example.com
  name: nodes-us-west-2a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20240126
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  role: Node
  subnets:
  - us-west-2a
$ kops --name kops.example.com create instancegroup zzz --dry-run -oyaml
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: kops.example.com
  name: zzz
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20240126
  kubelet:
    anonymousAuth: false
    nodeLabels:
      node-role.kubernetes.io/node: ""
  machineType: t3.medium
  manager: CloudGroup
  maxSize: 2
  minSize: 2
  nodeLabels:
    kops.k8s.io/instancegroup: zzz
  role: Node
  subnets:
  - us-west-2a
5. What happened after the commands executed?
Check Answer 4.
6. What did you expect to happen?
I would expect all properties, especially kops.k8s.io/instancegroup under nodeLabels, to also be created by the kops create cluster command, the same way kops create instancegroup does.
The whole kubelet property is also missing when creating a cluster, so ideally all "default" properties would be aligned between the create cluster and create instancegroup commands.
7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
Check Answer 4.
8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.
9. Anything else we need to know?
I have many existing clusters (k8s v1.27, which have been upgraded many times) with the kops.k8s.io/instancegroup node label set, so this may have been working before, or the label may have been set as part of a previous kops upgrade.
Is this a bug?
Without too much understanding of the codebase design, I notice there's a fallback to the "node" role type. If I'm not off-track, this is more of an enhancement around a null-check anti-pattern in a codebase hotspot, so perhaps nothing to worry about.
@teocns I'm not sure I understand your comment, but the issue is not related to the actual node role type; that works just fine.
The issue is that the label (at node level) which identifies the kops instance group for each specific node is missing when using the kops create cluster command. We can manually add it afterwards, but I wouldn't expect to have to do that, especially because when you create a new instance group the nodeLabels property is automatically injected (as described in point 4), and also because this was "working" at some point before (nodes from clusters created with older kops versions contain the label).
For our environments this was a breaking change (and we had to manually update and roll out the IGs), because we actively use things like Kubernetes (anti-)affinity rules and Prometheus metrics that rely on the value of the kops.k8s.io/instancegroup node label.
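For context, a minimal sketch of the kind of node-affinity rule that depends on this label; the pod name, image, and instance group value below are illustrative, only the label key comes from kops:
apiVersion: v1
kind: Pod
metadata:
  name: example-pinned-pod                    # hypothetical name, for illustration only
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kops.k8s.io/instancegroup    # the node label this issue is about
            operator: In
            values:
            - nodes-us-west-2a                # illustrative instance group name
  containers:
  - name: app
    image: nginx                              # placeholder image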
The actual YAML property that I would expect to be present for each IG created via the kops create cluster command is 👇
nodeLabels:
  kops.k8s.io/instancegroup: <IG_NAME>
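In our case the manual workaround was roughly: add the nodeLabels block above to each IG with kops edit instancegroup <IG_NAME>, apply it with kops update cluster --yes, and roll the nodes with kops rolling-update cluster --yes (exact steps may vary per setup). Afterwards, kubectl get nodes -L kops.k8s.io/instancegroup shows the label for every node.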
Gotcha, you rely on the label as an affinity selector within your own workflow, while my observation was oriented more towards kops' own functional integrity. Thanks for clarifying -
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.