tectonic-installer
Make kubelet node labels a key=value pair
FEATURE REQUEST
- As part of https://github.com/kubernetes/kubernetes/pull/39112, the Tectonic installer sets the node label without a value:
node-role.kubernetes.io/master=""
This makes Prometheus service discovery unusable.
Why it is unusable: there is no easy way to filter controller nodes vs. worker nodes, and we need to scrape etcd, which lives on the same nodes as the control plane.
On Tectonic 1.5.x there were master / worker labels set to true, which made Kubernetes SD super easy.
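To illustrate, a scrape config along these lines (a sketch; the job name and relabel rule are illustrative, not from the installer) cannot select controller nodes, because the sanitized meta label for node-role.kubernetes.io/master carries an empty value, which relabeling treats the same as an absent label:

```yaml
# Sketch: try to keep only controller nodes for etcd scraping.
# Because the node label's value is "", the meta label below is empty
# on masters and absent on workers -- the regex matches neither.
scrape_configs:
  - job_name: etcd
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      - source_labels: [__meta_kubernetes_node_label_node_role_kubernetes_io_master]
        regex: .+
        action: keep
```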
Versions
- Tectonic version (release or commit hash): 1.6.4 / 1.6.7
- Platform (aws|azure|openstack|metal): metal/aws
What you expected to happen?
Being able to easily filter the worker nodes and controllers.
How to reproduce it (as minimally and precisely as possible)?
Deploy Prometheus and, using the Kubernetes node SD, try to filter only controllers or only workers.
@joshrosso
xref: https://github.com/coreos/tectonic-installer/issues/667
Somewhat related; it could give us more flexibility in node labeling via the installer and, as a side effect, provide a workaround until we solve this 'the right way'.
/cc @brancz
Yes, as I was troubleshooting this yesterday I looked at the Prometheus code and realized that an empty label value is basically equivalent to no label. So this is indeed the kind of change we need.
@brancz playing devil's advocate for a sec: why not change Prometheus, since a value-less label query is a valid thing in Kubernetes?
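To make the mismatch concrete, here is a minimal Python sketch of the two semantics being discussed (the function names and dict-based model are illustrative, not from either codebase):

```python
def k8s_exists_selector(node_labels, key):
    """Kubernetes 'Exists' selector semantics (sketch): a key matches
    even when its value is empty."""
    return key in node_labels


def prom_label_value(meta_labels, name):
    """Prometheus relabeling semantics (sketch): an absent label and a
    label with an empty value both read back as ""."""
    return meta_labels.get(name, "")


master = {"node-role.kubernetes.io/master": ""}
worker = {}

# Kubernetes can tell the nodes apart via an Exists selector...
assert k8s_exists_selector(master, "node-role.kubernetes.io/master")
assert not k8s_exists_selector(worker, "node-role.kubernetes.io/master")

# ...but Prometheus relabeling cannot: both values are "".
assert (prom_label_value(master, "node-role.kubernetes.io/master")
        == prom_label_value(worker, "node-role.kubernetes.io/master"))
```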
To be honest, I'm not sure how much effort it would be upstream. In my opinion we should support it; it's just a question of how breaking it is and how much effort.
This would be quite painful to change at this point in k8s... and I guess there are other uses for valueless labels in Prom as well
@brancz what was the conclusion on the Prometheus side? In my testing, Prometheus 2.1 doesn't distinguish between non-existent key-value pairs and pairs with an "" value (need to check Prom 2.2).
Given node-role.kubernetes.io/master="" can't be undone, I'm considering adding another label (node-role.kubernetes.io/controller="true") to masters (i.e. controllers) in the Typhoon distro to unblock discovery work. Label not final. https://github.com/poseidon/typhoon/pull/160
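With a label that carries a non-empty value, such as the proposed node-role.kubernetes.io/controller="true", the filter becomes straightforward (a sketch; the label itself is the proposal above, not final):

```yaml
# Sketch: keep only nodes carrying the proposed controller label.
relabel_configs:
  - source_labels: [__meta_kubernetes_node_label_node_role_kubernetes_io_controller]
    regex: "true"
    action: keep
```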
Changing this behavior would be too breaking, so it's not going to be changed. Any label that has a value would work, your suggestion sounds good to me.
Ultimately however, I don't believe etcd should be monitored by relabeling (relatively random) node information, best would be if etcd were self hosted or at least static pods, that way the pods could be discovered like any other workload on Kubernetes.
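If etcd ran as pods (self-hosted or static), discovery could use the pod role instead of node labels; a sketch, where the k8s-app=etcd pod label is an assumption for illustration:

```yaml
# Sketch: discover etcd like any other workload via the pod role.
kubernetes_sd_configs:
  - role: pod
relabel_configs:
  - source_labels: [__meta_kubernetes_pod_label_k8s_app]
    regex: etcd
    action: keep
```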
Yeah, agreed, it would be ideal if the cluster's etcd was self-hosted and discovered as a Kubernetes service. Unfortunately, for now, that work stalled out due to difficulties and bootkube isn't targeting it anymore, so I'm guessing it will be a long time before that's viable.
Yes, I just wanted to make sure everyone is aware that those things are workarounds at best. Any reason why static pods for etcd are not viable (you could have a node pool specifically for etcd)?
X-ref: https://github.com/prometheus/prometheus/issues/3726#issuecomment-365337866
This is still a pain point for resource planning, especially if we don't want to add random labels that carry no purpose besides working around a monitoring limitation.
Unfortunately, for now, that work stalled out due to difficulties and bootkube isn't targeting it anymore, so I'm guessing it will be a long time before that's viable.
Oh? I thought everyone was moving towards self-hosting?
Either way, it would make sense to have a role label with the role as the key. But the current labels are pretty much 'standardized', and I guess changing them would break a lot of things. Adding a new label would make sense, though.
Just so there is no misunderstanding (for readers who stumble on this without context): bootkube does self-host the control plane, and others are aiming to do the same. Self-hosting etcd itself was a separate and much harder challenge that bootkube is no longer working on; you bring your own etcd when using bootkube.