(Experimental) bare-metal with IPv6
IPv6 brings some new complexities, particularly around IPAM.
We create a test and then fix a few things:
- We need to assign the podCIDR for IPv6, so we add support to kops-controller. The source of this information is the Host CRD (a rough sketch follows this list).
- Because we are assigning the podCIDR from the Host CRD, we need Host records for the control-plane nodes. However, there are bootstrapping problems around creating a CRD during enrollment of the control-plane nodes. So instead, we can now generate a Host object as yaml and apply it separately. A high-security workflow would probably create the Host records separately anyway, because they are how we validate nodes.
- Previously we were always setting the kubelet `cloud-provider=external` flag, but this assumes we are running a CCM. If we are not running a CCM (as on metal), we should not set the flag. If we do set it, kubelet adds the `node.kops.k8s.io/uninitialized` taint for the CCM to clear, and nobody ever clears it.
- We need to make sure there is an IPv6 default route so that kubelet can discover its node IP correctly. We could put this into the Host CRD, but it seems like most nodes will have a default route anyway.
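As a rough illustration of the first point, the sketch below shows one way kops-controller could copy podCIDRs from a Host record onto the matching Node. The `hostSpec` type and its `PodCIDRs` field are stand-ins for the real Host CRD schema, and the patching detail is only a sketch, not the implementation in this PR.

```go
// Hypothetical sketch only: hostSpec stands in for the Host CRD's spec, and
// the PodCIDRs field name is an assumption, not the actual kOps API.
package ipam

import (
	"context"
	"encoding/json"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

type hostSpec struct {
	PodCIDRs []string
}

// syncPodCIDRs copies the podCIDRs configured on a Host record onto the
// corresponding Node, but only if the Host actually declares them and the
// Node has not been assigned podCIDRs yet (spec.podCIDRs is immutable once set).
func syncPodCIDRs(ctx context.Context, client kubernetes.Interface, nodeName string, host hostSpec) error {
	if len(host.PodCIDRs) == 0 {
		// No podCIDRs on the Host object; leave IPAM to whatever else is configured.
		return nil
	}

	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	if len(node.Spec.PodCIDRs) != 0 {
		// Already assigned; nothing to do.
		return nil
	}

	patch, err := json.Marshal(map[string]any{
		"spec": map[string]any{
			"podCIDR":  host.PodCIDRs[0],
			"podCIDRs": host.PodCIDRs,
		},
	})
	if err != nil {
		return err
	}

	_, err = client.CoreV1().Nodes().Patch(ctx, nodeName, types.StrategicMergePatchType, patch, metav1.PatchOptions{})
	return err
}
```

If the Host carries no podCIDRs, nothing is patched, which is the opt-out behaviour discussed further down the thread.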
I am uploading this now and will rebase as I/we fix each problem.
Current problem is from nodeup:
```
vm0 nodeup[703]: W1111 17:07:00.322041 703 main.go:133] got error running nodeup (will retry in 30s): error building loader: building *model.PrefixBuilder: kOps IPAM controller not supported on cloud "metal"
```
So we need to decide how the podCIDR is assigned!
/retest
/retest
/retest
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the PR is closed
You can:
- Mark this PR as fresh with `/remove-lifecycle stale`
- Close this PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the PR is closed
You can:
- Mark this PR as fresh with `/remove-lifecycle rotten`
- Close this PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
cc @hakman
I think this is now uncontroversial (I hope). We assign podCIDRs to nodes if they are configured on the Host object. If users don't want to do that, they just don't set podCIDRs on the Host object.
Cool, I will take a look soon. 🚀
/retest
/test all
/hold in case you want to update the other APIs (could also be a separate PR).
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: hakman
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [hakman]
Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment
/test pull-kops-e2e-k8s-aws-amazonvpc
/test pull-kops-e2e-k8s-gce-cilium
/test pull-kops-e2e-k8s-aws-calico
/test pull-kops-e2e-k8s-aws-amazonvpc
/hold cancel
I propose adding a round-trip test alongside fixing the missing field in v1alpha3.
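For the round-trip test, something along these lines would do; the `versionedHost`/`internalHost` types and the two conversion helpers here are placeholders, not the generated kOps v1alpha3 conversion functions or the actual field that was missing.

```go
// Sketch of a round-trip test: convert versioned -> internal -> versioned and
// require that nothing is dropped. Types and conversions are placeholders for
// the generated kOps v1alpha3 conversions.
package v1alpha3_test

import (
	"reflect"
	"testing"
)

type versionedHost struct{ PodCIDRs []string }
type internalHost struct{ PodCIDRs []string }

func toInternal(in versionedHost) internalHost  { return internalHost{PodCIDRs: in.PodCIDRs} }
func toVersioned(in internalHost) versionedHost { return versionedHost{PodCIDRs: in.PodCIDRs} }

func TestHostRoundTrip(t *testing.T) {
	original := versionedHost{PodCIDRs: []string{"2001:db8::/64", "10.0.0.0/24"}}
	roundTripped := toVersioned(toInternal(original))
	if !reflect.DeepEqual(original, roundTripped) {
		t.Errorf("round-trip mismatch: got %+v, want %+v", roundTripped, original)
	}
}
```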