k0sctl
CoreOS support
Hey folks, I was wondering if you plan to support CoreOS (Fedora and RHEL). Any idea what it would take to add this?
As of now CoreOS is detected as Fedora, so k0sctl tries to install packages using yum, which is not present in CoreOS, and the install fails.
Thanks
So I have done some more digging, and it seems there should be a distinction in OS detection. That is, if the OS is detected as CoreOS, k0sctl shouldn't try to install kubectl using yum but should fall back to the curl method. (Alternatively it could use rpm-ostree, but the node needs to be rebooted for the changes to take effect, so some kind of "reboot and wait until the node is back up" function would be needed.)
@kaplan-michael to add support for CoreOS, maybe have a look at how Alpine support is implemented: https://github.com/k0sproject/k0sctl/blob/main/configurer/linux/alpine.go
Alpine does have package management, so for CoreOS you'd need some way to pick up the package from func (l Alpine) InstallPackage(h os.Host, pkg ...string) error { and use the proper download URL etc.
So I had a look at it. For it to be properly implemented, the rig library would have to provide more checks, as CoreOS has the ID of the system it is based on (ID=fedora). The two are distinguished by VARIANT_ID or VARIANT, so rig would ideally have to parse those too. @jnummelin do you have any guidance about it? Thanks in advance.
@kaplan-michael yeah, sounds like we need to add more info to the rig OSVersion struct: https://github.com/k0sproject/rig/blob/main/osversion.go
The parsing happens at https://github.com/k0sproject/rig/blob/main/resolver.go#L110, so it should not be too big an effort to make it understand more fields.
I do not have a CoreOS box at hand, could you share an example of the /etc/os-release file it has?
I'll have a look at it over the weekend.
Here you have the example of the one I have on hand.
NAME=Fedora
VERSION="33.20210301.3.1 (CoreOS)"
ID=fedora
VERSION_ID=33
VERSION_CODENAME=""
PLATFORM_ID="platform:f33"
PRETTY_NAME="Fedora CoreOS 33.20210301.3.1"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:33"
HOME_URL="https://getfedora.org/coreos/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora-coreos/"
SUPPORT_URL="https://github.com/coreos/fedora-coreos-tracker/"
BUG_REPORT_URL="https://github.com/coreos/fedora-coreos-tracker/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=33
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=33
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="CoreOS"
VARIANT_ID=coreos
OSTREE_VERSION='33.20210301.3.1'
Rig already parses the PRETTY_NAME field, so a specific OS implementation could use an "ID equals 'fedora' and name contains 'CoreOS'" condition for detection without needing additions to rig. Not that I'm against enhancing it. :)
That sounds good, although I think the variant fields are a more reliable source. It would be great to enhance rig in the future.
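To make the two detection options above concrete, here is a minimal sketch that parses os-release content and checks VARIANT_ID first, falling back to the PRETTY_NAME check. It is not rig's actual parser: parseOSRelease and isFedoraCoreOS are hypothetical names, and the quote handling is simplified (it only strips surrounding quotes).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseOSRelease (hypothetical helper) turns the KEY=value lines of an
// os-release file into a map. Blank lines and comments are skipped.
func parseOSRelease(content string) map[string]string {
	fields := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue
		}
		// Simplification: strip surrounding single or double quotes only.
		fields[key] = strings.Trim(value, `"'`)
	}
	return fields
}

// isFedoraCoreOS (hypothetical helper) prefers VARIANT_ID and falls back
// to the PRETTY_NAME check when the variant field is missing.
func isFedoraCoreOS(fields map[string]string) bool {
	if fields["ID"] != "fedora" {
		return false
	}
	if fields["VARIANT_ID"] == "coreos" {
		return true
	}
	return strings.Contains(fields["PRETTY_NAME"], "CoreOS")
}

func main() {
	// Excerpt of the os-release sample posted in this thread.
	sample := `NAME=Fedora
ID=fedora
PRETTY_NAME="Fedora CoreOS 33.20210301.3.1"
VARIANT="CoreOS"
VARIANT_ID=coreos
OSTREE_VERSION='33.20210301.3.1'`
	fmt.Println(isFedoraCoreOS(parseOSRelease(sample)))
}
```

Checking VARIANT_ID first keeps the match independent of how the human-readable name is worded, while the fallback still works with only the fields rig parses today.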
OK, I can reliably install the package using curl. I would like to implement it using rpm-ostree, as that would keep the packages updated, but the issue I'm running into there is the need to reboot, which will break the connection. I'm not sure whether that could be easily implemented. I wasn't able to find out whether rig can reconnect after the connection is lost, similar to what Ansible does with reboot, where it waits until the host is back up and reconnects. If not, I'll send a PR with curl only for now.
So here is the commit be8f6cf, I would appreciate if you could have a look at it, I'll test it more before I send the PR.
BTW: I still have issues with the etcd user being ignored in my configs. The intention is to run etcd under "k0s-etcd" instead of "etcd". I can share the config if it helps, but it is mostly just taken from the examples.
@kaplan-michael in general the work in linked commit looks pretty good.
I wasn't able to find out whether rig can reconnect after the connection is lost
IIRC rig does not do any "auto-reconnect" but not 100% sure. Maybe we can tackle that at k0sctl level, so once we know the reboot is happening we can tell rig to reconnect. Something like (pseudo-ish code):
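func (l Coreos) InstallPackage(h os.Host, pkg ...string) error {
	// Install via rpm-ostree and reboot so the new packages take effect.
	err := h.Execf("sudo rpm-ostree install %s --reboot", strings.Join(pkg, " "))
	if err != nil {
		return err
	}
	// The reboot drops the connection, so connect again afterwards.
	return h.Connect()
}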
@kke might have some better ideas for the reconnect part :)
This is from another project using rig:
// Reconnect disconnects and reconnects the host's connection
func (h *Host) Reconnect() error {
	h.Disconnect()

	log.Infof("%s: waiting for reconnection", h)
	return retry.Do(
		func() error {
			return h.Connect()
		},
		retry.DelayType(retry.CombineDelay(retry.FixedDelay, retry.RandomDelay)),
		retry.MaxJitter(time.Second*2),
		retry.Delay(time.Second*3),
		retry.Attempts(60),
	)
}
// Reboot reboots the host and waits for it to become responsive
func (h *Host) Reboot() error {
	log.Infof("%s: rebooting", h)
	if err := h.Configurer.Reboot(h); err != nil {
		return err
	}

	log.Infof("%s: waiting for host to go offline", h)
	if err := h.waitForHost(false); err != nil {
		return err
	}
	h.Disconnect()

	log.Infof("%s: waiting for reconnection", h)
	if err := h.Reconnect(); err != nil {
		return fmt.Errorf("unable to reconnect after reboot: %w", err)
	}

	log.Infof("%s: waiting for host to become active", h)
	if err := h.waitForHost(true); err != nil {
		return err
	}

	if err := h.Reconnect(); err != nil {
		return fmt.Errorf("unable to reconnect after reboot: %w", err)
	}

	return nil
}
// waitForHost waits for the host to become active when state is true,
// and for the connection to go down when state is false
func (h *Host) waitForHost(state bool) error {
	err := retry.Do(
		func() error {
			err := h.Exec("echo")
			if !state && err == nil {
				return fmt.Errorf("still online")
			} else if state && err != nil {
				return fmt.Errorf("still offline")
			}
			return nil
		},
		retry.DelayType(retry.CombineDelay(retry.FixedDelay, retry.RandomDelay)),
		retry.MaxJitter(time.Second*2),
		retry.Delay(time.Second*3),
		retry.Attempts(60),
	)
	if err != nil {
		// Report which of the two states the wait timed out on.
		if state {
			return fmt.Errorf("failed to wait for host to come online: %w", err)
		}
		return fmt.Errorf("failed to wait for host to go offline: %w", err)
	}
	return nil
}
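For anyone who wants to try the reboot-and-wait flow without rig or a retry library, the same idea can be sketched with the standard library alone. This is only an illustration under assumptions: waitForState and rebootAndWait are hypothetical names, and the probe and reboot callbacks stand in for something like h.Exec("echo") and the actual reboot command.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitForState (hypothetical helper) polls probe until the host is in the
// wanted state (true = reachable) or the attempts run out.
func waitForState(probe func() error, wantOnline bool, attempts int, delay time.Duration) error {
	for i := 0; i < attempts; i++ {
		online := probe() == nil
		if online == wantOnline {
			return nil
		}
		time.Sleep(delay)
	}
	if wantOnline {
		return errors.New("host did not come back online")
	}
	return errors.New("host did not go offline")
}

// rebootAndWait (hypothetical helper) triggers the reboot, waits for the
// host to drop off, then waits for it to answer again.
func rebootAndWait(reboot, probe func() error, attempts int, delay time.Duration) error {
	if err := reboot(); err != nil {
		return fmt.Errorf("reboot: %w", err)
	}
	if err := waitForState(probe, false, attempts, delay); err != nil {
		return err
	}
	return waitForState(probe, true, attempts, delay)
}

func main() {
	// Simulated host: unreachable on the 2nd and 3rd probe, reachable otherwise.
	calls := 0
	probe := func() error {
		calls++
		if calls == 2 || calls == 3 {
			return errors.New("connection refused")
		}
		return nil
	}
	reboot := func() error { return nil } // stands in for the real reboot command

	err := rebootAndWait(reboot, probe, 10, time.Millisecond)
	fmt.Println("error:", err, "probes:", calls)
}
```

Waiting for the host to go offline first matters: without it, the probe can succeed against the machine that has not started shutting down yet, and the caller would carry on before the reboot has actually happened.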
So just to update, I still haven't managed to find the time to do it.
OK, so to update: kubectl is in a separate rpm repo, and the reboot and reconnect (similar to what @kke mentioned) would require some type trickery for which I don't have the time. I'll send a PR only for the curl variant.
BTW: you would probably be better off building a custom image of CoreOS when installing.
Kubectl is already embedded in k0s nowadays, so there's no need to install it anymore. It is used by k0sctl like k0s kubectl get nodes.
Only the smoke-tests use a real kubectl but that is only used on the host running the tests.
OK, I don't have much time to fiddle with it more. The PR should just add the definition for CoreOS as a supported system.