RISC-V support
**Is your feature request related to a problem? Please describe.**
The RISC-V ISA is gaining traction as real hardware starts to appear. This includes smaller SBCs without many resources (e.g. 1 CPU core and 1 GB of RAM), so k0s would be a great fit for trying out Kubernetes on this hardware.
**Describe the solution you would like**
k0s available not only for amd64 and arm64, but for riscv64 as well.
**Describe alternatives you've considered**
Currently, I don't see any Kubernetes distros supporting RISC-V. Podman and containerd are available on both Ubuntu and Debian and work just fine: I was able to pull containers from Docker Hub and run them.
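For example, a quick smoke test (assuming one of the images from the riscv64 Docker Hub namespace linked below):

```sh
# Pull and run a riscv64 container image on a RISC-V SBC;
# should print "riscv64" if the runtime works end to end.
podman run --rm riscv64/alpine uname -m
```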
**Additional context**
Software supporting this ISA:
- Debian: http://ftp.ports.debian.org/debian-ports/pool-riscv64/main/
- Ubuntu: http://ports.ubuntu.com/ubuntu-ports/pool/
- Alpine: https://dl-cdn.alpinelinux.org/alpine/edge/main/riscv64/
- Images on Docker Hub: https://hub.docker.com/u/riscv64
@jekader thank you for this issue. I've added it to our backlog candidates.
Additional interesting resources:
- https://github.com/carlosedp/riscv-bringup/
- https://carlosedp.medium.com/docker-containers-on-risc-v-architecture-5bc45725624b
- https://riscv.org/wp-content/uploads/2019/12/12.10-14.50c-The-RISC-V-Journey-Through-Containers-to-the-Cloud.pdf
Do the other upstream components support riscv64, like etcd, kine, kube-router, and Calico?
Does GitHub Actions support riscv64, or do we need to cross-compile? If we need to cross-compile, how do we run unit and integration tests?
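For context, plain Go cross-compilation to linux/riscv64 has been possible since Go 1.14, so producing the binaries themselves isn't the hard part. A sketch of just the Go side (the real k0s build also embeds component binaries, so this alone isn't a complete build):

```sh
# Cross-compile a pure-Go program for linux/riscv64 from any build host.
# cgo is disabled; enabling it would additionally require a riscv64 C cross-toolchain.
GOOS=linux GOARCH=riscv64 CGO_ENABLED=0 go build -o app-riscv64 .
```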
Is there precedent for having k0s support an arch for a worker, but not a controller?
Project | Supported? | Relevant links
---|---|---
etcd | not supported | https://github.com/etcd-io/etcd/issues/14522, https://github.com/etcd-io/etcd/pull/14517
kine | ? | no issue or PR
kube-router | merged, not yet released | https://github.com/cloudnativelabs/kube-router/pull/1525
calico | not supported | no issue or PR
containerd | supported | since 1.6.7 (2022-08-04)
runc | supported | since 1.1.8 (2023-07-19)
There are probably others that you didn't mention. I'll try to look into them as I figure out which ones.
> Is there precedent for having k0s support an arch for a worker, but not a controller?
This is the case for Windows. For Windows, k0s only ships kubelet and kube-proxy, relying on an external CRI and some manual shenanigans for the Calico setup, although there's currently some ongoing work to bundle containerd for Windows as well and to streamline the Calico support.
What's the status of containerd for RISC-V? I think that'd be very important for a RISC-V based k0s worker.
Updated the table above. Both containerd and runc have supported riscv64 for multiple releases.
I guess one of the main challenges would be to have a CI end-to-end test case for RISC-V. The current arm64 & armv7 are "easy" as we have dedicated runners for both of those archs. I don't think we can do that for RISC-V and thus we'd need to figure out something else. What that something else could be, I've got no idea :D
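One candidate for that something else might be qemu-user emulation on the existing amd64 runners; a sketch (binfmt registration via the image the Docker buildx docs use; emulation is slow, so this would suit scheduled runs better than per-PR CI):

```sh
# Register riscv64 binfmt handlers on an amd64 host, then run a riscv64
# container under user-mode emulation.
docker run --privileged --rm tonistiigi/binfmt --install riscv64
docker run --rm --platform linux/riscv64 riscv64/alpine uname -m
```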
I was able to build something: https://github.com/twz123/k0s/releases/tag/v1.28.2%2Bk0sriscv64.0
Feel free to give it a shot. I didn't do any deeper verification on this beyond executing the integration test suite. Basic clusters should work. Maybe somebody wants to do some further tests and share the results?
Is there any easy way to test this with a `k0sctl` cluster? I've managed to get this version uploaded, but then I get the error below. I can't easily change the version, because all the other nodes are running the "real" version 1.28.2.

```
uploaded k0s binary version is v1.28.2+k0sriscv64.0 not v1.28.2+k0s.0
```
@iggy There's also an amd64 binary available for download. If it's not a production cluster, and if the other nodes are amd64, you can deploy the RISC-V version on all of them, which should get you past the k0sctl version check. Otherwise, I don't know of any other way around it. In that case, you'd have to join the RISC-V node manually by creating a join token and adding the worker to the cluster.
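A sketch of that manual flow (these are standard k0s commands; the file path is just an example):

```sh
# On an existing controller: create a join token for a worker.
sudo k0s token create --role=worker > worker.token

# On the RISC-V node: install and start the worker using the custom binary.
sudo ./k0s-v1.28.2+k0sriscv64.0-riscv64 install worker --token-file worker.token
sudo ./k0s-v1.28.2+k0sriscv64.0-riscv64 start
```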
On amd64, the RISC-V enabled version should work just as well as the vanilla 1.28.2+k0s.0 release, so I don't see any blockers to using it for the whole cluster. Please also note that if you're not running the RISC-V enabled version on the controller nodes, you need to add the custom OCI images to your k0s config. Otherwise all the necessary pods won't be able to run on RISC-V. You can run `./k0s-v1.28.2+k0sriscv64.0-riscv64 config create --include-images` and grab the images snippet from there:
```yaml
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  images:
    coredns:
      image: quay.io/twz123/coredns
      version: 1.11.1-1@sha256:ef304af35da98ff9f1af445b103b3fd73221ffaddb802be152cd0c488ec19699
    konnectivity:
      image: quay.io/twz123/apiserver-network-proxy-agent
      version: v0.1.4-1@sha256:66a0ce4a1b7f98ea74510d30c1e96d80846c9a233b9c6eb30143d32209e127a3
    kubeproxy:
      image: quay.io/twz123/kube-proxy
      version: v1.28.2-1@sha256:77dbd9bb0b9ee748b4d39f0e998076cd1269ae097b482fd58f54dea56906efe1
    kuberouter:
      cni:
        image: quay.io/twz123/kube-router
        version: v1.6.0-iptables1.8.9-1@sha256:7ddbda29726da778945274ede6ff530351c6075695779777486d6ecc5ce8ea58
      cniInstaller:
        image: quay.io/twz123/cni-node
        version: 1.3.0-k0s.1@sha256:c08c83a7388bd3d92637846603d7065871b2c7e59f4a0de1e701c1045a1215ea
    metricsserver:
      image: quay.io/twz123/metrics-server
      version: v0.6.4-1@sha256:ee0d5a55b6724d4a955aaa5357d655092f4e4d1458f92e1b5d79d9fd127073d0
    pause:
      image: quay.io/twz123/pause
      version: 3.9-1@sha256:266cc1ad730c2a1adc10a60e7b6216ad13095ec5b0329336a557141306ec4625
```
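If you're managing the cluster with k0sctl, that snippet goes under `spec.k0s.config`; a rough sketch (host entries elided):

```yaml
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-cluster
spec:
  hosts:
    # ... controller and worker host entries ...
  k0s:
    version: v1.28.2+k0s.0
    config:
      apiVersion: k0s.k0sproject.io/v1beta1
      kind: ClusterConfig
      spec:
        images:
          # ... the images section from above ...
```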
Got it. Sounds like I should just do a new cluster. My current cluster is a mix of x86_64 and aarch64 nodes. I have 2 RISC-V boards, so I could do a small cluster with just them.
I'll have a look at whether I can compile the arm32/64 binaries as well. Then you could reuse the existing cluster.
@iggy I've added the arm64 build. Now you can try a mixed-arch cluster.
As a note for anyone who runs into this: when using the `uploadBinary` function, you have to have bash installed on the target nodes. The error message you get back from `k0sctl` is not super useful.
```
upload k0s binary: invalid path: open remote file /tmp/tmp.91elGRnKL6 for writing: command failed: failed to execute helper: command failed: client exec: command failed: write stdin: EOF
upload k0s binary: invalid path: open remote file /tmp/tmp.zUV1nqbGj7 for writing: command failed: failed to execute helper: command failed: client exec: ssh session wait: Process exited with status 127
```
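On Alpine-based nodes like mine, the fix boils down to (exit status 127 being the shell's "command not found"):

```sh
# bash is not part of Alpine's base install, but k0sctl's upload helper needs it.
apk add bash
```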
The `pause` container doesn't support riscv64. I'm trying to track down what would be involved in adding that.

Sorry, just noticed you have a `pause` image at the bottom of your list above. It looks like something is trying to pull the upstream `pause` image. These errors repeat continually, and k0sctl eventually gives up on adding the node.
time="2023-10-12 00:04:24" level=info msg="time=\"2023-10-12T00:04:24.779571495Z\" level=info msg=\"RunPodSandbox for &PodSandboxMetadata{Name:kube-proxy-b8jcq,Uid:84b204d7-10c3-4a07-a03a-5df6595b3aa9,Namespace:kube-system,Attempt:0,}\"" component=containerd stream=stderr
time="2023-10-12 00:04:25" level=info msg="time=\"2023-10-12T00:04:25.403288008Z\" level=info msg=\"stop pulling image registry.k8s.io/pause:3.8: active requests=0, bytes read=2945\"" component=containerd stream=stderr
time="2023-10-12 00:04:25" level=info msg="time=\"2023-10-12T00:04:25.403829019Z\" level=error msg=\"RunPodSandbox for &PodSandboxMetadata{Name:kube-proxy-b8jcq,Uid:84b204d7-10c3-4a07-a03a-5df6595b3aa9,Namespace:kube-system,Attempt:0,} failed, error\" error=\"rpc error: code = NotFound desc = failed to get sandbox image \\\"registry.k8s.io/pause:3.8\\\": failed to pull image \
\\"registry.k8s.io/pause:3.8\\\": failed to pull and unpack image \\\"registry.k8s.io/pause:3.8\\\": no match for platform in manifest: not found\"" component=containerd stream=stderr
time="2023-10-12 00:04:25" level=info msg="E1012 00:04:25.407988 2529 remote_runtime.go:193] \"RunPodSandbox from runtime service failed\" err=\"rpc error: code = NotFound desc = failed to get sandbox image \\\"registry.k8s.io/pause:3.8\\\": failed to pull image \\\"registry.k8s.io/pause:3.8\\\": failed to pull and unpack image \\\"registry.k8s.io/pause:3.8\\\": no match fo
r platform in manifest: not found\"" component=kubelet stream=stderr
time="2023-10-12 00:04:25" level=info msg="E1012 00:04:25.410562 2529 kuberuntime_sandbox.go:72] \"Failed to create sandbox for pod\" err=\"rpc error: code = NotFound desc = failed to get sandbox image \\\"registry.k8s.io/pause:3.8\\\": failed to pull image \\\"registry.k8s.io/pause:3.8\\\": failed to pull and unpack image \\\"registry.k8s.io/pause:3.8\\\": no match for pla
tform in manifest: not found\" pod=\"kube-system/kube-proxy-b8jcq\"" component=kubelet stream=stderr
time="2023-10-12 00:04:25" level=info msg="E1012 00:04:25.412050 2529 kuberuntime_manager.go:1166] \"CreatePodSandbox for pod failed\" err=\"rpc error: code = NotFound desc = failed to get sandbox image \\\"registry.k8s.io/pause:3.8\\\": failed to pull image \\\"registry.k8s.io/pause:3.8\\\": failed to pull and unpack image \\\"registry.k8s.io/pause:3.8\\\": no match for pl
atform in manifest: not found\" pod=\"kube-system/kube-proxy-b8jcq\"" component=kubelet stream=stderr
time="2023-10-12 00:04:25" level=info msg="E1012 00:04:25.414264 2529 pod_workers.go:1300] \"Error syncing pod, skipping\" err=\"failed to \\\"CreatePodSandbox\\\" for \\\"kube-proxy-b8jcq_kube-system(84b204d7-10c3-4a07-a03a-5df6595b3aa9)\\\" with CreatePodSandboxError: \\\"Failed to create sandbox for pod \\\\\\\"kube-proxy-b8jcq_kube-system(84b204d7-10c3-4a07-a03a-5df6595
b3aa9)\\\\\\\": rpc error: code = NotFound desc = failed to get sandbox image \\\\\\\"registry.k8s.io/pause:3.8\\\\\\\": failed to pull image \\\\\\\"registry.k8s.io/pause:3.8\\\\\\\": failed to pull and unpack image \\\\\\\"registry.k8s.io/pause:3.8\\\\\\\": no match for platform in manifest: not found\\\"\" pod=\"kube-system/kube-proxy-b8jcq\" podUID=\"84b204d7-10c3-4a07-a03
a-5df6595b3aa9\"" component=kubelet stream=stderr
time="2023-10-12 00:04:26" level=info msg="E1012 00:04:26.772136 2529 pod_workers.go:1300] \"Error syncing pod, skipping\" err=\"network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized\" pod=\"kube-system/konnectivity-agent-txcpj\" podUID=\"7b8657b2-3d21-4724-
8a2e-14c92a8d8880\"" component=kubelet stream=stderr
time="2023-10-12 00:04:27" level=info msg="E1012 00:04:27.774628 2529 kubelet.go:2855] \"Container runtime network not ready\" networkReady=\"NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized\"" component=kubelet stream=stderr
time="2023-10-12 00:04:28" level=info msg="E1012 00:04:28.772278 2529 pod_workers.go:1300] \"Error syncing pod, skipping\" err=\"network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized\" pod=\"kube-system/konnectivity-agent-txcpj\" podUID=\"7b8657b2-3d21-4724-
8a2e-14c92a8d8880\"" component=kubelet stream=stderr
time="2023-10-12 00:04:29" level=info msg="time=\"2023-10-12T00:04:29.779123051Z\" level=info msg=\"RunPodSandbox for &PodSandboxMetadata{Name:kube-router-xmbhl,Uid:898311a4-5079-427a-a3dc-dd1db5b2607c,Namespace:kube-system,Attempt:0,}\"" component=containerd stream=stderr
time="2023-10-12 00:04:30" level=info msg="time=\"2023-10-12T00:04:30.479965423Z\" level=info msg=\"stop pulling image registry.k8s.io/pause:3.8: active requests=0, bytes read=2945\"" component=containerd stream=stderr
time="2023-10-12 00:04:30" level=info msg="time=\"2023-10-12T00:04:30.480281679Z\" level=error msg=\"RunPodSandbox for &PodSandboxMetadata{Name:kube-router-xmbhl,Uid:898311a4-5079-427a-a3dc-dd1db5b2607c,Namespace:kube-system,Attempt:0,} failed, error\" error=\"rpc error: code = NotFound desc = failed to get sandbox image \\\"registry.k8s.io/pause:3.8\\\": failed to pull image
\\\"registry.k8s.io/pause:3.8\\\": failed to pull and unpack image \\\"registry.k8s.io/pause:3.8\\\": no match for platform in manifest: not found\"" component=containerd stream=stderr
time="2023-10-12 00:04:30" level=info msg="E1012 00:04:30.485103 2529 remote_runtime.go:193] \"RunPodSandbox from runtime service failed\" err=\"rpc error: code = NotFound desc = failed to get sandbox image \\\"registry.k8s.io/pause:3.8\\\": failed to pull image \\\"registry.k8s.io/pause:3.8\\\": failed to pull and unpack image \\\"registry.k8s.io/pause:3.8\\\": no match fo
r platform in manifest: not found\"" component=kubelet stream=stderr
time="2023-10-12 00:04:30" level=info msg="E1012 00:04:30.485999 2529 kuberuntime_sandbox.go:72] \"Failed to create sandbox for pod\" err=\"rpc error: code = NotFound desc = failed to get sandbox image \\\"registry.k8s.io/pause:3.8\\\": failed to pull image \\\"registry.k8s.io/pause:3.8\\\": failed to pull and unpack image \\\"registry.k8s.io/pause:3.8\\\": no match for pla
tform in manifest: not found\" pod=\"kube-system/kube-router-xmbhl\"" component=kubelet stream=stderr
time="2023-10-12 00:04:30" level=info msg="E1012 00:04:30.486515 2529 kuberuntime_manager.go:1166] \"CreatePodSandbox for pod failed\" err=\"rpc error: code = NotFound desc = failed to get sandbox image \\\"registry.k8s.io/pause:3.8\\\": failed to pull image \\\"registry.k8s.io/pause:3.8\\\": failed to pull and unpack image \\\"registry.k8s.io/pause:3.8\\\": no match for pl
atform in manifest: not found\" pod=\"kube-system/kube-router-xmbhl\"" component=kubelet stream=stderr
time="2023-10-12 00:04:30" level=info msg="E1012 00:04:30.487505 2529 pod_workers.go:1300] \"Error syncing pod, skipping\" err=\"failed to \\\"CreatePodSandbox\\\" for \\\"kube-router-xmbhl_kube-system(898311a4-5079-427a-a3dc-dd1db5b2607c)\\\" with CreatePodSandboxError: \\\"Failed to create sandbox for pod \\\\\\\"kube-router-xmbhl_kube-system(898311a4-5079-427a-a3dc-dd1db
5b2607c)\\\\\\\": rpc error: code = NotFound desc = failed to get sandbox image \\\\\\\"registry.k8s.io/pause:3.8\\\\\\\": failed to pull image \\\\\\\"registry.k8s.io/pause:3.8\\\\\\\": failed to pull and unpack image \\\\\\\"registry.k8s.io/pause:3.8\\\\\\\": no match for platform in manifest: not found\\\"\" pod=\"kube-system/kube-router-xmbhl\" podUID=\"898311a4-5079-427a-
a3dc-dd1db5b2607c\"" component=kubelet stream=stderr
time="2023-10-12 00:04:30" level=info msg="E1012 00:04:30.770413 2529 pod_workers.go:1300] \"Error syncing pod, skipping\" err=\"network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized\" pod=\"kube-system/konnectivity-agent-txcpj\" podUID=\"7b8657b2-3d21-4724-
8a2e-14c92a8d8880\"" component=kubelet stream=stderr
time="2023-10-12 00:04:32" level=info msg="E1012 00:04:32.772414 2529 pod_workers.go:1300] \"Error syncing pod, skipping\" err=\"network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized\" pod=\"kube-system/konnectivity-agent-txcpj\" podUID=\"7b8657b2-3d21-4724-
8a2e-14c92a8d8880\"" component=kubelet stream=stderr
time="2023-10-12 00:04:32" level=info msg="E1012 00:04:32.781193 2529 kubelet.go:2855] \"Container runtime network not ready\" networkReady=\"NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized\"" component=kubelet stream=stderr
time="2023-10-12 00:04:34" level=info msg="E1012 00:04:34.771921 2529 pod_workers.go:1300] \"Error syncing pod, skipping\" err=\"network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized\" pod=\"kube-system/konnectivity-agent-txcpj\" podUID=\"7b8657b2-3d21-4724-
8a2e-14c92a8d8880\"" component=kubelet stream=stderr
More digging leads to this:
```console
# grep pause /run/k0s/containerd-cri.toml
sandbox_image = "registry.k8s.io/pause:3.8"
```
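A cross-check of what the running CRI actually reports (assuming crictl and jq are available on the node; the socket path is where k0s puts its containerd socket):

```sh
# Ask containerd's CRI plugin which sandbox image it is currently using.
sudo crictl --runtime-endpoint unix:///run/k0s/containerd.sock info \
  | jq -r .config.sandboxImage
```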
I tried changing that and `/var/lib/k0s/worker-profile.yaml` to point to your pause image, but they keep resetting back.

I tried to add a block to the k0sctl.yaml to specify the image and version. It continues to use the upstream image for some reason. Not sure what to try next, but I'll poke at it some more tomorrow.
This kinda sounds like the controllers are not using the right images. Are they running the RISC-V version? That version should inject all the RISC-V enabled images by default, unless something is overridden in their config. Otherwise, the k0s config snippet that I pasted in a previous comment can be used to achieve the same when running vanilla k0s controllers.

Note that the image configuration is managed by the k0s controllers, not by the workers, so you definitely need to check the k0s config on the controllers (no matter which arch they're running on), not on the individual workers.

The containerd pause image snippet you're referring to is explicitly managed by k0s. It should get the pause image that's configured in the k0s controller's config. You can also check the currently active pause image in the worker-config ConfigMaps. For the default worker profile, you can issue the following command:
```console
$ sudo k0s kc -n kube-system get cm worker-config-default-1.28 -ojson | jq .data.pauseImage
"{\"image\":\"quay.io/twz123/pause\",\"version\":\"3.9-1@sha256:266cc1ad730c2a1adc10a60e7b6216ad13095ec5b0329336a557141306ec4625\"}"
```
If that's listing the vanilla pause image, then something is off. If the ConfigMap lists the right image, but the containerd config doesn't receive it, then there might be some problem with k0s's containerd config management. Can you confirm that a single node cluster behaves as expected, e.g. by running `sudo k0s controller --single` with the RISC-V build? It doesn't really matter on which architecture. That k0s version should always use the custom images by default, and also inject the containerd config snippet accordingly.

Editing `/var/lib/k0s/worker-profile.yaml` is not going to work. This is just a cache file that's used during worker startup. As soon as the worker is able to connect to the cluster, it will re-sync that file from the worker-config ConfigMap.
BTW, you don't use dynamic configuration, do you? In that case, you should check the k0s configuration that's stored in the cluster instead of the one that's used to start the controllers.
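In that case, something along these lines shows the cluster-managed configuration (assuming dynamic config is enabled; the ClusterConfig object lives in kube-system):

```sh
# Inspect the configuration stored in the cluster instead of the local file.
sudo k0s kc -n kube-system get clusterconfig k0s -oyaml
```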
Spot on. I added the riscv64 node to the config, so k0sctl is trying to add it first rather than "upgrade" the existing nodes. I'll try running k0sctl without the new node in the config first and then add it back in.
```console
# kc get no --show-labels
NAME           STATUS                     ROLES    AGE    VERSION             LABELS
lab001-004     Ready                      <none>   116d   v1.28.2+k0s-dirty   beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=lab001-004,kubernetes.io/os=linux
lab001-005     Ready,SchedulingDisabled   <none>   32d    v1.28.2+k0s-dirty   beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=lab001-005,kubernetes.io/os=linux
lab001-006     Ready                      <none>   112d   v1.28.2+k0s-dirty   beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=lab001-006,kubernetes.io/os=linux
lab001-007     Ready                      <none>   110d   v1.28.2+k0s-dirty   beta.kubernetes.io/arch=arm64,beta.kubernetes.io/os=linux,kubernetes.io/arch=arm64,kubernetes.io/hostname=lab001-007,kubernetes.io/os=linux
lab001-008     Ready                      <none>   17h    v1.28.2+k0s-dirty   beta.kubernetes.io/arch=riscv64,beta.kubernetes.io/os=linux,kubernetes.io/arch=riscv64,kubernetes.io/hostname=lab001-008,kubernetes.io/os=linux
lab001-vm001   Ready                      <none>   110d   v1.28.2+k0s-dirty   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=lab001-vm001,kubernetes.io/os=linux

# kc get no lab001-008 -oyaml
...
status:
  ...
  nodeInfo:
    architecture: riscv64
    bootID: a0190eaa-2e58-413b-b636-2605b165183b
    containerRuntimeVersion: containerd://1.7.6
    kernelVersion: 6.5.7-2-starfive
    kubeProxyVersion: v1.28.2+k0s-dirty
    kubeletVersion: v1.28.2+k0s-dirty
    machineID: bda70c9096751d6a9612736a640f01e9
    operatingSystem: linux
    osImage: Alpine Linux edge
    systemUUID: bda70c9096751d6a9612736a640f01e9

# kc -n riscv64-test get po -owide
NAME                            READY   STATUS    RESTARTS   AGE     IP           NODE         NOMINATED NODE   READINESS GATES
riscv64-test-7f677656cc-q286s   1/1     Running   0          2m46s   10.244.5.8   lab001-008   <none>           <none>

# kc -n riscv64-test exec -it riscv64-test-7f677656cc-q286s -- /bin/sh
/ # arch
riscv64
/ #
```
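For anyone reproducing this, a minimal sketch of a deployment pinned to the riscv64 node (the image is just an example; any riscv64-capable image works):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: riscv64-test
  namespace: riscv64-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: riscv64-test
  template:
    metadata:
      labels:
        app: riscv64-test
    spec:
      # Schedule only onto riscv64 nodes via the standard arch label.
      nodeSelector:
        kubernetes.io/arch: riscv64
      containers:
        - name: shell
          image: riscv64/alpine
          command: ["sleep", "infinity"]
```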
Uuuh, sweeet!
Anything else you need me to test for proper functionality?
I'm guessing the next step is figuring out some way to do E2E testing. I'm going to guess QEMU isn't an option. There's some hardware on the horizon that should make this doable, but currently available boards probably aren't ideal. The easiest boards to get right now are probably VisionFive 2s, which are roughly equivalent to a RPi 3 speed-wise. The Lichee Pi 4A is relatively new, but I haven't gotten Alpine running on mine yet. There's a Lichee Pi 4A cluster board coming soon (the LM4A is a CM3-style baseboard; the Lichee Pi 4A is the LM4A plus a carrier). That's 7x LM4A on one cluster board. There's also a 64-core Milk-V Pioneer shipping soonish as well.
Thank you for your dedication to testing k0s on RISC-V. Much appreciated! Your multi-arch cluster looks awesome. :smile:
k0s has a rather comprehensive integration test suite. I've already managed to get a good portion of those tests passing on RISC-V. I haven't had the chance to thoroughly investigate the failures yet, but the ones I looked at boiled down to requiring some binary (helm, cri-dockerd) or OCI image that's not available for RISC-V yet. I didn't spend any effort on Calico yet, for example. If you're dedicated enough, you could try to see whether any of the "real" integration test failures fail for reasons other than something having to be compiled for RISC-V.
We could take it step by step from here:

- Some (tiny) patches to the k0s codebase are not yet merged. The goal would be to have `make GO='' EMBEDDED_BINS_BUILDMODE=none` working on RISC-V using a clean checkout (see the sketch after this list).
- A minor patch needs to be applied to the Kubernetes build system to add riscv64 as a supported architecture. Not sure this will make it upstream anytime soon.
- Maybe QEMU is an option for testing. That would rather happen on a scheduled basis than on a per-PR basis. On the positive side of things, in contrast to Windows, I don't expect many RISC-V specific problems that aren't observable on other arches. (Okay, Windows is an OS, not an architecture, but you get my point :see_no_evil:)
- Lastly, the biggest issue: the build pipeline for the OCI images provided by k0s needs to be adjusted to produce RISC-V images. That even includes the base OCI image for k0s's build process. I've built the OCI images for this test in an ad-hoc fashion to get things going. That's nothing that I'd like to upstream.
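As referenced in the first item, a sketch of the intended from-source flow on a riscv64 host once those patches are merged (treat this as the goal, not something that works today):

```sh
# Build k0s from a clean checkout on a riscv64 host, using the flags
# from the list above to skip the embedded component binaries.
git clone https://github.com/k0sproject/k0s.git
cd k0s
make GO='' EMBEDDED_BINS_BUILDMODE=none
```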
I don't see the k0s project providing any RISC-V binaries during its release process just yet. For that, Kubernetes should be buildable on RISC-V without the need for patches, and we'd probably need real hardware to test on. But: building from source should be easy.
So maybe the goal of this issue could be:
- Make k0s's build process work out of the box on RISC-V
- Document how k0s can be built on RISC-V
- Provide k0s multi-arch OCI images also for RISC-V
- Potentially add some QEMU based scheduled testing
What do you think?
BTW, I did all my testing on the LicheePi 4A. I'm aware of both the LicheePi Cluster and the Milk-V Pioneer. Having hardware is one thing, but I failed to get the GitHub runner working on the LiPi (.NET is currently in the process of adding RISC-V support as well; I tried to compile something, but this is massive...), so those machines couldn't be integrated into the k0s GitHub workflows without further tricks.
> Spot on. I added the riscv64 node to the config, so k0sctl is trying to add it first rather than "upgrade" the existing nodes. I'll try running k0sctl without the new node in the config first and then add it back in.
I wonder if k0sctl has some logic around that. When doing a cluster upgrade, say from 1.27 to 1.28, and adding a worker node in the same go, would k0sctl first add the 1.28 worker before updating the control plane? That sounds like the wrong order. Ever thought about this, @kke?

The above example doesn't apply to iggy's case, since this wasn't really a cluster upgrade (the minor version remained the same), but could it be a good idea to update the control plane before adding nodes?
@ncopa What would be the appropriate steps to get binutils-gold built for riscv64 on alpine edge?