backup-restore-operator
Migration from k3s local cluster to rke2 breaks with config restore
Rancher Server Setup
- Rancher version: 2.6.9
- Installation option (Docker install/Helm Chart): rancher-latest/rancher --version 2.6.9
- Kubernetes Version and Engine: v1.23.13+rke2r1
Describe the bug TL;DR - migrating RMS from a k3s local cluster to rke2 as the local cluster causes the operator to "register" the local rke2 cluster as k3s. This triggers the automated upgrade on the local cluster, which tries to "upgrade" rke2 to k3s in order to bring the local cluster back to the engine and version recorded in the backup. This breaks RMS.
Long story - We wanted to migrate our Rancher RMS from a k3s single-node cluster to an rke2 single-node cluster. We backed up the RMS configs from the k3s cluster (using the backup-restore-operator) and then applied the configs to the new rke2 cluster for migration. Once the configs were applied, Rancher suddenly decided the rke2 local cluster was a k3s cluster and tried to apply the "update strategy" to upgrade the "local" cluster from its current version (v1.23.13+rke2r1) to the version the local cluster was at when it was backed up (v1.24.6+k3s1). Obviously, the automated upgrade from rke2 to k3s fails, but it keeps the local cluster in "unschedulable" status until the failing upgrade completes, which never happens.
To Reproduce Steps to reproduce the behavior:
- Set up RMS on a local cluster running k3s
- Migrate RMS to a local cluster running rke2
- The local cluster running rke2 tries and fails to "upgrade" to k3s, keeping the local cluster in a cordoned state
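For reference, step 2 above was done by applying the backup-restore-operator's Restore resource on the new rke2 cluster. A minimal sketch of such a Restore (the name and backup filename here are placeholders, not the actual values from our setup):

```yaml
# Hypothetical Restore applied on the new rke2 local cluster;
# backupFilename is a placeholder for the archive produced on the k3s cluster.
apiVersion: resources.cattle.io/v1
kind: Restore
metadata:
  name: restore-migration
spec:
  backupFilename: rancher-backup-from-k3s.tar.gz
  prune: false
```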
Expected behavior Migration should be allowed to move to a different local cluster; you should not be limited to the same Kubernetes engine as before.
Additional context Local cluster definition after applying the configs (cluster running rke2, but now "registered" as a k3s cluster): fleet-local.yaml.txt
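The mis-registration is visible in the restored cluster object itself. A hypothetical excerpt of what a fleet-local cluster definition like the attached one could contain after the restore (values illustrative, not copied from the attachment):

```yaml
# Hypothetical excerpt of the restored local cluster object.
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: local
  namespace: fleet-local
spec:
  # Carried over from the k3s backup, even though the node
  # is actually running v1.23.13+rke2r1.
  kubernetesVersion: v1.24.6+k3s1
```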
Moving this to the RKE2/K3S team as this does not appear to be a bug in the backup/restore utility.
How did we determine that? This certainly appears to be a Rancher issue rather than a distro issue; I don't know that there's anything we can do from the distro side.
This is not a distro team issue FWIW, though I'm not entirely sure where it needs to go TBH; taking myself off as assignee.
Just realized this issue is the same as the one on rancher/rancher that I've recently begun working on. So I'm assigning myself here to match that one: https://github.com/rancher/rancher/issues/42158