[Feature] Create/Restore Cluster Snapshots
Scope of your request
Be able to create snapshots for complex clusters and restore them at will
I think this is very useful for clusters with stateful sets that take a long time to create. In my case, my local Kafka + Zookeeper cluster takes around 10 minutes to be fully configured and populated, but I only need to do that once every couple of months.
Describe the solution you'd like
This project is extremely helpful. I opted to use it instead of plain k3s because I saw the possibility of using docker commit as a snapshot tool, so I could iterate fast.
If I break something I don't care too much about, I could just restart from the snapshot I committed and quickly start adding my bugs to my code base again.
If it were a k3d-native command it would be perfect, but docker is fine for now.
Describe alternatives you've considered
I tried and succeeded in creating the snapshot from a working k3d cluster with
docker commit -m "snapshot" "$(docker ps --filter name=k3d-k3s-local-server -q)" rancher/k3s:v0.10.0-snapshot
After that I run
k3d delete -a
and
docker run 53cb9ed4ec58
but I fail to restore my cluster to the initial state.
I can create a PR for this later, but I need some guidance on what has to be done for this kind of approach to succeed.
To begin with, this docker commit and docker run approach would already be very useful if it worked.
The current error I see when starting a single server cluster with no agents is
Failed to get the info of the filesystem with mountpoint "/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs": unable to find data in memory cache.
So I am missing some mount point; I am just not sure what I need to manually recreate, related to https://github.com/rancher/k3s/issues/495. I guess k3d delete is removing this mount.
Hi there, thanks for opening this issue. This is surely an interesting feature to have :+1: To be honest, I'm not sure how to proceed to get this working. The mountpoint you're missing there is a subdirectory of one of the volumes created within the k3s Dockerfile (see https://github.com/rancher/k3s/blob/master/package/Dockerfile).
I'd be happy to review any pull request from your side and will have another look into this issue once I have some more time :+1:
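For reference, the volume paths declared in the k3s image can be listed directly. A minimal sketch (the tag is a placeholder for whichever k3s image the cluster uses); anything stored under these paths lives in Docker volumes, which docker commit does not capture:

# list the paths declared as VOLUME in the k3s image; data under these paths
# is stored in volumes and is not included in a committed image
docker image inspect rancher/k3s:v0.10.0 --format '{{json .Config.Volumes}}'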
Ok I will do my best, let's see what happens.
Is there any progress on this, or something similar? I would also be interested in the functionality. If not, I would be interested in giving it a try as well, though I could not get to it in the next 2-3 weeks.
Please do. I have too much on my plate right now (2 jobs since Dec/2020), so unfortunately I couldn't do anything.
I just had a few more thoughts on this and now here are some things to note:
- for simple single-server clusters (at least without agents), it's enough to do
  docker volume create k3d-test
  k3d cluster create k3d-test -v k3d-test:/var/lib/rancher/k3s
  # ... do something with the cluster ...
  k3d cluster delete k3d-test
  k3d cluster create k3d-test -v k3d-test:/var/lib/rancher/k3s
  to have the same state as before. This also works with docker cp'ing the contents of that directory and then copying it into place or bind-mounting the directory (see the sketch after this list).
  - Problem: if you change the cluster name when running the new cluster, it will show the containers as running, but they're assigned to the original node name, and the original node will also show up in kubectl get nodes, making the pods inaccessible, e.g. via kubectl exec. All pods then have to be re-created (e.g. kubectl delete pods -A --all).
- in a multi-server cluster, one has to have exactly the same IP range for the new nodes as one had for the old nodes, as etcd internally uses the node IPs as identifiers; otherwise the new cluster created with the backed-up files will break.
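A rough, untested sketch of the docker cp variant mentioned above; cluster and node names are placeholders, with server nodes following k3d's k3d-&lt;cluster&gt;-server-0 naming:

# stop the cluster first so the copied state is consistent
k3d cluster stop mycluster
# copy the k3s data directory out of the server node
docker cp k3d-mycluster-server-0:/var/lib/rancher/k3s ./k3s-state
k3d cluster delete mycluster
# re-create the cluster with the SAME name, bind-mounting the saved state back in
k3d cluster create mycluster -v "$(pwd)/k3s-state:/var/lib/rancher/k3s"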
I will give it a try! That would be enough for me, since k3s is our local env and only has one node.
@cfontes, did you have any success so far? I moved this to the backlog now instead of just moving it from milestone to milestone... :thinking:
@iwilltry42 I tried your single-server proposal, but when creating the cluster again, k3d complains as follows:
WARN[0002] warning: encountered fatal log from node k3d-kassio-server-0 (retrying 0/10): time="2023-10-29T18:40:47Z" level=fatal msg="starting kubernetes: preparing server: bootstrap data already found and encrypted with different token"
at least in version:
k3d version v5.6.0
k3s version v1.27.4-k3s1 (default)
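A hedged, untested idea for that error: the bootstrap data in the reused volume was encrypted with the original cluster's token, and k3s keeps that token inside the same volume, so re-creating the cluster with the original token (via k3d's --token flag) might let it decrypt the existing bootstrap data. Volume and cluster names below are placeholders, and the token path inside the volume is an assumption:

# read the original token out of the reused volume
ORIG_TOKEN="$(docker run --rm -v k3d-test:/data alpine cat /data/server/token)"
# re-create the cluster with the same name, the same volume and the original token
k3d cluster create k3d-test -v k3d-test:/var/lib/rancher/k3s --token "$ORIG_TOKEN"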
On the other hand, I am trying to simply take a snapshot of the server container and use it as the image for creating the new cluster (the --image option). However, it seems to ignore what's inside and boots an empty k3s cluster.
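If that snapshot is a plain docker commit of the server container, this would match the volume issue discussed above: /var/lib/rancher/k3s is a declared volume, and docker commit does not include volume contents, so the committed image carries no cluster state. A hypothetical check, with a placeholder tag and assuming the busybox ls shipped in the k3s image is available:

# the k3s data path inside the committed image should turn out (nearly) empty,
# because docker commit skips the contents of volumes
docker run --rm --entrypoint ls my-k3s-snapshot:latest -la /var/lib/rancher/k3s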