
[BUG] `k3d cluster create` stuck at the second cluster creation

Open Mossaka opened this issue 1 year ago • 1 comments

What did you do

  • How was the cluster created?

    • I first created a cluster with 2 agents: `k3d cluster create mycluster --agents 2`. It ran successfully.
    • Then I created a second cluster with the same options but a different name: `k3d cluster create mycluster2 --agents 2`. This command got stuck.
  • What did you do afterwards?

    • I exited the stuck command and then ran `k3d cluster ls`. It looks like the second cluster was created, but its nodes were not ready:
k3d cluster ls                                    
NAME         SERVERS   AGENTS   LOADBALANCER
mycluster    1/1       2/2      true
mycluster2   1/1       2/2      true

I ran `docker logs k3d-mycluster2-server-0`, which showed:

Error from server (NotFound): nodes "k3d-mycluster2-server-0" not found
time="2024-01-27T00:49:20Z" level=info msg="Waiting for containerd startup: rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
time="2024-01-27T00:49:20Z" level=info msg="Waiting for control-plane node k3d-mycluster2-server-0 startup: nodes \"k3d-mycluster2-server-0\" not found"
time="2024-01-27T00:49:21Z" level=info msg="Waiting for containerd startup: rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
time="2024-01-27T00:49:21Z" level=info msg="Waiting for control-plane node k3d-mycluster2-server-0 startup: nodes \"k3d-mycluster2-server-0\" not found"
time="2024-01-27T00:49:22Z" level=info msg="Waiting for containerd startup: rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
time="2024-01-27T00:49:22Z" level=info msg="Waiting for control-plane node k3d-mycluster2-server-0 startup: nodes \"k3d-mycluster2-server-0\" not found"
Error from server (NotFound): nodes "k3d-mycluster2-server-0" not found
time="2024-01-27T00:49:23Z" level=info msg="Waiting for containerd startup: rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"

And this is the log from a mycluster2 agent:

time="2024-01-27T00:49:51Z" level=info msg="Waiting for containerd startup: rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
time="2024-01-27T00:49:52Z" level=info msg="Waiting for containerd startup: rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
time="2024-01-27T00:49:53Z" level=info msg="Waiting for containerd startup: rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
E0127 00:49:53.899114    1361 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E0127 00:49:53.899418    1361 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E0127 00:49:53.900883    1361 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E0127 00:49:53.902268    1361 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
The connection to the server localhost:8080 was refused - did you specify the right host or port?
time="2024-01-27T00:49:54Z" level=info msg="Waiting for containerd startup: rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
time="2024-01-27T00:49:55Z" level=info msg="Waiting for containerd startup: rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
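
Two messages in the logs above repeat every second and can hide anything else the container prints. A minimal sketch for filtering them out is below; `filter_startup_noise` is a hypothetical helper name, and the two patterns are copied from the log lines in this report, not from any k3d or k3s documentation.

```shell
# Hypothetical helper: drop the two repeating "Waiting for ..." messages
# seen in this report so that any other log line becomes visible.
filter_startup_noise() {
  grep -v -e 'Waiting for containerd startup' \
          -e 'Waiting for control-plane node'
}

# Usage (docker logs forwards the container's stderr to stderr, hence 2>&1):
#   docker logs k3d-mycluster2-server-0 2>&1 | filter_startup_noise
```

Whatever remains after filtering (here, the `Error from server (NotFound)` and `connection refused` lines, plus anything not captured in the excerpts) is a better starting point for diagnosis than the raw stream.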

What did you expect to happen

I expected both clusters to be created successfully.

Screenshots or terminal output

image

Which OS & Architecture

arch: x86_64
cgroupdriver: systemd
cgroupversion: "2"
endpoint: /var/run/docker.sock
filesystem: UNKNOWN
infoname: devbox
name: docker
os: Ubuntu 22.04.3 LTS
ostype: linux
version: 24.0.7

Which version of k3d

k3d version v5.6.0
k3s version v1.27.4-k3s1 (default)

Which version of docker

Client: Docker Engine - Community
 Version:           24.0.7
 API version:       1.43
 Go version:        go1.20.10
 Git commit:        afdd53b
 Built:             Thu Oct 26 09:07:41 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.7
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.10
  Git commit:       311b9ff
  Built:            Thu Oct 26 09:07:41 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.26
  GitCommit:        3dd1e886e55dd695541fdcd67420c2888645a495
 runc:
  Version:          1.1.10
  GitCommit:        v1.1.10-0-g18a0cb0
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Mossaka avatar Jan 27 '24 00:01 Mossaka

For anyone running into this error: for us it was a file limit error that was completely buried by the log spam shown above. Raising the ulimit fixed the problem.
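
As a rough sketch of how one might inspect the relevant limits on the host before creating a second cluster: the comment does not say which limit was raised or to what value, so the function name `show_file_limits` and the choice to also print the Linux inotify limits are assumptions made for illustration.

```shell
# Hypothetical helper: print file-related limits for the current shell.
# Which of these (if any) was the one raised in the comment is not stated.
show_file_limits() {
  echo "open files (soft): $(ulimit -S -n)"
  echo "open files (hard): $(ulimit -H -n)"
  # Assumption: inotify limits may also be relevant on Linux hosts.
  # The files are skipped when they do not exist or are unreadable.
  for f in /proc/sys/fs/inotify/max_user_instances \
           /proc/sys/fs/inotify/max_user_watches; do
    if [ -r "$f" ]; then
      echo "$f: $(cat "$f")"
    fi
  done
  return 0
}

show_file_limits
```

Note that these values describe the shell that runs the function. The Docker daemon and the containers it starts may run under different limits, so treat the output only as a first check.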

muttonhead avatar Jun 28 '24 20:06 muttonhead