[Cilium] Executing nerdctl run in a k8s environment is stuck
Description
Executing nerdctl run in the k8s environment gets stuck, but k8s can create pods normally.
Steps to reproduce the issue
1. [root@m1 ~]# nerdctl ps
CONTAINER ID    IMAGE                                                                                                      COMMAND                  CREATED         STATUS   PORTS   NAMES
f0571a9094ce    quay.io/cilium/hubble-ui-backend@sha256:0e0eed917653441fded4e7cdb096b7be6a3bddded5a2dd10812a27b1fc6ed95b   "/usr/bin/backend"       6 minutes ago   Up              k8s://kube-cilium/hubble-ui-77555d5dcf-pj77v/backend
046ba04231f7    docker.io/wangyanglinux/myapp:v1                                                                           "nginx -g daemon off;"   6 minutes ago   Up              k8s://default/test-z2gms/test
5c6c52541c37    docker.io/wzxmtlw/metrics-server:v0.6.3                                                                    "/metrics-server --c…"   6 minutes ago   Up              k8s://kube-system/metrics-server-5c7b6df7d8-md58r/metrics-server
fcb24a33d77a    quay.io/cilium/hubble-relay@sha256:d352d3860707e8d734a0b185ff69e30b3ffd630a7ec06ba6a4402bed64b4456c        "hubble-relay serve"     7 minutes ago   Up              k8s://kube-cilium/hubble-relay-7bc7544857-95dqm/hubble-relay
....
2. [root@m1 ~]# nerdctl run --name test --rm -it busybox:1.28 /bin/sh
Executing the above command gets stuck.
3. Outside the k8s environment, nerdctl run can be executed normally.
Describe the results you received and expected
null
What version of nerdctl are you using?
[root@m1 ~]# nerdctl version
Client:
 Version:    v2.0.2
 OS/Arch:    linux/amd64
 Git commit: 1220ce7ec2701d485a9b1beeea63dae3da134fb5
 buildctl:
  Version:   v0.17.1
  GitCommit: 8b1b83ef4947c03062cdcdb40c69989d8fe3fd04

Server:
 containerd:
  Version:   v2.0.1
  GitCommit: 88aa2f531d6c2922003cc7929e51daf1c14caa0a
 runc:
  Version:   1.2.2
  GitCommit: v1.2.2-0-g7cb36325
Are you using a variant of nerdctl? (e.g., Rancher Desktop)
None
Host information
[root@m1 ~]# nerdctl info
Client:
 Namespace:  k8s.io
 Debug Mode: false

Server:
 Server Version: v2.0.1
 Storage Driver: overlayfs
 Logging Driver: json-file
 Cgroup Driver:  systemd
 Cgroup Version: 2
 Plugins:
  Log:     fluentd journald json-file none syslog
  Storage: native overlayfs
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version:   5.14.0-427.13.1.el9_4.x86_64
 Operating System: Rocky Linux 9.4 (Blue Onyx)
 OSType:           linux
 Architecture:     x86_64
 CPUs:             4
 Total Memory:     3.793GiB
 Name:             m1
 ID:               b26f2865-ca8a-49fa-a3a2-ec66adae9813
[root@m1 ~]# kubectl version
Client Version: v1.31.4
Kustomize Version: v5.4.2
Server Version: v1.31.4
@wzxmt I am not sure how to reproduce your problem.
Against a kind cluster, things are working just fine / as expected.
I need more details about your specific deployment.
- How can I reproduce it from scratch?
- How did you create your kube cluster exactly?
- What else is involved here?
- What are your containerd details?
- Re-run the failing/stuck nerdctl command with --debug-full
My K8s cluster is deployed from binaries. I tried again: running "nerdctl run --name test --rm -it busybox:1.28 /bin/sh" works without hanging in Flannel mode, but it hangs in Cilium mode. Here is how Cilium is deployed:
linux-amd64/helm template cilium cilium/cilium --version 1.15.11 \
  --namespace kube-cilium \
  --set operator.replicas=1 \
  --set k8sServiceHost=apiserver.cluster.local \
  --set k8sServicePort=8443 \
  --set ipv4NativeRoutingCIDR=172.16.0.0/16 \
  --set ipam.operator.clusterPoolIPv4PodCIDRList=172.16.0.0/16 \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set hubble.ui.service.type=NodePort \
  --set hubble.ui.service.nodePort=31235 \
  --set routing-mode=native \
  --set kubeProxyReplacement=strict \
  --set bpf.masquerade=true \
  --set bandwidthManager.enabled=true >>${HOST_PATH}/roles/components/templates/cilium.yaml
[root@m1 ~]# containerd -v
containerd github.com/containerd/containerd/v2 v2.0.1 88aa2f531d6c2922003cc7929e51daf1c14caa0a
[root@m1 ~]# nerdctl info
Client:
 Namespace:  k8s.io
 Debug Mode: false

Server:
 Server Version: v2.0.1
 Storage Driver: overlayfs
 Logging Driver: json-file
 Cgroup Driver:  systemd
 Cgroup Version: 2
 Plugins:
  Log:     fluentd journald json-file none syslog
  Storage: native overlayfs
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version:   5.14.0-427.13.1.el9_4.x86_64
 Operating System: Rocky Linux 9.4 (Blue Onyx)
 OSType:           linux
 Architecture:     x86_64
 CPUs:             6
 Total Memory:     5.755GiB
 Name:             m1
 ID:               97bb3274-41ea-4a43-a74b-7dc0b86e3fa9
[root@m1 ~]# nerdctl run --name test --rm -it --debug-full busybox:1.28 /bin/sh
DEBU[0000] verifying process skipped
DEBU[0000] generated log driver: binary:///apps/containerd/bin/nerdctl?_NERDCTL_INTERNAL_LOGGING=%2Fvar%2Flib%2Fnerdctl%2F1935db59
Thanks @wzxmt
What happens with nerdctl network ls, or when starting your container with different networking options? (eg: --net host)
@AkihiroSuda is anyone around familiar with Kube + eBPF/Cilium who could help debug this?
nerdctl network ls
I later tried Calico and it worked fine. Running "nerdctl network ls" still hangs in Cilium mode, but with the other CNIs it runs normally.
Flannel:
[root@m2 ~]# nerdctl network ls
NETWORK ID      NAME               FILE
                cbr0               /etc/cni/net.d/10-flannel.conflist
17f29b073143    bridge             /etc/cni/net.d/nerdctl-bridge.conflist
                host
                none
Calico:
[root@m3 ~]# nerdctl network ls
NETWORK ID      NAME               FILE
                k8s-pod-network    /etc/cni/net.d/10-calico.conflist
17f29b073143    bridge             /etc/cni/net.d/nerdctl-bridge.conflist
                host
                none
Cilium (hangs, no output):
[root@m1 ~]# nerdctl network ls
Interesting.
Staying stuck is rather unusual. What I am thinking of is lock contention on the same directory. I have been browsing the Cilium source code, and they do indeed use filesystem locking, possibly on the same directory as us.
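To make that concrete, here is a minimal Go sketch of the kind of contention I have in mind (illustrative only, not nerdctl's actual lockutil code; the path and helper name are just for the example). If another process holds an exclusive flock on the CNI conf directory and never releases it, the second caller blocks forever, which from the outside looks exactly like a stuck command:
package main

// Illustrative sketch only (not nerdctl's real lockutil): both sides take an
// exclusive flock(2) on the CNI conf directory. If another process already
// holds that lock and keeps it, the second caller blocks here indefinitely.

import (
    "fmt"
    "os"
    "syscall"
)

func withDirLock(dir string, fn func() error) error {
    f, err := os.Open(dir)
    if err != nil {
        return err
    }
    defer f.Close()
    // LOCK_EX without LOCK_NB: blocks until the current holder releases the lock.
    if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
        return err
    }
    defer syscall.Flock(int(f.Fd()), syscall.LOCK_UN)
    return fn()
}

func main() {
    err := withDirLock("/etc/cni/net.d", func() error {
        fmt.Println("lock acquired, listing CNI networks...")
        return nil
    })
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
}
If Cilium (or anything else) grabs and holds a lock like that, every nerdctl command that needs the network configuration would queue behind it.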
@wzxmt if you feel like it, the most helpful thing you could do is:
# clone nerdctl source code
git clone git@github.com:containerd/nerdctl.git
cd nerdctl
# Edit https://github.com/containerd/nerdctl/blob/main/pkg/netutil/netutil.go#L224
# Line 224, find this:
# err = lockutil.WithDirLock(e.NetconfPath, fn)
# Replace it with:
# fn()
# Compile a new nerdctl binary
make binaries
# The updated binary is under `_output`
# Now, try again
_output/nerdctl network ls
If it still does not help, you could pepper fmt.Println("debug message something") in this function (and the caller) to figure out where it is getting stuck.
I wish I could test Cilium but I am short on time right now.
After editing https://github.com/containerd/nerdctl/blob/main/pkg/netutil/netutil.go#L224 and running make binaries:
With the rebuilt binary, nerdctl network ls can be executed, and nerdctl run --name test --rm -it --debug-full busybox:1.28 /bin/sh can be executed, but there is still a problem.
Thanks a lot @wzxmt
I think this confirms what the issue is: cilium is very likely trying to lock the same directory as nerdctl (likely the cni configuration directory).
The problem here will not be trivial to solve.
We need to flock when accessing the cni conf - this is the only way to prevent racy/concurrent modifications.
What we could do is move the lock to a different location though (purely nerdctl).
cc @AkihiroSuda
I got another issue:
containerd version: 1.7.24, nerdctl version: 1.7.7 (v2.0.2 was also tried before; same result)
I have 2 CNI configurations:
- 10-bridge.conflist: this one is for k8s and uses the bridge plugin; the content is:
{
"cniVersion": "1.0.0",
"name": "k8s-net",
"plugins": [
{
"type": "bridge",
"bridge": "cni1",
"isGateway": true,
"isDefaultGateway": true,
"ipMasq": false,
"mtu": 1360,
"hairpinMode": true,
"ipam": {
"ranges": [
[
{
"subnet": "10.129.32.0/24",
"rangeStart": "10.129.32.1",
"rangeEnd": "10.129.32.126"
}
]
],
"type": "host-local"
}
},
{
"type": "bandwidth"
},
{
"type": "firewall"
},
{
"type": "tuning"
}
]
}
- nerdctl-nerd.conflist: created by nerdctl and then modified; the content is:
{
"cniVersion": "1.0.0",
"name": "nerd",
"nerdctlID": "5cabaa953bd37c3e357e779bb82aa195eda3b2afa2bdd19594a7162c4f7497be",
"nerdctlLabels": {},
"plugins": [
{
"name": "cni0",
"type": "macvlan",
"master": "bond0",
"mtu": 1360,
"ipam": {
"ranges": [
[
{
"gateway": "10.129.17.1",
"rangeStart": "10.129.17.24",
"rangeEnd": "10.129.17.63",
"subnet": "10.129.17.0/24"
}
]
],
"routes": [
{ "dst": "0.0.0.0/0", "gw": "10.129.17.1" }
],
"type": "host-local"
}
}
]
}
K8s works well, but when I use nerdctl to create a container and start it:
nerdctl create --name=etcd-openebs --restart=always \
--network=nerd --ip=10.129.17.25 \
--cpus=4.0 --memory=8092 --memory-swap=0 \
--log-driver=json-file \
--log-opt=max-size=500m \
--log-opt=max-file=5 \
--log-opt=log-path=${LOGSDIR}/etcd.log \
-e ETCD_NAME=${ETCD_NAME} \
-v ${CONFDIR}:/etc/etcd \
-v ${DATADIR}:/data/etcd \
${CONTAINER_IMAGE} \
/usr/local/bin/etcd --config-file /etc/etcd/etcd.yml
nerdctl start etcd-openebs
it failed, and I got:
FATA[0000] 1 errors:
failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: time="2025-02-20T09:15:53+08:00" level=fatal msg="failed to call cni.Setup: plugin type=\"macvlan\" name=\"cni0\" failed (add): failed to allocate for range 0: requested IP address 10.129.17.25 is not available in range set 10.129.17.24-10.129.17.63"
Failed to write to log, write /var/lib/nerdctl/1935db59/containers/default/b5c3b84c7cc382d563954d684ebb766bc7a36b2bade55e91adb0a89d0533f77c/oci-hook.createRuntime.log: file already closed: unknown
requested IP address 10.129.17.25 is not available in range set 10.129.17.24-10.129.17.63
Doesn't seem relevant to the OP
etcd-openebs
etcd and openebs are not actually needed to reproduce the issue, are they?
etcd-openebs is the name of the container I created...
@AkihiroSuda OP issue is clearly that we lock a directory that Cilium is also trying to lock.
I believe nerdctl should implement locking for networking stuff in a separate, private directory.
Can you assign this to me?
Thanks, can we just create a lock file like .nerdctl.lock in the CNI dir, or will something be angry if there is a non-JSON file in the CNI directory?
Yep, it might be just a simple patch.
I need to look again into locking - especially the platform-specific stuff.
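For what it's worth, the lock-file variant could look something like the sketch below (hypothetical file name and helper, not the actual patch). As far as I know libcni only loads *.conf, *.conflist and *.json from the conf dir, so an extra .nerdctl.lock should be ignored, but that is worth verifying against Cilium and the CRI plugin too:
package main

// Hypothetical sketch (file name and helper are made up, this is not the
// actual patch): flock a dedicated ".nerdctl.lock" file in the CNI conf dir
// instead of the directory itself, so a third party that flocks the directory
// no longer blocks nerdctl, while concurrent nerdctl invocations still
// exclude each other.

import (
    "fmt"
    "os"
    "path/filepath"
    "syscall"
)

func withNetconfLock(netconfPath string, fn func() error) error {
    lockPath := filepath.Join(netconfPath, ".nerdctl.lock")
    f, err := os.OpenFile(lockPath, os.O_CREATE|os.O_RDONLY, 0o600)
    if err != nil {
        return err
    }
    defer f.Close()
    if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
        return err
    }
    defer syscall.Flock(int(f.Fd()), syscall.LOCK_UN)
    return fn()
}

func main() {
    err := withNetconfLock("/etc/cni/net.d", func() error {
        fmt.Println("reading/writing nerdctl-managed conflists...")
        return nil
    })
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
}
The platform-specific caveat above still applies: flock is Unix-only, so Windows would need a different primitive (e.g. LockFileEx).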
To answer the earlier question: any container can reproduce the issue; etcd and openebs are not required.
@nopeno
Your post is irrelevant to the OP. Please open a new issue.
Just stumbling over this issue (K8s+Cilium+nerdctl v2.0.3). Can I bypass locking or override the locking path as a workaround?
I can't think of any way to do that off the top of my head... Also note that at this point, the lock explanation is an (informed) hypothesis, not a firm root cause.
I'll look into it soon anyhow.
@fahedouch / @AkihiroSuda could we tentatively slate that for the next patch release / milestone to 2.x.x?
@wzxmt thanks a lot for getting through with all the info, that was really helpful.
I have a patch in the linked pr #4165
If you feel like it, would you be able to build from it and try in your context?
Cc @stephan2012 as well if you feel like trying.
Thanks folks.
Was this fixed/alleviated in #4165?
Updating nerdctl version to 2.0.5 did not solve the problem!
@wzxmt
With 2.0.5, does nerdctl network ls work with cilium?
yes
Cool. So, the patch did address the first issue (concurrent locking with Cilium), which is good. We now have a second problem here.
It does not feel like I can continue just reading tea leaves though. I need to reproduce your environment.
@wzxmt can you share how you are installing and configuring? eg: I have a kind cluster - how do I setup cilium the same way as you?
Thanks in advance.
Install k8s:
kubeadm init --kubernetes-version=1.32.3 --apiserver-advertise-address=10.0.0.51 --control-plane-endpoint=10.0.0.51:6443 --service-cidr=10.96.0.0/16 --pod-network-cidr=172.16.0.0/16 --upload-certs
Install Cilium:
helm install cilium cilium/cilium --version 1.17.1 \
  --namespace kube-cilium \
  --set operator.replicas=1 \
  --set k8sServiceHost=10.0.0.51 \
  --set k8sServicePort=6443 \
  --set ipv4NativeRoutingCIDR=172.16.0.0/16 \
  --set ipam.operator.clusterPoolIPv4PodCIDRList=172.16.0.0/16 \
  --set routing-mode=native \
  --set kubeProxyReplacement=true \
  --set bpf.masquerade=true \
  --set envoy.enabled=true \
  --set bandwidthManager.enabled=true
Thanks @wzxmt
I will set it up locally and figure this out. Unfortunately, this is not going to happen in time for the 2.1 release which is due today.