nerdctl
`networks` related code is racy
Description
For some reason, network tests are very racy on my rig.
Taking just `network_remove_linux_test` as an example, I hit several different failures quite fast:
- [ ] `task xyz not found: not found`
- [ ] `reading /etc/cni/net.d/nerdctl-nerdctl-testnetworkremovebyid.conflist: open /etc/cni/net.d/nerdctl-nerdctl-testnetworkremovebyid.conflist: no such file or directory`
- [ ] `failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: time=\"2024-06-13T03:36:47Z\" level=fatal msg=\"no such network: \\\"nerdctl-testnetworkprune\\\"\"\nFailed to write to log, write /var/lib/nerdctl/1935db59/containers/nerdctl-testnetworkprune/9c8747e53f2fa1cba99145d19f48648a85d63ccf5ae7167038a3536336c2d6aa/oci-hook.startContainer.log: file already closed: unknown`
From a cursory reading, number 1 looks like it comes from somewhere in `netutils` making assumptions about the availability of objects.
Number 2 is probably also in `netutils`: it looks like a race between checking that a network exists and a later operation that depends on reading its config. It could be that certain operations do not take the file lock (or do not take it properly).
Number 3 is more worrisome.
Steps to reproduce the issue
go test
Describe the results you received and expected
The tests fail 1 out of 10 times, for a variety of different reasons.
What version of nerdctl are you using?
1.7.6
Are you using a variant of nerdctl? (e.g., Rancher Desktop)
None
Host information
No response