Unable to Connect to LXD REST API - Port :8443 Not Listening
Required information
- Distribution: Ubuntu
- Distribution version: 20.04.1
- The output of "snap list --all lxd core20 core22 core24 snapd":
Name Version Rev Tracking Publisher Notes
core20 20231123 2105 latest/stable canonical✓ base,disabled
core20 20240111 2182 latest/stable canonical✓ base
core22 20231123 1033 latest/stable canonical✓ base,disabled
core22 20240111 1122 latest/stable canonical✓ base
lxd 5.20-a8d6c52 26955 latest/stable canonical✓ disabled
lxd 5.20-f3dd836 27049 latest/stable canonical✓ -
snapd 2.60.4 20290 latest/stable canonical✓ snapd,disabled
snapd 2.61.1 20671 latest/stable canonical✓ snapd
- The output of "lxc info" or if that fails:
- Kernel version: 5.4.0-56-generic
- LXC version: 5.20
- LXD version: 5.20
- Storage backend in use: btrfs
Issue description
We use the LXD REST API on port 8443. Occasionally the LXD daemon stops listening on port 8443 even though the daemon process itself keeps running.
Expected Behavior
$ sudo ss -antp | grep LISTEN | grep 8443
LISTEN 0 16384 *:8443 *:* users:(("lxd",pid=340433,fd=22))
$ ps auxfww | grep [3]40433
root 340433 10.3 0.1 8434504 245884 ? Sl Feb29 174:35 \_ lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
Current Behavior
$ sudo ss -antp | grep LISTEN | grep 8443
$
-> Port 8443 is not listening
$ ps auxfww | grep "[l]xd --logfile"
root 978614 9.9 0.1 8271916 283820 ? Sl Feb10 2952:47 \_ lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
-> However, the LXD daemon process is still running
Context
We automatically create LXD containers for GitHub Actions runners. Occasionally, our daemon (implemented in Go) is unable to connect to LXD. When that happens, we find LXD in the state shown above: the process is running, but nothing is listening on port 8443. Restarting snap.lxd.daemon seems to resolve the issue. The journal from one such restart is as follows:
Mar 01 16:49:46 myshoes-lxd-010 systemd[1]: Stopping Service for snap application lxd.daemon...
Mar 01 16:49:46 myshoes-lxd-010 lxd.daemon[2101994]: => Stop reason is: host shutdown
Mar 01 16:49:46 myshoes-lxd-010 lxd.daemon[2101994]: => Stopping LXD (with container shutdown)
Mar 01 16:50:16 myshoes-lxd-010 lxd.daemon[978614]: time="2024-03-01T16:50:16+09:00" level=warning msg="Failed shutting down instance, forcefully stopping" err="Failed shutting>
Mar 01 16:50:25 myshoes-lxd-010 lxd.daemon[978614]: time="2024-03-01T16:50:25+09:00" level=error msg="Failed to cleanly shutdown daemon" err="Shutdown endpoints: close tcp [::]>
Mar 01 16:50:25 myshoes-lxd-010 lxd.daemon[978614]: Error: Shutdown endpoints: close tcp [::]:8443: use of closed network connection
Mar 01 16:50:25 myshoes-lxd-010 lxd.daemon[978441]: => LXD failed with return code 1
Mar 01 16:50:25 myshoes-lxd-010 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Mar 01 16:50:25 myshoes-lxd-010 lxd.daemon[2101994]: ==> Stopped LXD
Mar 01 16:50:25 myshoes-lxd-010 lxd.daemon[2101994]: => Stopping LXCFS
Mar 01 16:50:25 myshoes-lxd-010 lxd.daemon[1230]: Running destructor lxcfs_exit
Mar 01 16:50:26 myshoes-lxd-010 lxd.daemon[2101994]: ==> Stopped LXCFS
Mar 01 16:50:26 myshoes-lxd-010 lxd.daemon[2101994]: => Cleaning up PID files
Mar 01 16:50:26 myshoes-lxd-010 lxd.daemon[2101994]: => Cleaning up namespaces
Mar 01 16:50:26 myshoes-lxd-010 lxd.daemon[2101994]: => All done
Mar 01 16:50:26 myshoes-lxd-010 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Any advice or guidance on how to resolve this issue would be greatly appreciated.
Steps to reproduce
We have not yet found a way to reproduce it.
Information to attach
- [ ] Any relevant kernel output (dmesg)
- [ ] Container log (lxc info NAME --show-log)
- [ ] Container configuration (lxc config show NAME --expanded)
- [ ] Main daemon log (at /var/log/lxd/lxd.log or /var/snap/lxd/common/lxd/logs/lxd.log)
- [ ] Output of the client with --debug
- [ ] Output of the daemon with --debug (alternatively output of lxc monitor while reproducing the issue)
We normally encourage support requests to be posted over at https://discourse.ubuntu.com/c/lxd/support/149
In this case it does not appear you have been able to identify reproducer steps to make the issue occur. Is this correct?
Closing due to lack of response and because this would be better posted on the forum.