Existing /rmm/daphne.sock causes daphne.service to fail to start

Open NiceGuyIT opened this issue 1 year ago • 2 comments

Server Info (please complete the following information):

  • OS: Ubuntu 20.04.3 LTS
  • Browser: Chromium Version 103.0.5060.53
  • RMM Version (as shown in top left of web UI): v0.13.4

Installation Method:

  • [x] Standard
  • [ ] Docker

Agent Info (please complete the following information):

  • Agent version (as shown in the 'Summary' tab of the agent from web UI): N/A
  • Agent OS: N/A

Describe the bug See below.

To Reproduce Steps to reproduce the behavior:

  1. Shut down the server hard (for example, by cutting power) so that the socket is left behind.

Note: Killing the process with SIGSEGV also leaves the socket behind, but in that case daphne is able to clean up the leftovers and start the service.

Expected behavior I expect Daphne to start after a power failure.

Screenshots See below.

Additional context My dev server experienced a power outage which caused /rmm/daphne.sock to remain.

$ ls -l /rmm/daphne.sock*
srw-rw-rw- 1 tactical www-data 0 Jul  5 07:59 /rmm/daphne.sock
lrwxrwxrwx 1 tactical www-data 3 Jul  5 07:59 /rmm/daphne.sock.lock -> 108

The process no longer exists.

$ ps -ef | grep 108
root      9199  6560  0 09:20 pts/2    00:00:00 grep --color=auto 108

This in turn caused daphne.service to fail upon startup. Notice the restart counter at 15k.

Jul 17 09:12:22 ns-v18-tactical systemd[1]: Started django channels daemon.
Jul 17 09:12:24 ns-v18-tactical daphne[8619]: 2022-07-17 13:12:24,349 INFO     Starting server at unix:/rmm/daphne.sock
Jul 17 09:12:24 ns-v18-tactical daphne[8619]: 2022-07-17 13:12:24,350 INFO     HTTP/2 support not enabled (install the http2 and tls Twisted extras)
Jul 17 09:12:24 ns-v18-tactical daphne[8619]: 2022-07-17 13:12:24,350 INFO     Configuring endpoint unix:/rmm/daphne.sock
Jul 17 09:12:24 ns-v18-tactical daphne[8619]: 2022-07-17 13:12:24,350 CRITICAL Listen failure: [Errno 1] Operation not permitted
Jul 17 09:12:24 ns-v18-tactical systemd[1]: daphne.service: Succeeded.
Jul 17 09:12:27 ns-v18-tactical systemd[1]: daphne.service: Scheduled restart job, restart counter is at 15263.
Jul 17 09:12:27 ns-v18-tactical systemd[1]: Stopped django channels daemon.

Here's the corresponding error in /rmm/api/tacticalrmm/tacticalrmm/private/log/*.log. Note: This is a dev box not on the internet; the access_token doesn't matter.

==> error.log <==
2022/07/17 09:13:33 [error] 253#253: *24234 connect() to unix:/rmm/daphne.sock failed (111: Connection refused) while connecting to upstream, client: 172.30.0.119, server: api.a8n.tools, request: "GET /ws/dashinfo/?access_token=26cb1c9fb894f87352377b35d2ba90ca7fb6cd625725046f9e5b55579f8096b1 HTTP/1.1", upstream: "http://unix:/rmm/daphne.sock:/ws/dashinfo/?access_token=26cb1c9fb894f87352377b35d2ba90ca7fb6cd625725046f9e5b55579f8096b1", host: "api.a8n.tools"

==> access.log <==
172.30.0.119 - - [17/Jul/2022:09:13:33 -0400] "GET /ws/dashinfo/?access_token=26cb1c9fb894f87352377b35d2ba90ca7fb6cd625725046f9e5b55579f8096b1 HTTP/1.1" 502 552 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.53 Safari/537.36"

Adding ExecStartPre to remove the socket and lock file makes the service start correctly.

[Unit]
Description=django channels daemon
After=network.target

[Service]
User=tactical
Group=www-data
WorkingDirectory=/rmm/api/tacticalrmm
Environment="PATH=/rmm/api/env/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
ExecStart=/rmm/api/env/bin/daphne -u /rmm/daphne.sock tacticalrmm.asgi:application
ExecStartPre=rm -f /rmm/daphne.sock
ExecStartPre=rm -f /rmm/daphne.sock.lock
Restart=always
RestartSec=3s

[Install]
WantedBy=multi-user.target
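The same two cleanup steps could also be applied as a systemd drop-in instead of editing the unit file directly. This is only a sketch: the function name `write_daphne_dropin` and the file name `override.conf` are assumptions, and the `/usr/bin/rm` path is taken from the `systemctl status` output in this report. Use either the drop-in or the edited unit file, not both, to avoid running the cleanup twice.

```shell
# Hypothetical helper: writes a drop-in that adds the socket cleanup to daphne.service.
# The optional first argument overrides the drop-in directory (handy for a dry run).
write_daphne_dropin() {
    dropin_dir="${1:-/etc/systemd/system/daphne.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # ExecStartPre lines in a drop-in are added to those in the main unit file.
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
ExecStartPre=/usr/bin/rm -f /rmm/daphne.sock
ExecStartPre=/usr/bin/rm -f /rmm/daphne.sock.lock
EOF
}
```

To apply it, run `write_daphne_dropin` as root, then `systemctl daemon-reload` and `systemctl restart daphne.service`.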

systemctl status daphne.service

$ systemctl status daphne.service
● daphne.service - django channels daemon
     Loaded: loaded (/etc/systemd/system/daphne.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2022-07-17 09:33:44 EDT; 2min 19s ago
    Process: 10206 ExecStartPre=/usr/bin/rm -f /rmm/daphne.sock (code=exited, status=0/SUCCESS)
    Process: 10207 ExecStartPre=/usr/bin/rm -f /rmm/daphne.sock.lock (code=exited, status=0/SUCCESS)
   Main PID: 10208 (daphne)
     CGroup: /system.slice/daphne.service
             └─10208 /rmm/api/env/bin/python3.10 /rmm/api/env/bin/daphne -u /rmm/daphne.sock tacticalrmm.asgi:applicati…

Jul 17 09:33:44 ns-v18-tactical systemd[1]: Starting django channels daemon...
Jul 17 09:33:44 ns-v18-tactical systemd[1]: Started django channels daemon.
Jul 17 09:33:46 ns-v18-tactical daphne[10208]: 2022-07-17 13:33:46,514 INFO     Starting server at unix:/rmm/daphne.sock
Jul 17 09:33:46 ns-v18-tactical daphne[10208]: 2022-07-17 13:33:46,514 INFO     HTTP/2 support not enabled (inst…extras)
Jul 17 09:33:46 ns-v18-tactical daphne[10208]: 2022-07-17 13:33:46,515 INFO     Configuring endpoint unix:/rmm/d…ne.sock
Hint: Some lines were ellipsized, use -l to show in full.

NiceGuyIT, Jul 17 '22 13:07

I'll get that added into the service definition in my script.

ninjamonkey198206, Jul 18 '22 23:07

Done. If they approve my PR or just scavenge it for parts, including default files, it'll be in there.

ninjamonkey198206, Jul 20 '22 18:07

Thank you, added ExecStartPre.

wh1te909, Oct 07 '22 17:10