minion.restart module doesn't work when minion run via systemd

Open dmarkwat opened this issue 6 years ago • 10 comments

Description of Issue/Question

The linked line is where the trip-up seems to happen. When the minion is started by systemd, the '-d' flag is not passed to the minion in sys.argv, so the else branch drops the restart on systemd-based machines. It does work on sysvinit-based setups (RHEL 6 in my case), where -d is passed to salt in sys.argv.
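The check described above can be sketched roughly as follows. This is a simplified illustration of the reported behaviour, not the actual Salt source: the function name `restart_plan` and the shape of the returned dictionary are assumptions made for the example, and only the comment string is taken from the output in this report.

```python
# Simplified illustration only; not the real salt.modules.minion code.
NOT_DAEMON_MSG = "Not running in daemon mode - will not restart process after killing"


def restart_plan(argv):
    """Return what would happen to the restart step for a given sys.argv."""
    ret = {"comment": [], "restart": {}}
    if "-d" in argv:
        # Daemon mode (e.g. sysvinit): the original command line can be re-run.
        ret["restart"] = {"command": list(argv)}
    else:
        # systemd starts the minion without -d, so the restart is dropped.
        ret["comment"].append(NOT_DAEMON_MSG)
    return ret


if __name__ == "__main__":
    print(restart_plan(["/usr/bin/salt-minion", "-d"]))  # sysvinit-style start
    print(restart_plan(["/usr/bin/salt-minion"]))        # systemd-style start
```

In this sketch the second call leaves `restart` empty, which matches the empty `restart:` section in the output shown under Steps to Reproduce Issue.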

Setup

Bare CentOS/RHEL 7 setup with salt-minion installed via yum and started via systemctl should be sufficient.

Steps to Reproduce Issue

On master:

salt 'rhel7-minion' minion.restart

Output looks something like:

rhel7-minion:
----------
    comment:
        - Not running in daemon mode - will not restart process after killing
    killed:
        7347
    restart:
        ----------
    retcode:
        0

The comment line is traced back to this line here

Versions Report

Master

Salt Version:
           Salt: 2017.7.4
 
Dependency Versions:
           cffi: 1.6.0
       cherrypy: Not Installed
       dateutil: 1.5
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: 0.21.0
        libnacl: Not Installed
       M2Crypto: 0.21.1
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: Not Installed
      pycparser: 2.14
       pycrypto: 2.6.1
   pycryptodome: 3.4.3
         pygit2: 0.21.4
         Python: 2.7.5 (default, May  3 2017, 07:55:04)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4
 
System Versions:
           dist: redhat 7.4 Maipo
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-693.11.6.el7.x86_64
         system: Linux
        version: Red Hat Enterprise Linux Server 7.4 Maipo

Minion

Salt Version:
           Salt: 2017.7.4
 
Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 1.5
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: 0.21.1
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.7
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: 3.4.3
         pygit2: Not Installed
         Python: 2.7.5 (default, May  3 2017, 07:55:04)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4
 
System Versions:
           dist: redhat 7.4 Maipo
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-693.11.6.el7.x86_64
         system: Linux
        version: Red Hat Enterprise Linux Server 7.4 Maipo

dmarkwat • Feb 28 '18 17:02

Starting with 2017.7.3, with KillMode=process set in the systemd service unit, you can use the service.restart function.

Systemd is much stricter about this: it tracks the cgroup, and it does not recommend running daemons with Type=forking, since systemd handles all the forking.

gtmanfred • Feb 28 '18 23:02

I'm seeing a similar issue with restart, but also with stop. I see this:

systemd[1]: Stopping My Mojolicious application workers

But the worker doesn't seem to stop unless I kill it. I do have KillMode=process in my file.

srchulo • Feb 03 '19 19:02

It should warn that the process will not restart, without killing the process.

Will try the service.restart method. salt.modules.service is under-documented.

https://docs.saltstack.com/en/latest/ref/modules/all/salt.modules.service.html

# salt 's*' minion.restart
spartanapp:
    ----------
    comment:
        - Not running in daemon mode - will not restart process after killing
    killed:
        22641
    restart:
        ----------
    retcode:
        0

wolfpackmars2 • Apr 29 '19 08:04

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

stale[bot] • Jan 08 '20 18:01

@Ch3LL , could this be reopened?

I just ran minion.restart on Ubuntu 18.04 with Salt 3000, and all my minions shut down with "Not running in daemon mode - will not restart process after killing".

There may be an available workaround/alternative, but I'd submit that this is still a bug, and harmful behavior that should be fixed (or at least prevented).

cruscio • Apr 28 '20 14:04

Salt knows it cannot restart the minion process. In this case it should not stop the minion and should instead return a failure, rather than kill the process AND return a failure.
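A minimal sketch of that suggestion, assuming the decision is made before anything is killed. Everything here is hypothetical: `guarded_restart`, its parameters, the injected `kill` callable, the return keys, and the refusal message are invented for illustration and do not reflect Salt's implementation.

```python
def guarded_restart(argv, restart_cmd=None, kill=lambda: None):
    """Hypothetical guard: only kill the minion when a restart path exists."""
    # A restart is possible with an explicit command or in daemon mode (-d).
    can_restart = bool(restart_cmd) or "-d" in argv
    if not can_restart:
        # Leave the minion running and report the failure instead.
        return {
            "killed": None,
            "retcode": 1,
            "comment": ["Not running in daemon mode - refusing to kill process"],
        }
    # A restart path exists, so stopping the process is safe.
    return {
        "killed": kill(),
        "retcode": 0,
        "restart": restart_cmd or list(argv),
    }
```

Passing `kill` in as a callable keeps the sketch safe to run and makes it easy to confirm that nothing is killed on the failure path.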

ITJamie • Jan 24 '22 16:01

Works for me with the following conditions:

  • salt-master 3004.2
  • salt-minion 3004.2
  • salt-minion runs under systemd
  • minion_restart_command: systemctl restart salt-minion line present in /etc/salt/minion

gvfnix • Jun 28 '22 08:06

The issue is still present in 3006.6 with a basic bootstrap install (onedir with systemd).

master

Salt Version:
          Salt: 3006.6

Python Version:
        Python: 3.10.13 (main, Nov 15 2023, 04:34:27) [GCC 11.2.0]

Dependency Versions:
          cffi: 1.14.6
      cherrypy: unknown
      dateutil: 2.8.1
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.3
       libgit2: Not Installed
  looseversion: 1.0.2
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 22.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.19.1
        pygit2: Not Installed
  python-gnupg: 0.4.8
        PyYAML: 6.0.1
         PyZMQ: 23.2.0
        relenv: 0.14.2
         smmap: Not Installed
       timelib: 0.2.4
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: ubuntu 20.04.5 focal
        locale: utf-8
       machine: x86_64
       release: 5.4.0-132-generic
        system: Linux
       version: Ubuntu 20.04.5 focal

minion

Salt Version:
          Salt: 3006.6

Python Version:
        Python: 3.10.13 (main, Nov 15 2023, 04:34:27) [GCC 11.2.0]

Dependency Versions:
          cffi: 1.14.6
      cherrypy: 18.6.1
      dateutil: 2.8.1
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.3
       libgit2: Not Installed
  looseversion: 1.0.2
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 22.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.19.1
        pygit2: Not Installed
  python-gnupg: 0.4.8
        PyYAML: 6.0.1
         PyZMQ: 23.2.0
        relenv: 0.14.2
         smmap: Not Installed
       timelib: 0.2.4
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: ubuntu 22.04.1 jammy
        locale: utf-8
       machine: x86_64
       release: 5.15.0-53-generic
        system: Linux
       version: Ubuntu 22.04.1 jammy

CrackerJackMack • Feb 04 '24 22:02

Adding minion_restart_command kinda worked, but it was not entirely smooth.

minion setup prior to test

root@node-0:~# cat /etc/salt/minion.d/restart.conf
minion_restart_command: systemctl restart salt-minion
root@node-0:~# date -u
Sun Feb  4 10:13:48 PM UTC 2024
root@node-0:~# systemctl restart salt-minion
root@node-0:~# systemctl status salt-minion
● salt-minion.service - The Salt Minion
     Loaded: loaded (/lib/systemd/system/salt-minion.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2024-02-04 22:13:53 UTC; 5s ago
       Docs: man:salt-minion(1)
             file:///usr/share/doc/salt/html/contents.html
             https://docs.saltproject.io/en/latest/contents.html
   Main PID: 15065 (python3.10)
      Tasks: 7 (limit: 2160)
     Memory: 57.2M
        CPU: 417ms
     CGroup: /system.slice/salt-minion.service
             ├─15065 /opt/saltstack/salt/bin/python3.10 /usr/bin/salt-minion
             └─15073 "/opt/saltstack/salt/bin/python3.10 /usr/bin/salt-minion MultiMinionProcessManager MinionProcessManager"

Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Setting up the Salt Minion "node-0"
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Starting up the Salt Minion
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Starting pull socket on /var/run/salt/minion/minion_event_7c6cc41e6b_pull.ipc
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Creating minion process manager
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Executing command date in directory '/root'
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Updating job settings for scheduled job: __mine_interval
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Added mine.update to scheduler
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Minion is starting as user 'root'
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Minion is ready to receive requests!
Feb 04 22:13:54 node-0 salt-minion[15073]: [INFO    ] Running scheduled job: __mine_interval with jid 20240204221354956795

master

root@salt-master:~# date -u
Sun 04 Feb 2024 10:14:31 PM UTC
root@salt-master:~# salt node-0 test.ping
node-0:
    True
root@salt-master:~# salt node-0 minion.restart
node-0:
    Minion did not return. [No response]
    The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:

    salt-run jobs.lookup_jid 20240204221438860367
ERROR: Minions returned with non-zero exit code
root@salt-master:~# date -u
Sun 04 Feb 2024 10:14:55 PM UTC
root@salt-master:~# salt node-0 test.ping
node-0:
    True
root@salt-master:~# date -u
Sun 04 Feb 2024 10:15:00 PM UTC

minion systemd status

root@node-0:~# date -u
Sun Feb  4 10:15:05 PM UTC 2024
root@node-0:~# systemctl status salt-minion
● salt-minion.service - The Salt Minion
     Loaded: loaded (/lib/systemd/system/salt-minion.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2024-02-04 22:14:41 UTC; 25s ago
       Docs: man:salt-minion(1)
             file:///usr/share/doc/salt/html/contents.html
             https://docs.saltproject.io/en/latest/contents.html
   Main PID: 15150 (python3.10)
      Tasks: 7 (limit: 2160)
     Memory: 57.3M
        CPU: 502ms
     CGroup: /system.slice/salt-minion.service
             ├─15150 /opt/saltstack/salt/bin/python3.10 /usr/bin/salt-minion
             └─15159 "/opt/saltstack/salt/bin/python3.10 /usr/bin/salt-minion MultiMinionProcessManager MinionProcessManager"

Feb 04 22:14:42 node-0 salt-minion[15159]: [INFO    ] Added mine.update to scheduler
Feb 04 22:14:42 node-0 salt-minion[15159]: [INFO    ] Minion is starting as user 'root'
Feb 04 22:14:42 node-0 salt-minion[15159]: [INFO    ] Minion is ready to receive requests!
Feb 04 22:14:43 node-0 salt-minion[15159]: [INFO    ] Running scheduled job: __mine_interval with jid 20240204221443518611
Feb 04 22:14:43 node-0 salt-minion[15159]: [INFO    ] User sudo_vagrant Executing command saltutil.find_job with jid 20240204221443970560
Feb 04 22:14:44 node-0 salt-minion[15229]: [INFO    ] Starting a new job 20240204221443970560 with PID 15229
Feb 04 22:14:44 node-0 salt-minion[15229]: [INFO    ] Returning information for job: 20240204221443970560
Feb 04 22:14:58 node-0 salt-minion[15159]: [INFO    ] User sudo_vagrant Executing command test.ping with jid 20240204221458868531
Feb 04 22:14:58 node-0 salt-minion[15232]: [INFO    ] Starting a new job 20240204221458868531 with PID 15232
Feb 04 22:14:58 node-0 salt-minion[15232]: [INFO    ] Returning information for job: 20240204221458868531

CrackerJackMack • Feb 04 '24 22:02

Update on my findings. This is the safest and most reliable way I've found to restart a Linux minion. Using systemd-run --scope creates a new parent cgroup outside of init. This is similar to what is already implemented in systemd_service, but for some reason service.restart salt-minion never results in the minion being restarted for me. minion.restart works with this:

minion_restart_command: systemd-run --scope systemctl restart salt-minion

CrackerJackMack • Feb 06 '24 16:02