[BUG] minion crashes when package not managed by systemctl

Open dnessett opened this issue 6 months ago • 6 comments

Description

When the state file shown under Setup is applied with the command also shown there, the targeted minion crashes. This may be because the package name, systemd, is also passed as the service name, and there is no systemd.service unit for systemctl to manage. (Ignore the state ID "restart_minion": the code was adapted from a salt-minion restart state to work with other packages, and the ID was never changed.)

Setup

# update-package.sls
# Check the version of a package and patch to the latest version.
# Sample command: salt <minionid> state.sls update-package pillar='{"package": "systemd"}' test=1
# running the above with "test=1" allows you to see if an update is needed for the package before actually updating it.
{% set package = salt['pillar.get']('package') %}

upgrade_{{ package }}:
  pkg.latest:
    - name: {{ package }}

restart_minion:
  service.running:
    - name: {{ package }}
    - watch:
      - upgrade_{{ package }}

and the following command:

sudo salt --timeout=600 'MOLS-H-03' state.sls update-package pillar='{"package": "systemd"}'

The minion never returns and the command times out. The minion's status on its own machine suggests that the service.running state crashed it. When I run the command a second time, however, the minion does not crash.

Output from checking the status of the minion after running the command for the first time:

dnessett@MOLS-H-03:~$ sudo systemctl status salt-minion
[sudo] password for dnessett:             
× salt-minion.service - The Salt Minion
     Loaded: loaded (/lib/systemd/system/salt-minion.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2024-08-21 13:07:09 MDT; 16min ago
       Docs: man:salt-minion(1)
             file:///usr/share/doc/salt/html/contents.html
             https://docs.saltproject.io/en/latest/contents.html
    Process: 4795 ExecStart=/usr/bin/salt-minion (code=exited, status=1/FAILURE)
   Main PID: 4795 (code=exited, status=1/FAILURE)
        CPU: 943ms

Aug 21 13:07:08 MOLS-H-03 salt-minion[4054]: [ERROR   ] stderr: Running scope as unit: run-re2d3f7e5b7914a1dae1cae8845f>
Aug 21 13:07:08 MOLS-H-03 salt-minion[4054]: Failed to stop systemd.service: Unit systemd.service not loaded.
Aug 21 13:07:08 MOLS-H-03 salt-minion[4054]: [ERROR   ] retcode: 5
Aug 21 13:07:08 MOLS-H-03 salt-minion[4054]: [ERROR   ] Command '/usr/bin/systemd-run' failed with return code: 5
Aug 21 13:07:08 MOLS-H-03 salt-minion[4054]: [ERROR   ] stderr: Running scope as unit: run-r2ffa949b7afe4383b510e40ada9>
Aug 21 13:07:08 MOLS-H-03 salt-minion[4054]: Failed to start systemd.service: Unit systemd.service not found.
Aug 21 13:07:08 MOLS-H-03 salt-minion[4054]: [ERROR   ] retcode: 5
Aug 21 13:07:08 MOLS-H-03 salt-minion[4054]: [ERROR   ] Failed to start systemd.service: Unit systemd.service not found.
Aug 21 13:07:09 MOLS-H-03 systemd[1]: salt-minion.service: Main process exited, code=exited, status=1/FAILURE
Aug 21 13:07:09 MOLS-H-03 systemd[1]: salt-minion.service: Failed with result 'exit-code'.

Output on the master when running the command a second time, after restarting the minion using systemctl:

dnessett@homelserv:~$ sudo salt --timeout=120 'MOLS-H-03' state.sls update-package pillar='{"package": "systemd"}'
MOLS-H-03:
----------
          ID: upgrade_systemd
    Function: pkg.latest
        Name: systemd
      Result: True
     Comment: Package systemd is already up-to-date
     Started: 14:40:08.506796
    Duration: 3657.397 ms
     Changes:   
----------
          ID: restart_minion
    Function: service.running
        Name: systemd
      Result: False
     Comment: The named service systemd is not available
     Started: 14:40:12.166861
    Duration: 14.847 ms
     Changes:   

Summary for MOLS-H-03
------------
Succeeded: 1
Failed:    1
------------
Total states run:     2
Total run time:   3.672 s
ERROR: Minions returned with non-zero exit code

dnessett@homelserv:~$ sudo salt 'MOLS-H-03' test.ping
MOLS-H-03:
    True

This bug is hard to reproduce because the first run leaves the systemd package updated, and running the same command a second time does not crash the minion. Checking the minion's status after the second run shows it is up and active.
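
The second run reports "The named service systemd is not available", which suggests a guard on the restart state could keep the generic state file usable for packages that have no matching unit. The following is an untested sketch: the file name and the `restart_{{ package }}` ID are hypothetical, and it assumes `service.available` returns False for `systemd` on this minion. It does not explain why the minion exited on the first run.

```yaml
# update-package-guarded.sls (hypothetical file name; untested sketch)
{% set package = salt['pillar.get']('package') %}

upgrade_{{ package }}:
  pkg.latest:
    - name: {{ package }}

# Render the restart state only when the minion reports a service with this name.
# The check runs at render time, i.e. before the upgrade itself.
{% if salt['service.available'](package) %}
restart_{{ package }}:
  service.running:
    - name: {{ package }}
    - watch:
      - pkg: upgrade_{{ package }}
{% endif %}
```

Because the check is evaluated when the SLS is rendered, a service unit that only appears after the package is installed for the first time would be skipped on that run.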

  • [X] on-prem machine
  • [ ] VM (Virtualbox, KVM, etc. please specify)
  • [ ] VM running on a cloud service, please be explicit and add details
  • [ ] container (Kubernetes, Docker, containerd, etc. please specify)
  • [ ] or a combination, please be explicit
  • [ ] jails if it is FreeBSD
  • [ ] classic packaging
  • [X] onedir packaging
  • [ ] used bootstrap to install

Steps to Reproduce the behavior

The minion machine must be running a version of systemd older than the latest available one. One way to set this up (which I have not tried) is to take a Timeshift backup, run the update, then restore from the backup.

Expected behavior

I expected systemd to be updated and the state run to return success. There is an easy workaround: make the state file specific to systemd and remove the restart_minion section. I haven't tried this workaround, but I expect it would work.
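
A minimal sketch of that workaround, equally untested: the file name below is hypothetical, and the state is simply the pkg.latest half of the original with the package name hard-coded.

```yaml
# update-systemd.sls (hypothetical file name; untested sketch)
# Upgrade systemd only. There is no service.running state here,
# because the logs above show no systemd.service unit to restart.
upgrade_systemd:
  pkg.latest:
    - name: systemd
```

It would be applied the same way as the original, minus the pillar argument, for example with `state.sls update-systemd`.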

Screenshots

No screenshots are relevant.

Versions Report

Master:

dnessett@homelserv:~$ sudo salt --versions-report
[sudo] password for dnessett:        
Salt Version:
          Salt: 3006.9
 
Python Version:
        Python: 3.10.14 (main, Jun 26 2024, 11:44:37) [GCC 11.2.0]
 
Dependency Versions:
          cffi: 1.14.6
      cherrypy: unknown
  cryptography: 42.0.5
      dateutil: 2.8.1
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.4
       libgit2: Not Installed
  looseversion: 1.0.2
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 22.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.19.1
        pygit2: Not Installed
  python-gnupg: 0.4.8
        PyYAML: 6.0.1
         PyZMQ: 23.2.0
        relenv: 0.17.0
         smmap: Not Installed
       timelib: 0.2.4
       Tornado: 4.5.3
           ZMQ: 4.3.4
 
System Versions:
          dist: linuxmint 21.3 virginia
        locale: utf-8
       machine: x86_64
       release: 6.5.0-44-generic
        system: Linux
       version: Linux Mint 21.3 virginia

Minion:

dnessett@homelserv:~$ sudo salt-run 'MOLS-H-03' --versions-report
Salt Version:
          Salt: 3006.9
 
Python Version:
        Python: 3.10.14 (main, Jun 26 2024, 11:44:37) [GCC 11.2.0]
 
Dependency Versions:
          cffi: 1.14.6
      cherrypy: 18.6.1
  cryptography: 42.0.5
      dateutil: 2.8.1
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.4
       libgit2: Not Installed
  looseversion: 1.0.2
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 22.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.19.1
        pygit2: Not Installed
  python-gnupg: 0.4.8
        PyYAML: 6.0.1
         PyZMQ: 23.2.0
        relenv: 0.17.0
         smmap: Not Installed
       timelib: 0.2.4
       Tornado: 4.5.3
           ZMQ: 4.3.4
 
System Versions:
          dist: linuxmint 21.3 virginia
        locale: utf-8
       machine: x86_64
       release: 6.5.0-44-generic
        system: Linux
       version: Linux Mint 21.3 virginia
N/A

dnessett, Aug 21 '24 21:08