salt icon indicating copy to clipboard operation
salt copied to clipboard

[BUG] [3007.1] startup_states: highstate stop working

Open momolemo opened this issue 1 year ago • 6 comments

Description After update from 3007 to 3007.1 the startup_state: highstate option stop working. We find in the service salt-minion log the following error message.

mai 24 12:29:18 server salt-minion[503062]: /opt/saltstack/salt/lib/python3.10/site-packages/salt/minion.py:2434: RuntimeWarning: coroutine 'Minion._handle_decoded_payload' was never awaited
mai 24 12:29:18 server salt-minion[503062]:   self._handle_decoded_payload(data)
mai 24 12:29:18 server salt-minion[503062]: RuntimeWarning: Enable tracemalloc to get the object allocation traceback

Setup (Please provide relevant configs and/or SLS files (be sure to remove sensitive info. There is no general set-up of Salt.)

Please be as specific as possible and give set-up details.

  • [ ] on-prem machine
  • [x] VM (Virtualbox, KVM, etc. please specify) ==> LXC
  • [ ] VM running on a cloud service, please be explicit and add details
  • [ ] container (Kubernetes, Docker, containerd, etc. please specify)
  • [ ] or a combination, please be explicit
  • [ ] jails if it is FreeBSD
  • [ ] classic packaging
  • [x] onedir packaging
  • [x] used bootstrap to install

Steps to Reproduce the behavior in 3007.1 set the config minion to statup_states: highstate and restart minions

Expected behavior A salt highstate apply automatcaly

Screenshots N/A

Versions Report

salt --versions-report

Same version on master and minion (test cluster)

Salt Version:
          Salt: 3007.1
 
Python Version:
        Python: 3.10.14 (main, Apr  3 2024, 21:30:09) [GCC 11.2.0]
 
Dependency Versions:
          cffi: 1.16.0
      cherrypy: unknown
      dateutil: 2.8.2
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.4
       libgit2: Not Installed
  looseversion: 1.3.0
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.7
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 23.1
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.19.1
        pygit2: Not Installed
  python-gnupg: 0.5.2
        PyYAML: 6.0.1
         PyZMQ: 25.1.2
        relenv: 0.16.0
         smmap: Not Installed
       timelib: 0.3.0
       Tornado: 6.3.3
           ZMQ: 4.3.4
 
Salt Extensions:
 saltext.vault: 1.0.0
 
Salt Package Information:
  Package Type: onedir
 
System Versions:
          dist: debian 11.9 bullseye
        locale: utf-8
       machine: x86_64
       release: 5.10.0-27-cloud-amd64
        system: Linux
       version: Debian GNU/Linux 11.9 bullseye

Additional context Add any other context about the problem here.

We search which PR can be responsible of this behavior and we find this one that seems to us possible cause but we cant be sure because of our dev level too low

https://github.com/saltstack/salt/commit/0ff43842cf0141c946c58ac9b3af18ca744c0ff9

momolemo avatar May 24 '24 14:05 momolemo

I can confirm this happening on all our minions (Debian and Windows) after having upgraded to 3007.1.

The log contains the same message as reported above, enabling tracemalloc reveals the corresponding code line:

root@minion ~ 
# export PYTHONTRACEMALLOC=1
root@minion ~ 
# salt-minion -l debug
[ ... some more debug messages ...]
[INFO    ] Minion is ready to receive requests!
/opt/saltstack/salt/lib/python3.10/site-packages/salt/minion.py:2434: RuntimeWarning: coroutine 'Minion._handle_decoded_payload' was never awaited
  self._handle_decoded_payload(data)
Object allocated at (most recent call last):
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/minion.py", lineno 2434
    self._handle_decoded_payload(data)

The corresponding code block https://github.com/saltstack/salt/blob/0ff43842cf0141c946c58ac9b3af18ca744c0ff9/salt/minion.py#L2434 parses the startup_states option and fails to do so because of the missing await. Adding await requires making the calling functions async. By manually fixing /opt/saltstack/salt/lib/python3.10/site-packages/salt/minion.py with the changes below highstating works as expected.

Changes:

diff --git a/minion_old.py b/minion.py
index 68733a5..a4ebbe3 100644
--- a/minion_old.py
+++ b/minion.py
@@ -1144,7 +1144,7 @@ class MinionManager(MinionBase):
                     minion.setup_scheduler(before_connect=True)
                 if minion.opts.get("master_type", "str") != "disable":
                     await minion.connect_master(failed=failed)
-                minion.tune_in(start=False)
+                await minion.tune_in(start=False)
                 self.minions.append(minion)
                 break
             except SaltClientError as exc:
@@ -2406,7 +2406,7 @@ class Minion(MinionBase):
         log.trace("ret_val = %s", ret_val)  # pylint: disable=no-member
         return ret_val
 
-    def _state_run(self):
+    async def _state_run(self):
         """
         Execute a state run based on information set in the minion config file
         """
@@ -2431,7 +2431,7 @@ class Minion(MinionBase):
                 else:
                     data["fun"] = "state.highstate"
                     data["arg"] = []
-                self._handle_decoded_payload(data)
+                await self._handle_decoded_payload(data)
 
     def _refresh_grains_watcher(self, refresh_interval_in_minutes):
         """
@@ -3110,7 +3110,7 @@ class Minion(MinionBase):
         return True
 
     # Main Minion Tune In
-    def tune_in(self, start=True):
+    async def tune_in(self, start=True):
         """
         Lock onto the publisher. This is the main event loop for the minion
         :rtype : None
@@ -3137,7 +3137,7 @@ class Minion(MinionBase):
             salt.utils.win_functions.enable_ctrl_logoff_handler()
 
         # On first startup execute a state run if configured to do so
-        self._state_run()
+        await self._state_run()
 
         self.setup_beacons()
         self.setup_scheduler()

transistortim avatar May 27 '24 09:05 transistortim

Hi there,

we also have the same issue with the version 3007.1 on all of our minions. startup_states: highstate is not working anymore.

iMikeG6 avatar May 30 '24 08:05 iMikeG6

If anyone is looking for a workaround: We enabled salt's reactor with the example from the docs adapted slightly.

root@master ~
# cat /etc/salt/master.d/reactor.conf 
reactor:
  - 'salt/minion/*/start':          # Match tag "salt/minion/*/start"
    - /srv/reactor/start.sls        # Things to do when a minion starts
root@master ~
# cat /srv/reactor/start.sls 
highstate_run:
  local.state.apply:
    - tgt: {{ data['id'] }}

transistortim avatar Jun 03 '24 08:06 transistortim

Thanks @transistortim,

this also worked for us.

Nice workaround.

If anyone is looking for a workaround: We enabled salt's reactor with the example from the docs adapted slightly.

root@master ~
# cat /etc/salt/master.d/reactor.conf 
reactor:
  - 'salt/minion/*/start':          # Match tag "salt/minion/*/start"
    - /srv/reactor/start.sls        # Things to do when a minion starts
root@master ~
# cat /srv/reactor/start.sls 
highstate_run:
  local.state.apply:
    - tgt: {{ data['id'] }}

iMikeG6 avatar Jun 04 '24 09:06 iMikeG6

Can confirm that this also happens when using startup_states: 'sls'. I'm using 3007.1

jfduque avatar Jun 20 '24 04:06 jfduque

Confirmed here too. Which, in our case, means that new machines will not be configured properly and that messes up our deployments.

bgdnlp avatar Jun 25 '24 08:06 bgdnlp