[BUG] [3007.1] startup_states: highstate stop working
Description After updating from 3007 to 3007.1, the startup_states: highstate option stopped working. We found the following error message in the salt-minion service log.
mai 24 12:29:18 server salt-minion[503062]: /opt/saltstack/salt/lib/python3.10/site-packages/salt/minion.py:2434: RuntimeWarning: coroutine 'Minion._handle_decoded_payload' was never awaited
mai 24 12:29:18 server salt-minion[503062]: self._handle_decoded_payload(data)
mai 24 12:29:18 server salt-minion[503062]: RuntimeWarning: Enable tracemalloc to get the object allocation traceback
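For anyone who wants to check their own minions: on a systemd-managed minion the warning shows up in the journal, so something like the following should find it (the unit name salt-minion is the systemd default; adjust if yours differs):
root@minion ~
# journalctl -u salt-minion | grep "never awaited"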
Setup (Please provide relevant configs and/or SLS files (be sure to remove sensitive info). There is no general set-up of Salt.)
Please be as specific as possible and give set-up details.
- [ ] on-prem machine
- [x] VM (Virtualbox, KVM, etc. please specify) ==> LXC
- [ ] VM running on a cloud service, please be explicit and add details
- [ ] container (Kubernetes, Docker, containerd, etc. please specify)
- [ ] or a combination, please be explicit
- [ ] jails if it is FreeBSD
- [ ] classic packaging
- [x] onedir packaging
- [x] used bootstrap to install
Steps to Reproduce the behavior: on 3007.1, set startup_states: highstate in the minion config (as sketched below) and restart the minions.
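A minimal sketch of the relevant minion config (the drop-in filename is only an example; any minion config file works):
# /etc/salt/minion.d/startup.conf
startup_states: highstate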
Expected behavior A Salt highstate is applied automatically on minion start.
Screenshots N/A
Versions Report
salt --versions-report
Same version on master and minion (test cluster)
Salt Version:
Salt: 3007.1
Python Version:
Python: 3.10.14 (main, Apr 3 2024, 21:30:09) [GCC 11.2.0]
Dependency Versions:
cffi: 1.16.0
cherrypy: unknown
dateutil: 2.8.2
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 3.1.4
libgit2: Not Installed
looseversion: 1.3.0
M2Crypto: Not Installed
Mako: Not Installed
msgpack: 1.0.7
msgpack-pure: Not Installed
mysql-python: Not Installed
packaging: 23.1
pycparser: 2.21
pycrypto: Not Installed
pycryptodome: 3.19.1
pygit2: Not Installed
python-gnupg: 0.5.2
PyYAML: 6.0.1
PyZMQ: 25.1.2
relenv: 0.16.0
smmap: Not Installed
timelib: 0.3.0
Tornado: 6.3.3
ZMQ: 4.3.4
Salt Extensions:
saltext.vault: 1.0.0
Salt Package Information:
Package Type: onedir
System Versions:
dist: debian 11.9 bullseye
locale: utf-8
machine: x86_64
release: 5.10.0-27-cloud-amd64
system: Linux
version: Debian GNU/Linux 11.9 bullseye
Additional context
We searched for the PR that could be responsible for this behavior and found the one below, which looks like a possible cause to us, but we can't be sure because our development experience is limited.
https://github.com/saltstack/salt/commit/0ff43842cf0141c946c58ac9b3af18ca744c0ff9
I can confirm this is happening on all our minions (Debian and Windows) after upgrading to 3007.1.
The log contains the same message as reported above; enabling tracemalloc reveals the corresponding code line:
root@minion ~
# export PYTHONTRACEMALLOC=1
root@minion ~
# salt-minion -l debug
[ ... some more debug messages ...]
[INFO ] Minion is ready to receive requests!
/opt/saltstack/salt/lib/python3.10/site-packages/salt/minion.py:2434: RuntimeWarning: coroutine 'Minion._handle_decoded_payload' was never awaited
self._handle_decoded_payload(data)
Object allocated at (most recent call last):
File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/minion.py", lineno 2434
self._handle_decoded_payload(data)
The corresponding code block https://github.com/saltstack/salt/blob/0ff43842cf0141c946c58ac9b3af18ca744c0ff9/salt/minion.py#L2434 handles the startup_states option but fails because of the missing await.
Adding the await requires making the calling functions async as well. After manually patching /opt/saltstack/salt/lib/python3.10/site-packages/salt/minion.py with the changes below, highstating on startup works as expected.
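As a plain-Python illustration of the failure mode (not Salt code, just asyncio): calling an async function without await only creates a coroutine object, its body never runs, and Python emits the same "never awaited" RuntimeWarning once the object is garbage collected:
import asyncio

async def handle_payload(data):
    print("running", data)  # never reached if the coroutine is not awaited

async def broken():
    handle_payload({"fun": "state.highstate"})  # coroutine created and dropped -> RuntimeWarning

async def fixed():
    await handle_payload({"fun": "state.highstate"})  # body actually executes

asyncio.run(broken())
asyncio.run(fixed())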
Changes:
diff --git a/minion_old.py b/minion.py
index 68733a5..a4ebbe3 100644
--- a/minion_old.py
+++ b/minion.py
@@ -1144,7 +1144,7 @@ class MinionManager(MinionBase):
minion.setup_scheduler(before_connect=True)
if minion.opts.get("master_type", "str") != "disable":
await minion.connect_master(failed=failed)
- minion.tune_in(start=False)
+ await minion.tune_in(start=False)
self.minions.append(minion)
break
except SaltClientError as exc:
@@ -2406,7 +2406,7 @@ class Minion(MinionBase):
log.trace("ret_val = %s", ret_val) # pylint: disable=no-member
return ret_val
- def _state_run(self):
+ async def _state_run(self):
"""
Execute a state run based on information set in the minion config file
"""
@@ -2431,7 +2431,7 @@ class Minion(MinionBase):
else:
data["fun"] = "state.highstate"
data["arg"] = []
- self._handle_decoded_payload(data)
+ await self._handle_decoded_payload(data)
def _refresh_grains_watcher(self, refresh_interval_in_minutes):
"""
@@ -3110,7 +3110,7 @@ class Minion(MinionBase):
return True
# Main Minion Tune In
- def tune_in(self, start=True):
+ async def tune_in(self, start=True):
"""
Lock onto the publisher. This is the main event loop for the minion
:rtype : None
@@ -3137,7 +3137,7 @@ class Minion(MinionBase):
salt.utils.win_functions.enable_ctrl_logoff_handler()
# On first startup execute a state run if configured to do so
- self._state_run()
+ await self._state_run()
self.setup_beacons()
self.setup_scheduler()
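To verify the patch (assuming a systemd-managed minion; checking the job cache on the master is just one way to confirm), restart the minion and look for the highstate job:
root@minion ~
# systemctl restart salt-minion
root@master ~
# salt-run jobs.list_jobs search_function='state.highstate'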
Hi there,
we also have the same issue with version 3007.1 on all of our minions. startup_states: highstate is not working anymore.
If anyone is looking for a workaround: We enabled salt's reactor with the example from the docs adapted slightly.
root@master ~
# cat /etc/salt/master.d/reactor.conf
reactor:
- 'salt/minion/*/start': # Match tag "salt/minion/*/start"
- /srv/reactor/start.sls # Things to do when a minion starts
root@master ~
# cat /srv/reactor/start.sls
highstate_run:
local.state.apply:
- tgt: {{ data['id'] }}
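To check that the reactor actually fires, you can watch the event bus on the master while restarting a minion (salt-run state.event is a standard runner; pretty=True just formats the output):
root@master ~
# salt-run state.event pretty=True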
Thanks @transistortim,
this also worked for us.
Nice workaround.
Can confirm that this also happens when using startup_states: 'sls'. I'm using 3007.1.
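For reference, the sls variant of the option looks like this in the minion config (the state names are placeholders; sls_list is the companion option listing which states to apply):
startup_states: sls
sls_list:
  - edit.vim
  - hyper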
Confirmed here too. In our case this means that new machines are not configured properly, which messes up our deployments.