salt
salt copied to clipboard
[BUG] Service not restarted despite of watching its service file
Description
- I have a
file.managed
task to deploy a systemd service unit file. - I have a
service.running
task with awatch
for the previous task. - Salt does report changes on the service unit file, however the service is not restarted.
Setup
node_exporter:
pkg.installed:
- pkgs:
- golang-github-prometheus-node-exporter
file.managed:
- name: /etc/systemd/system/node_exporter.service
- source: salt://node_exporter/files/node_exporter.service
service.running:
- enable: true
- watch:
- pkg: node_exporter
- file: node_exporter
Steps to Reproduce the behavior
[root@salt-master ~]# salt salt-master.mydomain.com state.apply node_exporter
salt-master.mydomain.com:
----------
ID: node_exporter
Function: pkg.installed
Result: True
Comment: 1 targeted package was installed/updated.
Started: 16:25:13.246211
Duration: 54466.506 ms
Changes:
----------
golang-github-prometheus-node-exporter:
----------
new:
1.2.2-1.el7
old:
----------
ID: node_exporter
Function: file.managed
Name: /etc/systemd/system/node_exporter.service
Result: True
Comment: File /etc/systemd/system/node_exporter.service updated
Started: 16:26:07.738420
Duration: 559.651 ms
Changes:
----------
diff:
---
+++
@@ -7,7 +7,7 @@
Type=simple
NotifyAccess=all
User=prometheus
-ExecStart=/opt/node_exporter-1.3.1.linux-amd64/node_exporter \
+ExecStart=/usr/sbin/node_exporter \
--collector.textfile.directory=/var/lib/node_exporter
[Install]
----------
ID: node_exporter
Function: service.running
Result: True
Comment: Service node_exporter has been enabled, and is in the desired state
Started: 16:26:08.545219
Duration: 4213.354 ms
Changes:
----------
node_exporter:
True
Summary for salt-master.mydomain.com
------------
Succeeded: 3 (changed=3)
Failed: 0
------------
Total states run: 3
Total run time: 59.721 s
Expected behavior
Since the service.running
task watches the file.managed
one, and the later is reporting changes, I was expecting the service to be restarted.
Versions Report
salt --versions-report
(Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)Salt Version:
Salt: 3005.1
Dependency Versions:
cffi: Not Installed
cherrypy: Not Installed
dateutil: 2.8.2
docker-py: 2.6.1
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 2.11.1
libgit2: Not Installed
M2Crypto: 0.35.2
Mako: Not Installed
msgpack: 0.6.1
msgpack-pure: Not Installed
mysql-python: 1.3.12
pycparser: Not Installed
pycrypto: Not Installed
pycryptodome: Not Installed
pygit2: Not Installed
Python: 3.6.8 (default, Apr 2 2020, 13:34:55)
python-gnupg: Not Installed
PyYAML: 3.13
PyZMQ: 18.0.1
smmap: Not Installed
timelib: Not Installed
Tornado: 4.5.3
ZMQ: 4.1.4
System Versions:
dist: centos 7 Core
locale: UTF-8
machine: x86_64
release: 3.10.0-1127.19.1.el7.x86_64
system: Linux
version: CentOS Linux 7 Core
Additional context
- Previously, Node Exporter was being installed by unpacking its tarball. That's why in the systemd service unit file the
ExecStart
command had/opt/node_exporter-1.3.1.linux-amd64/node_exporter
. - Now, I'm installing Node Exporter using an RPM package, which places the executable file in
/usr/sbin/node_exporter
, hence the changes to the systemd service unit file. - I suspect the problem must be related to the installation of that package. For instance, if I install the package manually so the
pkg.install
task does not do anything, then Salt behaves correctly: it changes the service file and due to thewatch
it restarts the service. See:
[root@salt-master ~]# salt salt-master.mydomain.com state.apply node_exporter
salt-master.mydomain.com:
Name: node_exporter - Function: pkg.installed - Result: Clean Started: - 11:22:23.643942 Duration: 11577.661 ms
----------
ID: node_exporter
Function: file.managed
Name: /etc/systemd/system/node_exporter.service
Result: True
Comment: File /etc/systemd/system/node_exporter.service updated
Started: 11:22:35.240404
Duration: 345.054 ms
Changes:
----------
diff:
---
+++
@@ -7,7 +7,7 @@
Type=simple
NotifyAccess=all
User=prometheus
-ExecStart=/opt/node_exporter-1.3.1.linux-amd64/node_exporter \
+ExecStart=/usr/sbin/node_exporter \
--collector.textfile.directory=/var/lib/node_exporter
[Install]
----------
ID: node_exporter
Function: service.running
Result: True
Comment: Service restarted
Started: 11:22:39.215246
Duration: 8927.417 ms
Changes:
----------
node_exporter:
True
Summary for salt-master.mydomain.com
------------
Succeeded: 3 (changed=2)
Failed: 0
------------
Total states run: 3
Total run time: 21.593 s
I'm not sure if this is relevant, but that RPM package has a bug: it installs a node_exporter.service
file in /etc/systemd/system/
, which clashes with the one that I'm deploying with the file.managed
task. However, my file is not overwritten because rpm
detects the conflict and installs the new file as node_exporter.service.rpmnew
. But I don't know if that can confuse Salt in some way.
Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey. Please be sure to review our Code of Conduct. Also, check out some of our community resources including:
- Community Wiki
- Salt’s Contributor Guide
- Join our Community Slack
- IRC on LiberaChat
- Salt Project YouTube channel
- Salt Project Twitch channel
There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar. If you have additional questions, email us at [email protected]. We’re glad you’ve joined our community and look forward to doing awesome things with you!
N.B. 3004.1 has security vulnerabilities. You should upgrade to 3004.2 or 3005.1
That is odd behaviour. Does it make any difference if you:
- use
- file: /etc/systemd/system/node_exporter.service
? - split the states to multiple ids?
@OrangeDog,
Thanks for your answer. I have tested your suggestions and they didn't make any difference.
use - file: /etc/systemd/system/node_exporter.service?
[root@salt-master ~]$ sudo salt -I role:salt-master state.apply prometheus.node_exporter
salt-master.mydomain.com:
ID: node_exporter
Function: pkg.installed
Result: True
Comment: 1 targeted package was installed/updated.
Started: 10:26:27.263121
Duration: 55846.108 ms
Changes:
----------
golang-github-prometheus-node-exporter:
----------
new:
1.2.2-1.el7
old:
----------
ID: node_exporter
Function: file.managed
Name: /etc/systemd/system/node_exporter.service
Result: True
Comment: File /etc/systemd/system/node_exporter.service updated
Started: 10:27:23.125982
Duration: 578.162 ms
Changes:
----------
diff:
---
+++
@@ -7,7 +7,7 @@
Type=simple
NotifyAccess=all
User=prometheus
-ExecStart=/opt/node_exporter-1.3.1.linux-amd64/node_exporter \
+ExecStart=/usr/sbin/node_exporter \
--collector.textfile.directory=/var/lib/node_exporter
[Install]
----------
ID: node_exporter
Function: service.running
Result: True
Comment: Service node_exporter has been enabled, and is in the desired state
Started: 10:27:23.913031
Duration: 3923.308 ms
Changes:
----------
node_exporter:
True
Summary for salt-master.mydomain.com
------------
Succeeded: 3 (changed=3)
Failed: 0
------------
Total states run: 3
Total run time: 60.746 s
[root@salt-master ~]$ grep ExecStart /etc/systemd/system/node_exporter.service
ExecStart=/usr/sbin/node_exporter \
[root@salt-master ~]$ systemctl status node_exporter
● node_exporter.service - Daemon for Prometheus node_exporter
Loaded: loaded (/etc/systemd/system/node_exporter.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2022-11-07 10:22:56 JST; 5min ago
Main PID: 160295 (node_exporter)
CGroup: /system.slice/node_exporter.service
└─160295 /opt/node_exporter-1.3.1.linux-amd64/node_exporter --collector.textfile.directory=/var/lib/node_exporter
[root@salt-master ~]$
split the states to multiple ids?
[root@salt-master ~]$ sudo salt -I role:salt-master state.apply prometheus.node_exporter
salt-master.mydomain.com:
ID: node-exporter-package
Function: pkg.installed
Result: True
Comment: 1 targeted package was installed/updated.
Started: 10:41:47.141779
Duration: 51619.591 ms
Changes:
----------
golang-github-prometheus-node-exporter:
----------
new:
1.2.2-1.el7
old:
----------
ID: node-exporter-service-file
Function: file.managed
Name: /etc/systemd/system/node_exporter.service
Result: True
Comment: File /etc/systemd/system/node_exporter.service updated
Started: 10:42:38.773906
Duration: 302.168 ms
Changes:
----------
diff:
---
+++
@@ -7,7 +7,7 @@
Type=simple
NotifyAccess=all
User=prometheus
-ExecStart=/opt/node_exporter-1.3.1.linux-amd64/node_exporter \
+ExecStart=/usr/sbin/node_exporter \
--collector.textfile.directory=/var/lib/node_exporter
[Install]
----------
ID: node-exporter-service
Function: service.running
Name: node_exporter
Result: True
Comment: Service node_exporter has been enabled, and is in the desired state
Started: 10:42:39.188802
Duration: 3908.323 ms
Changes:
----------
node_exporter:
True
Summary for salt-master.mydomain.com
------------
Succeeded: 3 (changed=3)
Failed: 0
------------
Total states run: 3
Total run time: 56.052 s
[root@salt-master ~]$ grep ExecStart /etc/systemd/system/node_exporter.service
ExecStart=/usr/sbin/node_exporter \
[root@salt-master ~]$ systemctl status node_exporter
● node_exporter.service - Daemon for Prometheus node_exporter
Loaded: loaded (/etc/systemd/system/node_exporter.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2022-11-07 10:36:23 JST; 6min ago
Main PID: 1782 (node_exporter)
CGroup: /system.slice/node_exporter.service
└─1782 /opt/node_exporter-1.3.1.linux-amd64/node_exporter --collector.textfile.directory=/var/lib/node_exporter
[root@salt-master ~]$
N.B. 3004.1 has security vulnerabilities. You should upgrade to 3004.2 or 3005.1
I have updated to 3005.1 and it didn't make any difference. The behavior is exactly the same.
I have debugged this a bit.
In both test mode and normal mode the call to _check_for_unit_changes is successful and systemctl_reload()
is performed. So, Salt is successfully detecting there have been changes in the service unit file.
When running in normal mode, the mod_watch function (which would trigger the service restart) is never called. Surprisingly, when running in test mode, mod_watch
is called.
I think this is because the _enable function is called in this way...
https://github.com/saltstack/salt/blob/v3005.1/salt/states/service.py#L508
...given that before_toggle_status
is True
and before_toggle_enable_status
is False
.
When the _enable
function is called in that way, the code flow falls into this block of code:
https://github.com/saltstack/salt/blob/v3005.1/salt/states/service.py#L156-L160
I hope this helps a Salt developer to further debug this issue.
I don't think this is a bug, the salt documentation mentions the following
If the watching state changes key contains values, then mod_watch will not be called. If you're using watch or watch_in then it's a good idea to have a state that only enforces one attribute - such as splitting out service.running into its own state and have service.enabled in another.
Your output indicates that the service.running
state enabled the service, thus watch is not called by design.
Can confirm, watch gets ignored if we ask for enabled:true and the service is disabled before the state run.
This should be fixed in 3006.9, can you verify if that the bug is resolved now?