salt
[BUG] Without use_master_when_local=True minion ignores master parameter
In this config:
file_client: local
master: 1.2.3.4
use_master_when_local: False
the minion tries to connect to 127.0.0.1:
21:19:31 - salt.minion:245 - DEBUG - Master URI: tcp://127.0.0.1:4506
But if I set use_master_when_local: True, the minion connects to the master:
21:22:58 - salt.transport.zeromq:258 - DEBUG - Connecting the Minion to the Master URI (for the return server): tcp://1.2.3.4:4506
21:22:58 - salt.transport.zeromq:1300 - DEBUG - Trying to connect to: tcp://1.2.3.4:4506
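The two log excerpts above can be modelled with a short Python sketch. It only mirrors the behaviour observed in this report and is not taken from Salt's source; effective_master_uri is a hypothetical helper name, and the port is the 4506 shown in the logs.

```python
def effective_master_uri(opts, port=4506):
    """Model of the observed behaviour; hypothetical, not Salt's code."""
    # With file_client: local and use_master_when_local unset or False,
    # the configured master is ignored and localhost is used instead.
    local_only = (
        opts.get("file_client", "remote") == "local"
        and not opts.get("use_master_when_local", False)
    )
    host = "127.0.0.1" if local_only else opts["master"]
    return "tcp://{}:{}".format(host, port)


base = {"file_client": "local", "master": "1.2.3.4"}
print(effective_master_uri(dict(base, use_master_when_local=False)))
print(effective_master_uri(dict(base, use_master_when_local=True)))
```

The first call corresponds to the config in this report, the second to the same config with use_master_when_local: True.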
Moreover, setting use_master_when_local: True together with master_type: disabled always produces an error:
root@devmax:/srv/salt# salt-minion
21:53:35 - salt.minion:542 - WARNING - Master is set to disable, skipping connection
21:53:39 - salt.minion:1949 - WARNING - The minion function caused an exception
Exception ignored in: <bound method AsyncZeroMQReqChannel.__del__ of <salt.transport.zeromq.AsyncZeroMQReqChannel object at 0x7fdb287b1a90>>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/salt/transport/zeromq.py", line 303, in __del__
with self._refcount_lock:
AttributeError: 'AsyncZeroMQReqChannel' object has no attribute '_refcount_lock'
Exception ignored in: <bound method RemotePillar.__del__ of <salt.pillar.RemotePillar object at 0x7fdb28690a90>>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/salt/pillar/__init__.py", line 358, in __del__
self.destroy()
File "/usr/local/lib/python3.6/dist-packages/salt/pillar/__init__.py", line 354, in destroy
self.channel.close()
AttributeError: 'RemotePillar' object has no attribute 'channel'
As for the config:
file_client: local
master: 1.2.3.4
use_master_when_local: False
This is confirmed.
When we set use_master_when_local: True, it tries to connect to the master set in the config, as expected.
When setting use_master_when_local: True and master_type: disabled, we get:
[ERROR ] Invalid keyword 'disabled' for variable 'master_type'
When setting use_master_when_local: True and master_type: disable, the salt-minion starts:
/testing # salt-minion
[WARNING ] Error loading grains, unexpected linux_gpu_data output, check that you have a valid shell configured and permissions to run lspci command
[WARNING ] Master is set to disable, skipping connection
[WARNING ] Error loading grains, unexpected linux_gpu_data output, check that you have a valid shell configured and permissions to run lspci command
However, calling salt-call produces a similar error (with or without --local):
/testing # salt-call test.ping
[WARNING ] Error loading grains, unexpected linux_gpu_data output, check that you have a valid shell configured and permissions to run lspci command
[WARNING ] Master is set to disable, skipping connection
[ERROR ] An un-handled exception was caught by salt's global exception handler:
KeyError: 'master_uri'
Traceback (most recent call last):
File "/usr/local/bin/salt-call", line 8, in <module>
sys.exit(salt_call())
File "/usr/local/lib/python3.7/site-packages/salt/scripts.py", line 472, in salt_call
client.run()
File "/usr/local/lib/python3.7/site-packages/salt/cli/call.py", line 48, in run
caller = salt.cli.caller.Caller.factory(self.config)
File "/usr/local/lib/python3.7/site-packages/salt/cli/caller.py", line 64, in factory
return ZeroMQCaller(opts, **kwargs)
File "/usr/local/lib/python3.7/site-packages/salt/cli/caller.py", line 329, in __init__
super(ZeroMQCaller, self).__init__(opts)
File "/usr/local/lib/python3.7/site-packages/salt/cli/caller.py", line 89, in __init__
self.minion = salt.minion.SMinion(opts)
File "/usr/local/lib/python3.7/site-packages/salt/minion.py", line 922, in __init__
self.gen_modules(initial_load=True, context=context or {})
File "/usr/local/lib/python3.7/site-packages/salt/minion.py", line 456, in gen_modules
pillarenv=self.opts.get("pillarenv"),
File "/usr/local/lib/python3.7/site-packages/salt/pillar/__init__.py", line 101, in get_pillar
extra_minion_data=extra_minion_data,
File "/usr/local/lib/python3.7/site-packages/salt/pillar/__init__.py", line 301, in __init__
self.channel = salt.transport.client.ReqChannel.factory(opts)
File "/usr/local/lib/python3.7/site-packages/salt/transport/client.py", line 28, in factory
AsyncReqChannel.factory, (opts,), kwargs, loop_kwarg="io_loop",
File "/usr/local/lib/python3.7/site-packages/salt/utils/asynchronous.py", line 70, in __init__
self.obj = cls(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/salt/transport/client.py", line 133, in factory
return salt.transport.zeromq.AsyncZeroMQReqChannel(opts, **kwargs)
File "/usr/local/lib/python3.7/site-packages/salt/transport/zeromq.py", line 178, in __new__
obj.__singleton_init__(opts, **kwargs)
File "/usr/local/lib/python3.7/site-packages/salt/transport/zeromq.py", line 255, in __singleton_init__
self.auth = salt.crypt.AsyncAuth(self.opts, io_loop=self._io_loop)
File "/usr/local/lib/python3.7/site-packages/salt/crypt.py", line 491, in __new__
key = cls.__key(opts)
File "/usr/local/lib/python3.7/site-packages/salt/crypt.py", line 510, in __key
opts["master_uri"], # master ID
KeyError: 'master_uri'
Traceback (most recent call last):
File "/usr/local/bin/salt-call", line 8, in <module>
sys.exit(salt_call())
File "/usr/local/lib/python3.7/site-packages/salt/scripts.py", line 472, in salt_call
client.run()
File "/usr/local/lib/python3.7/site-packages/salt/cli/call.py", line 48, in run
caller = salt.cli.caller.Caller.factory(self.config)
File "/usr/local/lib/python3.7/site-packages/salt/cli/caller.py", line 64, in factory
return ZeroMQCaller(opts, **kwargs)
File "/usr/local/lib/python3.7/site-packages/salt/cli/caller.py", line 329, in __init__
super(ZeroMQCaller, self).__init__(opts)
File "/usr/local/lib/python3.7/site-packages/salt/cli/caller.py", line 89, in __init__
self.minion = salt.minion.SMinion(opts)
File "/usr/local/lib/python3.7/site-packages/salt/minion.py", line 922, in __init__
self.gen_modules(initial_load=True, context=context or {})
File "/usr/local/lib/python3.7/site-packages/salt/minion.py", line 456, in gen_modules
pillarenv=self.opts.get("pillarenv"),
File "/usr/local/lib/python3.7/site-packages/salt/pillar/__init__.py", line 101, in get_pillar
extra_minion_data=extra_minion_data,
File "/usr/local/lib/python3.7/site-packages/salt/pillar/__init__.py", line 301, in __init__
self.channel = salt.transport.client.ReqChannel.factory(opts)
File "/usr/local/lib/python3.7/site-packages/salt/transport/client.py", line 28, in factory
AsyncReqChannel.factory, (opts,), kwargs, loop_kwarg="io_loop",
File "/usr/local/lib/python3.7/site-packages/salt/utils/asynchronous.py", line 70, in __init__
self.obj = cls(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/salt/transport/client.py", line 133, in factory
return salt.transport.zeromq.AsyncZeroMQReqChannel(opts, **kwargs)
File "/usr/local/lib/python3.7/site-packages/salt/transport/zeromq.py", line 178, in __new__
obj.__singleton_init__(opts, **kwargs)
File "/usr/local/lib/python3.7/site-packages/salt/transport/zeromq.py", line 255, in __singleton_init__
self.auth = salt.crypt.AsyncAuth(self.opts, io_loop=self._io_loop)
File "/usr/local/lib/python3.7/site-packages/salt/crypt.py", line 491, in __new__
key = cls.__key(opts)
File "/usr/local/lib/python3.7/site-packages/salt/crypt.py", line 510, in __key
opts["master_uri"], # master ID
KeyError: 'master_uri'
Exception ignored in: <function AsyncZeroMQReqChannel.__del__ at 0x7f38dde099e0>
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/salt/transport/zeromq.py", line 303, in __del__
with self._refcount_lock:
AttributeError: 'AsyncZeroMQReqChannel' object has no attribute '_refcount_lock'
Exception ignored in: <function RemotePillar.__del__ at 0x7f38de0da290>
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/salt/pillar/__init__.py", line 358, in __del__
self.destroy()
File "/usr/local/lib/python3.7/site-packages/salt/pillar/__init__.py", line 354, in destroy
self.channel.close()
AttributeError: 'RemotePillar' object has no attribute 'channel'
@litnimax Great talk at SaltConf! Glancing at this, I wonder if changing opts["master_uri"] (salt/crypt.py, line 510, in __key, where the KeyError: 'master_uri' is raised) to opts.get('master_uri') would help.
Though I expect the answer is "no"; this is probably a much larger issue 🤔
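For illustration, a minimal Python sketch of what that change would and would not do. The auth_key helper and the shape of the tuple are assumptions made for this example (only the opts["master_uri"] lookup appears in the traceback), and the pki_dir and id values are placeholders.

```python
def auth_key(opts):
    """Hypothetical stand-in for the key built in crypt.py; not Salt's code."""
    return (
        opts.get("pki_dir"),
        opts.get("id"),
        opts.get("master_uri"),  # yields None instead of raising KeyError
    )


# Placeholder options with no "master_uri", as in the traceback above.
opts = {"pki_dir": "/etc/salt/pki/minion", "id": "devmax", "master_type": "disable"}

try:
    opts["master_uri"]
except KeyError as exc:
    print("subscript access raises KeyError:", exc)

print(auth_key(opts))
```

The lookup no longer raises, but a None master URI is still handed on to whatever uses the key, which fits the suspicion that the real problem is larger.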
Thanks!!! I will test
The Core team won't be able to get to this in Aluminium, but we will review any PRs submitted.
Just ran into this, but in an even worse way that killed my dev server :(
With this config:
# Masterless Minion
master_type: disable
file_client: local
file_roots:
  base:
    - /srv/salt/state
pillar_roots:
  base:
    - /srv/salt/pillar
fileserver_backend:
  - roots
metadata_server_grains: True
And then I (accidentally) forgot to disable the salt-minion systemd service while using my masterless minion. My /var/log/salt/minion file quickly filled up my ~30GB root disk in about 2.5 hours.
# ls -lh /var/log/salt/minion
-rw-r----- 1 root root 29G Sep 20 18:44 /var/log/salt/minion
Repeated ad infinitum:
2022-09-20 16:16:36,397 [salt.minion :534 ][WARNING ][627] Master is set to disable, skipping connection
2022-09-20 16:16:36,397 [salt.minion :1160][CRITICAL][627] Unexpected error while connecting to salt
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/salt/minion.py", line 1134, in _connect_minion
yield minion.connect_master(failed=failed)
File "/usr/lib/python3/dist-packages/salt/ext/tornado/gen.py", line 1056, in run
value = future.result()
File "/usr/lib/python3/dist-packages/salt/ext/tornado/concurrent.py", line 249, in result
raise_exc_info(self._exc_info)
File "<string>", line 4, in raise_exc_info
File "/usr/lib/python3/dist-packages/salt/ext/tornado/gen.py", line 1070, in run
yielded = self.gen.send(value)
File "/usr/lib/python3/dist-packages/salt/minion.py", line 1365, in connect_master
self.req_channel = salt.transport.client.AsyncReqChannel.factory(
File "/usr/lib/python3/dist-packages/salt/transport/client.py", line 83, in factory
return salt.channel.client.AsyncReqChannel.factory(opts, **kwargs)
File "/usr/lib/python3/dist-packages/salt/channel/client.py", line 127, in factory
auth = salt.crypt.AsyncAuth(opts, io_loop=io_loop)
File "/usr/lib/python3/dist-packages/salt/crypt.py", line 506, in __new__
key = cls.__key(opts)
File "/usr/lib/python3/dist-packages/salt/crypt.py", line 525, in __key
opts["master_uri"], # master ID
KeyError: 'master_uri'
~20 million exceptions between 2022-09-20 16:16:36,385 (first log entry) and 2022-09-20 18:44:12,741 (last log entry), which is about 2.5 hours.
# grep 'opts\["master_uri"\]' /var/log/salt/minion | wc -l
20623786
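To put that count in perspective, here is a small Python calculation of the average rate, using only the two timestamps and the grep count quoted above. It works out to roughly 2,300 tracebacks per second.

```python
from datetime import datetime

# Timestamp format used in /var/log/salt/minion, e.g. "2022-09-20 16:16:36,385"
FMT = "%Y-%m-%d %H:%M:%S,%f"

first = datetime.strptime("2022-09-20 16:16:36,385", FMT)
last = datetime.strptime("2022-09-20 18:44:12,741", FMT)
matches = 20623786  # output of the grep | wc -l above

elapsed = (last - first).total_seconds()
rate = matches / elapsed
print(f"{elapsed:.0f} s elapsed, about {rate:.0f} tracebacks per second")
```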
This works:
diff --git a/salt/minion.py b/salt/minion.py
index cecc4f4adf..40a1584f06 100644
--- a/salt/minion.py
+++ b/salt/minion.py
@@ -533,7 +533,7 @@ class MinionBase:
         if opts["master_type"] == "disable":
             log.warning("Master is set to disable, skipping connection")
             self.connected = False
-            raise salt.ext.tornado.gen.Return((None, None))
+            raise SaltSystemExit(1, "Master Connection Disabled")
         # Run masters discovery over SSDP. This may modify the whole configuration,
         # depending of the networking and sets of masters.
As far as I can figure, master_type: disable was never actually tested, at least not thoroughly. The previous line, raise salt.ext.tornado.gen.Return((None, None)), does nothing to stop the minion: it only returns a value from the coroutine, and the caller lives inside an infinite retry loop.
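A stripped-down Python sketch of that failure mode, assuming the flow is as described above. The three functions are hypothetical stand-ins rather than Salt's real code, and the retry loop is bounded here so that the example terminates.

```python
def eval_master(opts):
    """Stand-in for the early return taken when master_type is 'disable'."""
    if opts.get("master_type") == "disable":
        # Returning here only hands control back to the caller; it does
        # not stop the caller, and "master_uri" is never set.
        return None, None
    opts["master_uri"] = "tcp://{}:4506".format(opts["master"])
    return opts["master"], object()


def connect_master(opts):
    """The caller carries on after eval_master() and needs master_uri."""
    eval_master(opts)
    return opts["master_uri"]  # KeyError when the master is disabled


def connect_with_retries(opts, max_attempts=3):
    """Bounded retry loop; an unbounded one would log a traceback forever."""
    failures = 0
    for _ in range(max_attempts):
        try:
            return connect_master(opts), failures
        except KeyError:
            failures += 1
    return None, failures


print(connect_with_retries({"master_type": "disable"}))
```

Every attempt fails in the same way, so without a bound, or an exit like the one in the patch, the loop never makes progress.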
I'm looking a little bit more to see what test cases might look like for a PR, but I'm afraid I might have to build out a non-trivial amount of code to test if any Salt daemon should self-exit, given a certain configuration (such as this).
This patch also may not be viable for a scenario such as:
- A master is configured
- The minion has a valid connection (approved keys) to the master
- A user wants to use salt-call with a temporary configuration of master_type: disable, via a Saltfile for example (instead of, for some reason, using --local)
I had a similar issue, except I was not using use_master_when_local at all. On top of that, the error message was misleading, and I spent hours and days troubleshooting this.
My config:
---
master: salt-master
file_client: local
...
Salt was saying
Error while bringing up minion for multi-master. Is master at salt-master responding?
So I spent ages troubleshooting network, permissions, SELinux, etc etc.
Only when looking in the DEBUG logs did I see that salt was actually trying to connect to
Master URI: tcp://127.0.0.1:4506
The same happened when I put a direct IP in for the master address. I then discovered that the file_client: key was causing the Salt minion to look for the master locally.
Hopefully this can be fixed, or at least the log output, so that future troubleshooters are not as misled as I was.