[BUG] [3007] Salt-master doesn't start when ssl is enabled
Description
Salt-master doesn't start when SSL is enabled. I'm using the default configuration file with lines 483-486 uncommented. I've tested multiple certificates, both self-signed ones and ones generated with Let's Encrypt.
ssl:
  keyfile: /etc/salt/pki/wildcard.key
  certfile: /etc/salt/pki/wildcard.crt
  ssl_version: PROTOCOL_TLSv1_2
When I start the service with this configuration, I get the error below:
2024-05-22 15:06:30,405 [salt._logging.impl:1085][ERROR ][177436] An un-handled exception was caught by Salt's global exception handler:
TypeError: PublishServer.__init__() got an unexpected keyword argument 'ssl'
Traceback (most recent call last):
  File "/usr/bin/salt-master", line 11, in <module>
    sys.exit(salt_master())
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/scripts.py", line 88, in salt_master
    master.start()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/cli/daemons.py", line 224, in start
    self.master.start()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/master.py", line 814, in start
    chan = salt.channel.server.PubServerChannel.factory(opts)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/channel/server.py", line 748, in factory
    transport = salt.transport.publish_server(opts, **kwargs)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/transport/base.py", line 105, in publish_server
    return salt.transport.zeromq.PublishServer(opts, **kwargs)
TypeError: PublishServer.__init__() got an unexpected keyword argument 'ssl'
I tested the same configuration with Salt 3006.8 and it worked.
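The TypeError in the traceback is the generic Python failure that occurs when a factory forwards keyword arguments to a constructor that does not declare them. A minimal sketch of that pattern follows; `ToyPublishServer`, `toy_publish_server` and `try_start` are hypothetical stand-ins for illustration, not Salt's actual classes or signatures.

```python
class ToyPublishServer:
    """Hypothetical stand-in for a transport class that takes no `ssl` argument."""

    def __init__(self, opts):
        self.opts = opts


def toy_publish_server(opts, **kwargs):
    """Hypothetical factory that forwards every extra keyword unchanged."""
    return ToyPublishServer(opts, **kwargs)


def try_start(opts, **kwargs):
    """Return the TypeError message instead of crashing, or None on success."""
    try:
        toy_publish_server(opts, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # An `ssl` keyword reaches a constructor that does not accept it.
    print(try_start({}, ssl={"certfile": "wildcard.crt"}))
    # Without the extra keyword the same call succeeds.
    print(try_start({}))
```

If this is what happens in 3007, the likely cause is that the `ssl` option is now passed through to a transport whose constructor was never updated to accept or ignore it, but that is an inference from the traceback rather than something confirmed here.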
Setup
New CentOS 9 VM with salt-master 3007 installed (onedir installation) and SSL enabled.
Steps to Reproduce the behavior
Basic CentOS 9 Stream VM with salt-master 3007 installed and the SSL configuration shown above.
Expected behavior
salt-master should start normally with SSL enabled.
Versions Report
[root@salt-master-6 ~]# salt --versions-report
Salt Version:
Salt: 3007.0
Python Version:
Python: 3.10.13 (main, Feb 19 2024, 03:31:20) [GCC 11.2.0]
Dependency Versions:
cffi: 1.16.0
cherrypy: unknown
dateutil: 2.8.2
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 3.1.3
libgit2: Not Installed
looseversion: 1.3.0
M2Crypto: Not Installed
Mako: Not Installed
msgpack: 1.0.7
msgpack-pure: Not Installed
mysql-python: Not Installed
packaging: 23.1
pycparser: 2.21
pycrypto: Not Installed
pycryptodome: 3.19.1
pygit2: Not Installed
python-gnupg: 0.5.2
PyYAML: 6.0.1
PyZMQ: 25.1.2
relenv: 0.15.1
smmap: Not Installed
timelib: 0.3.0
Tornado: 6.3.3
ZMQ: 4.3.4
Salt Package Information:
Package Type: onedir
System Versions:
dist: centos 9
locale: utf-8
machine: x86_64
release: 5.14.0-370.el9.x86_64
system: Linux
version: CentOS Stream 9
We upgraded to 3007.1 and had to comment out the ssl configuration to get the masters to start. That is a problem for us, so we'll be rolling back.
You can work around the above error by specifying SSL settings only for a transport that supports SSL (tcp or websocket). ZeroMQ doesn't use SSL (as far as I'm aware), so if you're only using ZeroMQ, I'm pretty sure the ssl config does nothing.
transport_opts:
  tcp:
    ssl:
      keyfile: /etc/salt/pki/wildcard.key
      certfile: /etc/salt/pki/wildcard.crt
      ssl_version: PROTOCOL_TLSv1_2
However, that just creates a new error...
Traceback (most recent call last):
  File "/opt/saltstack/salt/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/process.py", line 995, in wrapped_run_func
    return run_func()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/master.py", line 1317, in run
    self.__bind()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/master.py", line 1166, in __bind
    req_channel.post_fork(
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/channel/server.py", line 123, in post_fork
    self.transport.post_fork(self.handle_message, io_loop)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/transport/tcp.py", line 560, in post_fork
    ctx = salt.transport.base.ssl_context(self.ssl, server_side=True)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/transport/base.py", line 454, in ssl_context
    context.protocol = ssl_options.get("ssl_version", default_version)
AttributeError: can't set attribute 'protocol'
Turns out context.protocol is a read-only attribute. Was SSL even tested? How does /that/ even happen?
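For reference, the read-only behaviour can be reproduced with the standard library alone. The sketch below also shows `minimum_version` as one way to set a TLS floor on an existing context; that is only a possible alternative, not necessarily how Salt handles `ssl_version`. The helper names `protocol_is_read_only` and `context_with_tls12_floor` are hypothetical.

```python
import ssl


def protocol_is_read_only():
    """Return True if assigning SSLContext.protocol raises AttributeError."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    try:
        # Same kind of assignment as the failing line in the traceback.
        context.protocol = ssl.PROTOCOL_TLS_SERVER
    except AttributeError:
        return True
    return False


def context_with_tls12_floor():
    """Build a server-side context that refuses anything below TLS 1.2."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # minimum_version is a writable attribute, unlike protocol.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


if __name__ == "__main__":
    print(protocol_is_read_only())
    print(context_with_tls12_floor().minimum_version)
```

The protocol of an `SSLContext` is fixed when the context is created, which is why it cannot be changed after `ssl.create_default_context()` has returned.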
I commented out that line, which just causes ssl to use the default protocol, which is fine in 99% of cases. And guess what, I get another error.
Traceback (most recent call last):
  File "/opt/saltstack/salt/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/process.py", line 995, in wrapped_run_func
    return run_func()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/master.py", line 1317, in run
    self.__bind()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/master.py", line 1166, in __bind
    req_channel.post_fork(
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/channel/server.py", line 123, in post_fork
    self.transport.post_fork(self.handle_message, io_loop)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/transport/tcp.py", line 560, in post_fork
    ctx = salt.transport.base.ssl_context(self.ssl, server_side=True)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/transport/base.py", line 459, in ssl_context
    log.error('REQS {0}'.format(ssl_options['cert_reqs']))
KeyError: 'cert_reqs'
Ok fine. I didn't define cert_reqs in the SSL config. That's easy enough.
transport_opts:
  tcp:
    ssl:
      keyfile: /etc/salt/pki/wildcard.key
      certfile: /etc/salt/pki/wildcard.crt
      cert_reqs: CERT_REQUIRED
It'll work this time, right? No...
Traceback (most recent call last):
  File "/opt/saltstack/salt/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/process.py", line 995, in wrapped_run_func
    return run_func()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/master.py", line 1317, in run
    self.__bind()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/master.py", line 1166, in __bind
    req_channel.post_fork(
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/channel/server.py", line 123, in post_fork
    self.transport.post_fork(self.handle_message, io_loop)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/transport/tcp.py", line 560, in post_fork
    ctx = salt.transport.base.ssl_context(self.ssl, server_side=True)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/transport/base.py", line 461, in ssl_context
    if ssl_options["cert_reqs"].upper() == "CERT_NONE":
AttributeError: 'VerifyMode' object has no attribute 'upper'
It seems that at this point in the code cert_reqs has already been translated to the VerifyMode enum and is no longer a string, yet the code still treats it as one. I fixed that with the following patch.
--- base.py.orig 2024-07-25 15:14:49.660670232 -0400
+++ base.py 2024-07-25 15:23:29.345236633 -0400
@@ -451,17 +451,17 @@
     # Use create_default_context to start with what Python considers resonably
     # secure settings.
     context = ssl.create_default_context(purpose)
-    context.protocol = ssl_options.get("ssl_version", default_version)
+    #context.protocol = ssl_options.get("ssl_version", default_version)
     if "certfile" in ssl_options:
         context.load_cert_chain(
             ssl_options["certfile"], ssl_options.get("keyfile", None)
         )
     if "cert_reqs" in ssl_options:
-        if ssl_options["cert_reqs"].upper() == "CERT_NONE":
+        if ssl_options["cert_reqs"] == ssl.VerifyMode.CERT_NONE:
             # This may have been set automatically by PROTOCOL_TLS_CLIENT but is
             # incompatible with CERT_NONE so we must manually clear it.
             context.check_hostname = False
-        context.verify_mode = getattr(ssl.VerifyMode, ssl_options["cert_reqs"])
+        context.verify_mode = ssl_options["cert_reqs"]
     if "ca_certs" in ssl_options:
         context.load_verify_locations(ssl_options["ca_certs"])
     if "verify_locations" in ssl_options:
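The patch above assumes cert_reqs always arrives as a VerifyMode member. A more defensive variant could accept either form. The sketch below is only an illustration: `normalize_cert_reqs` and `apply_cert_reqs` are hypothetical helpers, not part of Salt, and whether the option can still arrive as a string on other code paths is an assumption.

```python
import ssl


def normalize_cert_reqs(value):
    """Return an ssl.VerifyMode for either an enum member or a name string."""
    if isinstance(value, ssl.VerifyMode):
        return value
    # Look the member up by name, e.g. "cert_none" -> VerifyMode.CERT_NONE.
    return ssl.VerifyMode[str(value).upper()]


def apply_cert_reqs(context, value):
    """Set verify_mode on the context, clearing check_hostname for CERT_NONE."""
    mode = normalize_cert_reqs(value)
    if mode == ssl.VerifyMode.CERT_NONE:
        # check_hostname must be off before verify_mode can become CERT_NONE.
        context.check_hostname = False
    context.verify_mode = mode
    return context


if __name__ == "__main__":
    client_ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    apply_cert_reqs(client_ctx, "CERT_NONE")
    print(client_ctx.verify_mode, client_ctx.check_hostname)

    server_ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    apply_cert_reqs(server_ctx, ssl.VerifyMode.CERT_REQUIRED)
    print(server_ctx.verify_mode)
```

Normalizing once at the top keeps the rest of the function working with a single type, so the comparison and the assignment cannot disagree about whether the value is a string or an enum.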
And then finally, as if all my troubles should be rewarded, I get spammed with some PubClient messages. Maybe they don't mean anything, but they didn't happen in 3006.x.
[DEBUG ] PubClient connecting to <salt.transport.tcp.PublishClient object at 0x7f35d6f5c3a0> '/var/run/salt/master/master_event_pub.ipc'
[DEBUG ] tcp stream to closed, unable to recv
[DEBUG ] Subscriber at connected
[DEBUG ] PubClient conencted to <salt.transport.tcp.PublishClient object at 0x7f35d6f5c3a0> '/var/run/salt/master/master_event_pub.ipc'
[DEBUG ] PubClient connecting to <salt.transport.tcp.PublishClient object at 0x7f35d6f5c3a0> '/var/run/salt/master/master_event_pub.ipc'
[DEBUG ] tcp stream to closed, unable to recv
[DEBUG ] Subscriber at connected
[DEBUG ] PubClient conencted to <salt.transport.tcp.PublishClient object at 0x7f35d6f5c3a0> '/var/run/salt/master/master_event_pub.ipc'