[BUG] Salt master-initiated jobs no longer work after master upgraded to Fedora 39/Python 3.12

Open limburgher opened this issue 1 year ago • 33 comments

Description Running Fedora 38 master with a Fedora 39 minion and a Debian 11 minion. Everything worked perfectly until I upgraded the master to Fedora 39. No sls or config changes took place.

Setup fails even for test.ping

Please be as specific as possible and give set-up details.

  • [X] on-prem machine
  • [ ] VM (Virtualbox, KVM, etc. please specify)
  • [ ] VM running on a cloud service, please be explicit and add details
  • [ ] container (Kubernetes, Docker, containerd, etc. please specify)
  • [ ] or a combination, please be explicit
  • [ ] jails if it is FreeBSD
  • [X] classic packaging
  • [ ] onedir packaging
  • [ ] used bootstrap to install

Steps to Reproduce the behavior Run state.apply or test.ping from the master against any minion

Expected behavior The action succeeds.

Screenshots N/A

Versions Report

salt --versions-report (Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)
Salt Version:
          Salt: 3006.3
 
Python Version:
        Python: 3.12.0 (main, Oct  2 2023, 00:00:00) [GCC 13.2.1 20230918 (Red Hat 13.2.1-3)]
 
Dependency Versions:
          cffi: 1.15.1
      cherrypy: 18.8.0
      dateutil: 2.8.2
     docker-py: Not Installed
         gitdb: 4.0.10
     gitpython: 3.1.32
        Jinja2: 3.1.2
       libgit2: 1.7.1
  looseversion: 1.3.0
      M2Crypto: Not Installed
          Mako: 1.2.3
       msgpack: 1.0.5
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 23.1
     pycparser: 2.20
      pycrypto: 2.6.1
  pycryptodome: 3.19.0
        pygit2: 1.13.0
  python-gnupg: 0.5.0
        PyYAML: 6.0.1
         PyZMQ: 25.1.0
        relenv: Not Installed
         smmap: 5.0.0
       timelib: Not Installed
       Tornado: 6.3.3
           ZMQ: 4.3.4
 
System Versions:
          dist: fedora 39 
        locale: utf-8
       machine: x86_64
       release: 6.5.5-300.fc39.x86_64
        system: Linux
       version: Fedora Linux 39 

Additional context I've checked firewall ports, tcp connectivity, DNS, and rekeyed the master and minions. Nothing helps.

limburgher avatar Oct 06 '23 15:10 limburgher

Additionally, salt-call works from minions on all sls I've tested.

limburgher avatar Oct 06 '23 15:10 limburgher

One exception: salt-call failed to add an apply job to the scheduler from an sls.

limburgher avatar Oct 06 '23 19:10 limburgher

This is because, I bet:

https://github.com/saltstack/salt/commit/4baea1a97be0389fabe5307d084579134a1f9b7a

this didn't make it into 3006.3. As per my comment on the commit (I've reproduced this on Void Linux), the vendored tornado uses an obsolete check for the Python version. Upstream tornado no longer does.

This first if condition below is the root of the problem. The match_hostname function was deprecated in 3.7, and removed in 3.12.

if hasattr(ssl, 'match_hostname') and hasattr(ssl, 'CertificateError'):  # python 3.2+
    ssl_match_hostname = ssl.match_hostname
    SSLCertificateError = ssl.CertificateError
elif ssl is None:
    ssl_match_hostname = SSLCertificateError = None  # type: ignore
else:
    import backports.ssl_match_hostname
    ssl_match_hostname = backports.ssl_match_hostname.match_hostname
    SSLCertificateError = backports.ssl_match_hostname.CertificateError  # type: ignore
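
A minimal sketch of how that conditional could be restructured so it never falls through to the backports import on Python 3.12. The helper name `resolve_ssl_names` is hypothetical and is not taken from Salt or tornado; the sketch assumes that, when `ssl.match_hostname` is absent, hostname verification is left to `ssl.SSLContext.check_hostname` during the TLS handshake.

```python
import ssl


def resolve_ssl_names(ssl_module=ssl):
    """Return (match_hostname, CertificateError) without needing backports.

    Hypothetical helper: on Python 3.12+ the first element is None, and the
    caller is assumed to rely on SSLContext.check_hostname instead.
    """
    if ssl_module is None:
        # Interpreter built without SSL support.
        return None, None
    match_hostname = getattr(ssl_module, "match_hostname", None)
    certificate_error = getattr(ssl_module, "CertificateError", None)
    return match_hostname, certificate_error


if __name__ == "__main__":
    match, cert_error = resolve_ssl_names()
    print("match_hostname available:", match is not None)
    print("CertificateError:", cert_error)
```

Callers would need to tolerate `match_hostname` being `None`, which is the part the original conditional does not account for.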

Vaelatern avatar Oct 18 '23 05:10 Vaelatern

An error stacktrace for the curious

Traceback (most recent call last):
  File "/usr/bin/salt-call", line 33, in <module>
    sys.exit(load_entry_point('salt==3006.3', 'console_scripts', 'salt-call')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/salt/scripts.py", line 437, in salt_call
    import salt.cli.call
  File "/usr/lib/python3.12/site-packages/salt/cli/call.py", line 3, in <module>
    import salt.cli.caller
  File "/usr/lib/python3.12/site-packages/salt/cli/caller.py", line 12, in <module>
    import salt.channel.client
  File "/usr/lib/python3.12/site-packages/salt/channel/client.py", line 13, in <module>
    import salt.crypt
  File "/usr/lib/python3.12/site-packages/salt/crypt.py", line 26, in <module>
    import salt.payload
  File "/usr/lib/python3.12/site-packages/salt/payload.py", line 12, in <module>
    import salt.loader.context
  File "/usr/lib/python3.12/site-packages/salt/loader/__init__.py", line 23, in <module>
    import salt.utils.event
  File "/usr/lib/python3.12/site-packages/salt/utils/event.py", line 67, in <module>
    import salt.ext.tornado.iostream
  File "/usr/lib/python3.12/site-packages/salt/ext/tornado/iostream.py", line 41, in <module>
    from salt.ext.tornado.netutil import ssl_wrap_socket, ssl_match_hostname, SSLCertificateError, _client_ssl_defaults, _server_ssl_defaults
  File "/usr/lib/python3.12/site-packages/salt/ext/tornado/netutil.py", line 57, in <module>
    import backports.ssl_match_hostname

Note well how using upstream tornado would fix this.
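
To check whether a given interpreter would take this import path, something like the following could be run with the same Python that Salt uses. It is only a diagnostic sketch: the function name `vendored_netutil_import_ok` is made up here, and it mirrors the conditional quoted earlier rather than calling any Salt code.

```python
import importlib.util
import ssl
import sys


def vendored_netutil_import_ok():
    """Report whether the old tornado-style check can resolve match_hostname."""
    has_stdlib = hasattr(ssl, "match_hostname") and hasattr(ssl, "CertificateError")
    try:
        spec = importlib.util.find_spec("backports.ssl_match_hostname")
    except ImportError:
        # The parent "backports" package itself is missing.
        spec = None
    has_backport = spec is not None
    return {
        "python": "%d.%d" % sys.version_info[:2],
        "stdlib_match_hostname": has_stdlib,
        "backport_installed": has_backport,
        # The vendored check succeeds if either source is available.
        "import_ok": has_stdlib or has_backport,
    }


if __name__ == "__main__":
    print(vendored_netutil_import_ok())
```

If `import_ok` is false, the `import backports.ssl_match_hostname` line at the end of the traceback is the one that cannot be satisfied.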

Vaelatern avatar Oct 18 '23 05:10 Vaelatern

We currently carry a patch to help with that: https://src.fedoraproject.org/rpms/salt/blob/rawhide/f/match_hostname.patch

The commit you referenced doesn't apply cleanly to 3006.3, which doesn't surprise me too much as it's rather major.

Do you know when we might expect a release with vendored tornado removed?

limburgher avatar Oct 18 '23 15:10 limburgher

I would expect upstream tornado to land in 3007. I would expect 3007 to land in the next 10 days or so.

Vaelatern avatar Oct 18 '23 17:10 Vaelatern

Ok, good. I'd love a patch that fixed this for 3006.3, but given that timeline, if it is indeed in 3007, I won't lose too much sleep over it.

limburgher avatar Oct 18 '23 17:10 limburgher

I am not affiliated with the salt project, I've only gotten minor patches in before.

Vaelatern avatar Oct 18 '23 17:10 Vaelatern

Has anyone heard anything on the timeline for 3007?

limburgher avatar Nov 14 '23 22:11 limburgher

Nope. I keep reloading https://docs.saltproject.io/salt/install-guide/en/latest/topics/salt-version-support-lifecycle.html hoping to see news.

Vaelatern avatar Nov 15 '23 03:11 Vaelatern

Just in case someone stumbles upon this issue and thinks changing the Fedora 39 salt package's USE_VENDORED_TORNADO = True to USE_VENDORED_TORNADO = False is the obvious workaround: That did not work for me.

ndim avatar Nov 16 '23 21:11 ndim

Just in case someone stumbles upon this issue and thinks changing the Fedora 39 salt package's USE_VENDORED_TORNADO = True to USE_VENDORED_TORNADO = False is the obvious workaround: That did not work for me.

Indeed, the results of that are...bad(tm).

limburgher avatar Nov 16 '23 21:11 limburgher

Salt is not yet ported to support Python 3.12, nor even 3.11 (ran into it being slower than Py 3.10 and have yet to find out why). From the versions report and the classic packaging tickbox, presume you are using the version of Salt provided by Fedora, don't see a Python 3.10.13 in the output which would indicate the OneDir architecture with built-in Python 3.10.

Getting Salt on Python 3.12 is on the list, but small team, and juggling too many balls.
Recent unexpected CVEs for 3005 and 3006 being a bump in the road.

More than happy to take a PR from the community porting Salt to Python 3.12.

dmurphy18 avatar Nov 17 '23 22:11 dmurphy18

Correct. I'm one of the Fedora packagers of salt.

limburgher avatar Nov 17 '23 22:11 limburgher

Yeah, I used to package Salt for Fedora, on the Core Team, and quite understand how Fedora finds the OneDir architecture unacceptable, but got to do something to support RedHat 7 :laughing: .

Trying to get to Python 3.12, but there is just too much going on at the moment, and I'm wondering where things will be once the dust settles. At least we won't have to worry about modules anymore :smiley:

dmurphy18 avatar Nov 17 '23 22:11 dmurphy18

Since 3.12 is now GA and the system Python on Fedora 39, I suppose I could choose a commit just after the tornado change and see if that works...then I could adapt anything additional needed for 3.12 and make a PR.

limburgher avatar Nov 17 '23 22:11 limburgher

If I can find that commit...

limburgher avatar Nov 17 '23 22:11 limburgher

I know that I'm using a version of salt with Py3.12 with only minor patches that I think are all upstreamed. Errors printed? Yes. Works? Also yes.

Vaelatern avatar Nov 18 '23 23:11 Vaelatern

Would you be willing to attach those patches?

limburgher avatar Nov 20 '23 17:11 limburgher

https://github.com/void-linux/void-packages/tree/master/srcpkgs/salt/patches @limburgher

Vaelatern avatar Nov 20 '23 17:11 Vaelatern

Ah, yes, those are some of the patches we carry in Fedora. I see the problems I have with those applied.

limburgher avatar Nov 20 '23 17:11 limburgher

This workaround worked for me (but I cannot guarantee its security) on that Fedora 39 target:

# pip3 install backports.ssl_match_hostname

And this could be a fix from another place in Salt (file salt/utils/thin.py)…

https://github.com/saltstack/salt/blob/1e0c9d71c8ba8fa4f001a44809f012e06fabbdd4/salt/utils/thin.py#L78-L86

…for someone interested in backporting downstream to 3006.4 prior to a new release of Salt.

Related issues:

  • #33470
  • #41020
  • #48802
  • #52453
  • #64304
  • https://bugs.gentoo.org/919160

CC @chutz @limburgher

hartwork avatar Dec 04 '23 17:12 hartwork

I have a working patch for the hostname portion; what I need might be fixed by #64304, but it doesn't apply to 3006.4. Hoping 3007 is soon...

limburgher avatar Dec 07 '23 17:12 limburgher

@limburgher I believe releases are not made from master here (I could be wrong) but are you sure that merged #64304 will even be included in the next release?

hartwork avatar Dec 07 '23 18:12 hartwork

It is merged into the master branch, so it should appear in 3007 RC1 which is planned to be released before the end of the month.

dmurphy18 avatar Dec 07 '23 18:12 dmurphy18

New behaviour in 3006.5:

There is no current event loop in thread 'Thread-2 (_target)'.
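
For context, this message has the same wording that asyncio uses when `asyncio.get_event_loop()` is called in a non-main thread that has no loop set. The sketch below only reproduces that generic condition; it is an assumption that the Salt error comes from the same path, and `probe_thread_loop` is a hypothetical name, not Salt code.

```python
import asyncio
import threading


def probe_thread_loop():
    """Show the asyncio error in a worker thread and one way to avoid it."""
    result = {}

    def target():
        try:
            # No loop has been set for this thread, so this raises.
            asyncio.get_event_loop()
        except RuntimeError as exc:
            result["error"] = str(exc)
        # Creating and registering a loop explicitly avoids the error.
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        try:
            result["loop_ok"] = asyncio.get_event_loop() is loop
        finally:
            asyncio.set_event_loop(None)
            loop.close()

    worker = threading.Thread(target=target, name="worker")
    worker.start()
    worker.join()
    return result


if __name__ == "__main__":
    print(probe_thread_loop())
```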

limburgher avatar Dec 14 '23 15:12 limburgher

@limburgher could you turn that into a new issue with a traceback?

hartwork avatar Dec 14 '23 15:12 hartwork

Salt is not yet ported to support Python 3.12, nor even 3.11 (ran into it being slower than Py 3.10 and have yet to find out why). From the versions report and the classic packaging tickbox, presume you are using the version of Salt provided by Fedora, don't see a Python 3.10.13 in the output which would indicate the OneDir architecture with built-in Python 3.10.

Getting Salt on Python 3.12 is on the list, but small team, and juggling too many balls. Recent unexpected CVEs for 3005 and 3006 being a bump in the road.

I've been without Salt since the upgrade to Fedora 39. I use the Fedora repositories and avoid outside packaging. If the Salt team is so small that it can't keep up with routine Python updates, then I think I need to look for a better-supported replacement. It seems Python deprecation warnings were ignored in older versions until the feature was fully removed in 3.12, e.g. #66042. So unless this is addressed sooner rather than later, I will be looking for a Salt replacement with better support.

mepreston avatar Feb 18 '24 19:02 mepreston

@mepreston The Salt team hasn't produced the Fedora packages since the move to the 'onedir' architecture. Given that Salt is supposed to be community driven, it would be great to get a contribution, with tests, that solves the Python 3.12 support issue you are experiencing.

dmurphy18 avatar Feb 21 '24 17:02 dmurphy18

It looks like the Salt master is going to be dropped from Fedora 39.

DemiMarie avatar May 05 '24 16:05 DemiMarie