salt icon indicating copy to clipboard operation
salt copied to clipboard

[BUG] gitfs fails to work after upgraded to 3007.4

Open hailong1004 opened this issue 6 months ago • 14 comments

Description gitfs fails to work after upgraded to 3007.4

Setup (Please provide relevant configs and/or SLS files (be sure to remove sensitive info. There is no general set-up of Salt.)

Please be as specific as possible and give set-up details.

  • [ ] on-prem machine
  • [ ] VM (Virtualbox, KVM, etc. please specify)
  • [ ] VM running on a cloud service, please be explicit and add details
  • [ ] container (Kubernetes, Docker, containerd, etc. please specify)
  • [ ] or a combination, please be explicit
  • [ ] jails if it is FreeBSD
  • [ ] classic packaging
  • [ ] onedir packaging
  • [ ] used bootstrap to install

Steps to Reproduce the behavior salt-master and salt-minion is 3007.3, gitfs work normal. only when I upgrade salt-master and salt-minion from 3007.3 to 3007.4, when minion run salt-call state.sls system-base, get error:

local:
    Data failed to compile:
----------
    No matching sls found for 'system-base' in env 'base'

master error log:

2025-06-13 03:57:51,359 [salt.master      :2109][ERROR   ][232] Error in function _file_hash:
Traceback (most recent call last):
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/master.py", line 2103, in run_func
    ret = getattr(self, func)(load)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/fileserver/__init__.py", line 650, in file_hash
    return self.__file_hash_and_stat(load)[0]
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/fileserver/__init__.py", line 634, in __file_hash_and_stat
    fnd = self.find_file(
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/fileserver/__init__.py", line 592, in find_file
    fnd = self.servers[fstr](path, saltenv, **kwargs)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 160, in __call__
    ret = self.loader.run(run_func, *args, **kwargs)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1269, in run
    return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1284, in _run_as
    ret = _func_or_method(*args, **kwargs)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/fileserver/gitfs.py", line 160, in find_file
    return _gitfs().find_file(path, tgt_env=tgt_env, **kwargs)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/gitfs.py", line 3286, in find_file
    dest = salt.utils.verify.clean_join(self.cache_root, "refs", tgt_env, path)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/verify.py", line 567, in clean_join
    raise SaltValidationError(f"Invalid path: {path!r}")
salt.exceptions.SaltValidationError: Invalid path: 'system-base/init.sls'
2025-06-13 03:57:51,368 [salt.master      :2109][ERROR   ][241] Error in function _file_find:
Traceback (most recent call last):
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/master.py", line 2103, in run_func
    ret = getattr(self, func)(load)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/fileserver/__init__.py", line 540, in _find_file
    return self.find_file(path, tgt_env)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/fileserver/__init__.py", line 592, in find_file
    fnd = self.servers[fstr](path, saltenv, **kwargs)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 160, in __call__
    ret = self.loader.run(run_func, *args, **kwargs)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1269, in run
    return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1284, in _run_as
    ret = _func_or_method(*args, **kwargs)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/fileserver/gitfs.py", line 160, in find_file
    return _gitfs().find_file(path, tgt_env=tgt_env, **kwargs)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/gitfs.py", line 3286, in find_file
    dest = salt.utils.verify.clean_join(self.cache_root, "refs", tgt_env, path)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/verify.py", line 567, in clean_join
    raise SaltValidationError(f"Invalid path: {path!r}")
salt.exceptions.SaltValidationError: Invalid path: 'system-base/init.sls'

Expected behavior A clear and concise description of what you expected to happen.

Screenshots If applicable, add screenshots to help explain your problem.

Versions Report

salt --versions-report (Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)
Salt Version:
          Salt: 3007.4

Python Version:
        Python: 3.10.17 (main, Jun  9 2025, 20:41:48) [GCC 11.2.0]

Dependency Versions:
          cffi: 1.16.0
      cherrypy: unknown
  cryptography: 42.0.5
      dateutil: 2.8.2
     docker-py: 1.10.6
         gitdb: 4.0.12
     gitpython: Not Installed
        Jinja2: 3.1.6
       libgit2: 1.7.2
  looseversion: 1.3.0
      M2Crypto: 0.45.1
          Mako: 1.3.10
       msgpack: 1.0.7
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 24.0
     pycparser: 2.21
      pycrypto: 2.6.1
  pycryptodome: 3.19.1
        pygit2: 1.14.1
  python-gnupg: 0.5.2
        PyYAML: 6.0.1
         PyZMQ: 25.1.2
        relenv: 0.19.3
         smmap: 5.0.2
       timelib: 0.3.0
       Tornado: 6.4.2
           ZMQ: 4.3.4

Salt Package Information:
  Package Type: onedir

System Versions:
          dist: ubuntu 22.04.5 jammy
        locale: utf-8
       machine: x86_64
       release: 5.15.0-122-generic
        system: Linux
       version: Ubuntu 22.04.5 jammy

Additional context Add any other context about the problem here.

hailong1004 avatar Jun 13 '25 04:06 hailong1004

Also encountering this; possibly related to f7c28ffbf18dbf693a15b1ba9493918de3e88cf3?

wyattanderson avatar Jun 13 '25 04:06 wyattanderson

Within salt/utils/gitfs.py find_file(), reverting from salt.utils.verify.clean_join to salt.utils.path.join fixes this this for me on 3007.4.

Regression does appear to have been introduced by https://github.com/saltstack/salt/commit/f7c28ffbf18dbf693a15b1ba9493918de3e88cf3

verify.clean_join appears to check that the file already exists in the cache directory before it is retrieved on-demand by gitfs?

rmounce avatar Jun 13 '25 05:06 rmounce

Same regression in 3006.12. Do you seriously still not do any QA for gitfs?

OrangeDog avatar Jun 13 '25 08:06 OrangeDog

Also experiencing this on 3007.4. Downgrading the master to 3007.3 does fix the issue, but it seems the 3007.4 minions are entirely incompatible with the 3007.3 master. Is there any viable workaround other than downgrading the master and all minions?

thomas-pike avatar Jun 13 '25 09:06 thomas-pike

Because this release is not backwards compatible if minions already have been upgraded, everything broke in my setup with no clean downgrade path not involving a lot of manual work, so I manually edited /opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/gitfs.py replacing verify.clean_join with path.join in find_file function of GitFS class (starts at line 3276) on my salt master and restarted the salt master. This file will be overwritten on the next upgrade, but hopefully this includes a fix.

hkbakke avatar Jun 13 '25 10:06 hkbakke

Same regression in 3006.12. Do you seriously still not do any QA for gitfs?

We rely on our test suite. Every bug gets a regression test. So, when these issues get fixed there will be tests added to cover the cases.

dwoz avatar Jun 13 '25 21:06 dwoz

not sure if this is related... just tossing a note in here on a HACK i did as a workaround for _pygit2.GitError: error loading known_hosts:

/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/gitfs.py
def _fetch(self):
  ... os.environ['HOME']='/opt/saltstack/salt'

this was needed because it seems that when the salt-master changes to the salt user the HOME env var didn't make it's way to pygit2 so it could find /opt/saltstack/salt/.ssh/known_hosts

after that I got the raise SaltValidationError(f"Invalid path: {path!r}") issue and reverting from salt.utils.verify.clean_join to salt.utils.path.join got me up and running.

steverweber avatar Jun 14 '25 04:06 steverweber

Will this fix also works for gitfs and https as repo?

teankie avatar Jun 14 '25 12:06 teankie

Will this fix also works for gitfs and https as repo?

Yes, I was using HTTPS when I found that workaround.

rmounce avatar Jun 14 '25 12:06 rmounce

quick fix for people:

Note. only for v3006.x servers.

cd /opt/saltstack/salt/lib/python3.10/site-packages/salt/utils
mv gitfs.py gitfs.py.orig
wget https://raw.githubusercontent.com/saltstack/salt/1d4c12ec3b61e717653556c3aa25f0acfd4ad167/salt/utils/gitfs.py
service salt-master restart

Note. only for v3007.x servers.

cd /opt/saltstack/salt/lib/python3.10/site-packages/salt/utils
mv gitfs.py gitfs.py.orig
wget https://raw.githubusercontent.com/saltstack/salt/c695e0bcff8e19787c37d569d2d1afa375565f3f/salt/utils/gitfs.py
service salt-master restart

ITJamie avatar Jun 16 '25 09:06 ITJamie

Same regression in 3006.12. Do you seriously still not do any QA for gitfs?

We rely on our test suite. Every bug gets a regression test. So, when these issues get fixed there will be tests added to cover >the cases.

So that's a "no" then. Individual unit regression tests are clearly insufficient, as this keeps happening. At minimum, whoever is working on fixing security issues in a subsystem should actually run Salt at some point and not just blindly commit stuff.

I do not blindly commit stuff, nor do I make changes of any kind security related or otherwise without testing the changes.


@dwoz are you aware you edited my comment rather than replying to it?

I fail to see how you could have failed to notice that change made gitfs entirely non-functional if you did in fact test it.

But it is not the fault of one person - it is the fault of the entire team and the system that allowed this sort of bug to be released, again and again and again.

OrangeDog avatar Jun 16 '25 10:06 OrangeDog

quick fix for people:

cd /opt/saltstack/salt/lib/python3.10/site-packages/salt/utils
mv gitfs.py gitfs.py.orig
wget https://raw.githubusercontent.com/saltstack/salt/1d4c12ec3b61e717653556c3aa25f0acfd4ad167/salt/utils/gitfs.py
service salt-master restart

This doesn't work for me, I get this error:

salt-master[613794]:   File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/gitfs.py", line 25, in <module>
salt-master[613794]:     import salt.ext.tornado.ioloop
salt-master[613794]: ModuleNotFoundError: No module named 'salt.ext.tornado'

While solution from @hkbakke works fine.

--- gitfs.py.orig	2025-06-16 10:20:26.885799211 +0000
+++ gitfs.py	2025-06-16 10:21:50.825308196 +0000
@@ -3283,14 +3283,14 @@
             return fnd

         # dest = salt.utils.path.join(self.cache_root, "refs", tgt_env, path)
-        dest = salt.utils.verify.clean_join(self.cache_root, "refs", tgt_env, path)
-        hashes_glob = salt.utils.verify.clean_join(
+        dest = salt.utils.path.join(self.cache_root, "refs", tgt_env, path)
+        hashes_glob = salt.utils.path.join(
             self.hash_cachedir, tgt_env, f"{path}.hash.*"
         )
-        blobshadest = salt.utils.verify.clean_join(
+        blobshadest = salt.utils.path.join(
             self.hash_cachedir, tgt_env, f"{path}.hash.blob_sha1"
         )
-        lk_fn = salt.utils.verify.clean_join(self.hash_cachedir, tgt_env, f"{path}.lk")
+        lk_fn = salt.utils.path.join(self.hash_cachedir, tgt_env, f"{path}.lk")
         destdir = os.path.dirname(dest)
         hashdir = os.path.dirname(blobshadest)
         if not os.path.isdir(destdir):

mdzidic avatar Jun 16 '25 10:06 mdzidic

Because they gave a URL for a specific commit. Instead, use the tag of the previous version of the branch that you are using.

OrangeDog avatar Jun 16 '25 10:06 OrangeDog

ived updated my comment. its only for people on 3006 master. would need a different url for the 3007 users

ITJamie avatar Jun 16 '25 10:06 ITJamie

same issue. fixed by reverting to salt.utils.path.join

kritzi-at avatar Jun 17 '25 21:06 kritzi-at

Using @dwos patch to allow subdirs fixed the errors for me

djivey avatar Jun 18 '25 19:06 djivey

I fail to see how you could have failed to notice that change made gitfs entirely non-functional if you did in fact test it.

But it is not the fault of one person - it is the fault of the entire team and the system that allowed this sort of bug to be released, again and again and again.

You fail to see how. That does not mean I do not test changes.

dwoz avatar Jun 20 '25 21:06 dwoz

I just faced the same issue. As a quick workaround, I commented out the raise SaltValidationError(f"Invalid path: {path!r}") at /opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/verify.py line 567 until the fixes are released.

Alternatively to gitfs, there is an option to clone the repo

mkdir -p /srv/formulas
cd /srv/formulas/
git clone repo

and configure file_roots to point to a directory with the repository, explicitly ensuring all necessary sources are in place and ready to be used by Salt.

file_roots:
  base:
    - /srv/salt
    - /srv/formulas/repo

xeagle2 avatar Jun 22 '25 11:06 xeagle2

@dwoz I have some sympathy for @OrangeDog 's reaction here ... git_pillar was semi-broken for nearly 18 months on branch names that contained a / (#65467), and fixing that involved a cycle of being told the "problem could not be replicated", until eventually an explicit solution was provided and merged into 3006.10.

Now, just two point releases later (3006.12) and there is another break in gitfs, parsing the sort of urls that are in the GitFS Walkthrough, and used in git clone commands all the time (#68069), and then this one then on top of that. Working with gitfs hosted states and pillars sometimes feels like a minefield :-)

donkopotamus avatar Jun 23 '25 02:06 donkopotamus

@donkopotamus don't forget when #66590 was "fixed" in 3006.9 and still did not work because of a second breaking change in the same pygit2 version (#67017), proving that it was never tested.

OrangeDog avatar Jun 23 '25 08:06 OrangeDog

@twangboy it looks like this is fixed in 3006.13. Is there a reason you moved it to .14?

OrangeDog avatar Jun 27 '25 09:06 OrangeDog

Just wanted to verify it was fixed before closing.

twangboy avatar Jun 27 '25 19:06 twangboy

Just wanted to verify it was fixed

The way to do that would be to test it before releasing it...

OrangeDog avatar Jun 28 '25 09:06 OrangeDog