[BUG] state.apply fails if pillar uses custom grain from _grains

Open bdrx312 opened this issue 1 year ago • 16 comments

Description

state.apply fails and does not sync custom grain and module scripts from the /srv/salt/_grains and /srv/salt/_modules directories if a pillar uses the custom grain or custom execution module.

Setup

  • [ ] VM (KVM) running on AWS ec2 instance.
  • [ ] onedir packaging
  • [ ] masterless

Steps to Reproduce the behavior

  • Setup directories and files; set minion to masterless; create a custom grain script and a pillar that uses custom grain returned from the script:

    cat > /etc/salt/minion <<'EOF'
    file_client: local
    master_type: disable
    EOF
    
    mkdir -p /srv/salt /srv/pillar /srv/salt/_grains
    cat > /srv/salt/top.sls <<'EOF'
    base:
      '*':
        - test
    EOF
    
    cat > /srv/salt/test.sls <<'EOF'
    "do nothing":
      test.nop: []
    EOF
    
    cat > /srv/salt/_grains/custom_grain.py <<'EOF'
    def main():
        return {'custom_grain': 'test_value'}
    EOF
    
    cat > /srv/pillar/top.sls <<'EOF'
    base:
      '*':
        - defaults
    EOF
    
    cat > /srv/pillar/defaults.sls <<'EOF'
    mypillar: "{{ grains['custom_grain'] }}"
    EOF
    
    
  • Run state.apply, which results in an error:

salt-call --local state.apply
salt.exceptions.SaltRenderError: Jinja variable 'dict object' has no attribute 'custom_grain'; line 1

---
mypillar: "{{ grains['custom_grain'] }}"    <======================

---
[CRITICAL] Pillar render error: Rendering SLS 'defaults' failed. Please see master log for details.
local:
    Data failed to compile:
--------
    Pillar failed to render with the following messages:
--------
    Rendering SLS 'defaults' failed. Please see master log for details.

Expected behavior

The custom _grains and _modules should be synced before the pillars are rendered, and state.apply should run successfully to completion.

Versions Report

salt --versions-report
Salt Version:
    Salt: 3006.2

Python Version:
    Python: 3.10.12 (main, Aug 3 2023, 21:47:10) [GCC 11.2.0]

Dependency Versions:
    cffi: 1.14.6
    cherrypy: 18.6.1
    dateutil: 2.8.2
    Jinja2: 3.1.2
    msgpack: 1.0.2
    packaging: 22.0
    pycparser: 2.21
    pycryptodome: 3.9.8
    python-gnupg: 0.4.8
    PyYAML: 6.0.1
    PyZMQ: 23.2.0
    relenv: 0.13.3
    timelib: 0.2.4
    Tornado: 4.5.3
    ZMQ: 4.3.4

System Versions:
    dist: rhel 8.8 Ootpa
    locale: utf-8
    machine: x86_64
    release: 4.18.0-477.13.1.el8_8.x86_64
    system: Linux
    version: Red Hat Enterprise Linux 8.8 Ootpa

Additional context

Running salt-call --local saltutil.sync_all --pillar-root=/dev/null before running state.apply syncs the _grains and _modules correctly and allows state.apply to run correctly.
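
The workaround can be wrapped in a small script so the sync always happens first. This is a minimal sketch, assuming salt-call is on PATH and the minion is masterless as in the reproduction. The two commands are the ones quoted in this issue; the helper names (workaround_commands, run_workaround) and the injectable runner are made up for illustration.

```python
import subprocess

# Commands as quoted in this issue; the helper names below are hypothetical.
SYNC_CMD = ["salt-call", "--local", "saltutil.sync_all", "--pillar-root=/dev/null"]
APPLY_CMD = ["salt-call", "--local", "state.apply"]


def workaround_commands():
    """Return the commands in the order they must run: sync first, then apply."""
    return [SYNC_CMD, APPLY_CMD]


def run_workaround(runner=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    `runner` is injectable so the ordering can be checked without Salt installed.
    """
    for cmd in workaround_commands():
        result = runner(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling run_workaround() on the minion runs the sync with the pillar root pointed at /dev/null, so the pillar that needs the custom grain is not rendered until the grain exists.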

bdrx312 avatar Aug 22 '23 16:08 bdrx312

@bdrx312 Please retry the issue with the latest release, 3006.2; a number of fixes have been made since 3006.0 was released.

dmurphy18 avatar Aug 22 '23 16:08 dmurphy18

I did not realize that 3006.2 was out already. I updated and tried again but got the same error. I will edit the post to put the update version info.

bdrx312 avatar Aug 22 '23 17:08 bdrx312

@bdrx312 I tried this with Salt 3005.1 classic packaging and it failed, so I am wondering whether something is missing from the instructions:

local:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Specified SLS 'defaults' in environment 'base' is not available on the salt master

I am trying to confirm that this used to work and is now broken.

dmurphy18 avatar Aug 25 '23 22:08 dmurphy18

Also, from a salt-master:

root@Unknown:/srv/pillar# salt td11 pillar.items
td11:
    ----------
    _errors:
        - Specified SLS 'defaults' in environment 'base' is not available on the salt master
root@Unknown:/srv/pillar#

On the salt-master:

root@Unknown:/srv/pillar# l
total 8.0K
drwxr-xr-x. 4 root root 30 Aug 3 2020 ..
-rw-r--r--. 1 root root 28 Aug 25 16:23 top.sls
-rw-r--r--. 1 root root 41 Aug 25 16:23 default.sls
drwxr-xr-x. 2 root root 88 Aug 25 16:25 arch
drwxr-xr-x. 3 root root 49 Aug 25 16:25 .
root@Unknown:/srv/pillar#
root@Unknown:/srv/pillar# cat top.sls
base:
  '*':
    - defaults
root@Unknown:/srv/pillar# cat default.sls
mypillar: "{{ grains['custom_grain'] }}"
root@Unknown:/srv/pillar#

dmurphy18 avatar Aug 25 '23 22:08 dmurphy18

I added the salt minion configuration /etc/salt/minion to set it to masterless mode. I think all that is needed is the master_type: disable, and I went ahead and added file_client: local. I will have to check at work on Monday to see if any other settings are needed, but I believe that should be all to get it working.

bdrx312 avatar Aug 26 '23 17:08 bdrx312

@bdrx312 Even in masterless mode with file_client: local, I am still unable to reproduce this with Salt 3006.1, and I do not see a SaltRenderError in the logs:

[DEBUG   ] Gathering pillar data for state run
[DEBUG   ] Finished gathering pillar data for state run
[INFO    ] Loading fresh modules for state activity
[DEBUG   ] The functions from module 'jinja' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded jinja.render
[DEBUG   ] The functions from module 'yaml' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded yaml.render
[DEBUG   ] The functions from module 'highstate' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded highstate.output
local:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Specified SLS 'defaults' in environment 'base' is not available on the salt master
root@tdeb11:/srv/pillar# 

Can you re-examine the configuration on the minion? So far I have been unable to reproduce the rendering error.

dmurphy18 avatar Sep 05 '23 16:09 dmurphy18

I had a typo/mismatch in the file name of the pillar defaults file. In the pillar top file I specified defaults, but in the file creation it was just /srv/pillar/default.sls (no s). I have corrected the original post to add the s, making it /srv/pillar/defaults.sls.
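
A mismatch like this between the top file and the file name on disk can be caught with a quick check before running state.apply. The sketch below is an illustration only: the function names are hypothetical, and the SLS names are passed in by hand rather than parsed from top.sls, to avoid depending on a YAML library. It relies on the convention that an SLS name such as defaults resolves to either defaults.sls or defaults/init.sls under the pillar root, with dots in the name mapping to subdirectories.

```python
from pathlib import Path


def sls_candidates(root, name):
    """Paths accepted for an SLS name: <name>.sls or <name>/init.sls."""
    # Dots in an SLS name map to subdirectories, e.g. "a.b" -> a/b.
    base = Path(root).joinpath(*name.split("."))
    return [base.with_name(base.name + ".sls"), base / "init.sls"]


def missing_sls(root, names):
    """Return the SLS names that have no matching file under root."""
    return [
        name
        for name in names
        if not any(path.is_file() for path in sls_candidates(root, name))
    ]
```

With the files from the earlier listing, missing_sls("/srv/pillar", ["defaults"]) would report defaults as missing, because only default.sls exists.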

bdrx312 avatar Sep 05 '23 18:09 bdrx312

So with default.sls renamed to defaults.sls and salt-call --local state.apply re-run, it appears to work for me with Salt 3006.1:

[DEBUG   ] File /var/cache/salt/minion/accumulator/139644020863808 does not exist, no need to cleanup
[DEBUG   ] The functions from module 'state' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded state.check_result
[DEBUG   ] The functions from module 'highstate' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded highstate.output
local:
----------
          ID: do nothing
    Function: test.nop
      Result: True
     Comment: Success!
     Started: 12:52:45.047935
    Duration: 1.222 ms
     Changes:   

Summary for local
------------
Succeeded: 1
Failed:    0
------------
Total states run:     1
Total run time:   1.222 ms
root@tdeb11:/srv/pillar# l
total 16K
drwxr-xr-x 2 root root 4.0K Sep  5 12:52 .
drwxr-xr-x 4 root root 4.0K Aug 25 16:14 ..
-rw-r--r-- 1 root root   41 Aug 25 16:14 defaults.sls
-rw-r--r-- 1 root root   28 Aug 25 16:14 top.sls
root@tdeb11:/srv/pillar#

I updated to Salt 3006.2 and got the same result: it works for me with Salt 3006.2 too. From salt-call --local grains.items:

    cpuarch:
        x86_64
    custom_grain:
        test_value

dmurphy18 avatar Sep 05 '23 18:09 dmurphy18

I also just tried on a fresh bento/ubuntu-22.04 Vagrant VM and experienced the same behavior. Try manually clearing your cache with salt-call --local saltutil.clear_cache and then re-running state.apply:

root@salt-test-box:~# salt-call --local state.apply
[ERROR   ] Rendering exception occurred
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 502, in render_jinja_tmpl
    output = template.render(**decoded_context)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1291, in render
    self.environment.handle_exception()
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 925, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 1, in top-level template code
jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'custom_grain'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 261, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 509, in render_jinja_tmpl
    raise SaltRenderError("Jinja variable {}{}".format(exc, out), buf=tmplstr)
salt.exceptions.SaltRenderError: Jinja variable 'dict object' has no attribute 'custom_grain'
[CRITICAL] Rendering SLS 'defaults' failed, render error:
Jinja variable 'dict object' has no attribute 'custom_grain'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 502, in render_jinja_tmpl
    output = template.render(**decoded_context)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1291, in render
    self.environment.handle_exception()
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 925, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 1, in top-level template code
jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'custom_grain'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/pillar/__init__.py", line 887, in render_pstate
    state = compile_template(
  File "/usr/lib/python3/dist-packages/salt/template.py", line 99, in compile_template
    ret = render(input_data, saltenv, sls, **render_kwargs)
  File "/usr/lib/python3/dist-packages/salt/loader/lazy.py", line 149, in __call__
    return self.loader.run(run_func, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/salt/loader/lazy.py", line 1201, in run
    return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/salt/loader/lazy.py", line 1216, in _run_as
    return _func_or_method(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/salt/renderers/jinja.py", line 62, in render
    tmp_data = salt.utils.templates.JINJA(
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 261, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 509, in render_jinja_tmpl
    raise SaltRenderError("Jinja variable {}{}".format(exc, out), buf=tmplstr)
salt.exceptions.SaltRenderError: Jinja variable 'dict object' has no attribute 'custom_grain'
[CRITICAL] Pillar render error: Rendering SLS 'defaults' failed. Please see master log for details.
local:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Rendering SLS 'defaults' failed. Please see master log for details.

bdrx312 avatar Sep 06 '23 02:09 bdrx312

@bdrx312 Will try that, but I was using a VirtualBox Debian 11 amd64, from cold start.

And got

local:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Rendering SLS 'defaults' failed. Please see master log for details.
root@tdeb11:/srv/salt#

And the SaltRenderError is in the logs. Thanks for the clear_cache tip; I will dig in.

dmurphy18 avatar Sep 06 '23 21:09 dmurphy18

Running saltutil.clear_cache removes the custom grain; however, saltutil.sync_grains does bring it back, and all works as it should.

However, the documentation at https://docs.saltproject.io/en/latest/topics/grains/index.html#syncing-grains states that state.highstate should automatically sync the grains. That is not happening, and I get the same failure as with state.apply. Also, the custom grain is not rebuilt when the minion is restarted (e.g. systemctl restart salt-minion): it is missing when running grains.ls.

This problem is only related to _grains; custom grains in /etc/salt/grains appear fine.

dmurphy18 avatar Sep 07 '23 00:09 dmurphy18

This appears to be a corner case with a masterless minion, since with salt://_grains on a master the problem does not happen, except after the following commands:

  • salt tc7 saltutil.clear_cache
  • systemctl restart salt-minion
  • salt tc7 grains.ls (custom_grains is not listed)
  • salt tc7 saltutil.refresh_grains (doesn't sync _grains as per doc)
  • salt tc7 grains.ls (custom_grains is not listed)
  • salt tc7 saltutil.sync_grains
  • salt tc7 grains.ls (custom_grains is not listed)
  • custom_grains still not listed after this
  • salt tc7 saltutil.refresh_grains
  • salt tc7 grains.ls
  • salt tc7 saltutil.sync_all
  • salt tc7 grains.ls

Even restarting the salt-master does not bring custom_grains back, so there is a hole with a salt-master too.

It appears I misread the doc: a minion restart doesn't sync _grains.
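
Checking by eye whether the grain shows up after each step gets tedious, so the check can be scripted. This is a rough sketch, assuming the grains.ls command is run with --out=json and that the output is shaped like {"<minion id or local>": ["grain", ...]}. The function name grain_listed is made up for illustration.

```python
import json


def grain_listed(ls_json, grain):
    """Return True if `grain` appears in the JSON output of grains.ls.

    Assumes output shaped like {"tc7": ["cpuarch", ...]}; any other
    shape (for example an error string) counts as "not listed".
    """
    data = json.loads(ls_json)
    return any(
        grain in names for names in data.values() if isinstance(names, list)
    )
```

The JSON text would come from something like salt tc7 grains.ls --out=json, captured after each of the steps above.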

dmurphy18 avatar Sep 07 '23 16:09 dmurphy18

Well, the issue appears to be that the rendering error is encountered while loading the grains and pillar, before we get to execute call_highstate, which would do the sync_all. We error out due to pillar errors just before we call highstate; see lines https://github.com/saltstack/salt/blob/master/salt/modules/state.py#L1173-L1192

The problem is with class Minion and SProxyMinion too: after the classes have loaded the grains (which doesn't load the custom grains from _grains), they immediately compile the pillar, which has a file making use of the custom grain from _grains, and then hit the render error. The SProxyMinion method gen_modules even does a sync_all, but that is too late, since it comes after the call to compile_pillar that produces the render error.

Got a chicken and egg issue
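
The ordering problem can be shown with a toy model. This is not Salt's code: load_grains, compile_pillar and the two apply_* functions are stand-ins invented for illustration, and a plain KeyError stands in for the SaltRenderError.

```python
def load_grains(synced):
    """Core grains always load; the custom grain only exists after a sync."""
    grains = {"cpuarch": "x86_64"}
    if synced:
        grains["custom_grain"] = "test_value"
    return grains


def compile_pillar(grains):
    """Stand-in for rendering defaults.sls; needs the custom grain."""
    # A missing key raises KeyError, mirroring the Jinja render error.
    return {"mypillar": grains["custom_grain"]}


def apply_current_order():
    """Grains, then pillar, then sync: the sync is never reached."""
    grains = load_grains(synced=False)  # custom grains not loaded yet
    pillar = compile_pillar(grains)     # raises KeyError here
    load_grains(synced=True)            # the sync would happen here, too late
    return pillar


def apply_sync_first():
    """Sync before loading grains, so the pillar can render."""
    return compile_pillar(load_grains(synced=True))
```

In this model apply_current_order() always fails on the pillar step, while apply_sync_first() returns the rendered pillar, which is the behavior the workaround with saltutil.sync_all gives.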

dmurphy18 avatar Sep 12 '23 22:09 dmurphy18

I have found a chicken-and-egg problem here: https://github.com/saltstack/salt/pull/65186/files/f1de5431aac91d64d321c3ef31a5e970c9fd3ffe by @Ch3LL

In the dunder init for class SMinion, opts["master_uri"] is not filled in until after the call to ioloop.run_sync; see https://github.com/saltstack/salt/blob/master/salt/minion.py#L928-L932

But with salt.utils.extmods.sync(opts, "grains"), there will be an attempt in AsyncReqChannel to use opts["master_uri"] for the remote client; it is called from SyncWrapper.

dmurphy18 avatar Sep 26 '23 22:09 dmurphy18

Closing since associated PR is merged

dmurphy18 avatar Oct 23 '23 15:10 dmurphy18

Re-opening this and reverting the changes in https://github.com/saltstack/salt/pull/65186, since it is an incomplete fix and, while merged and released in Salt 3006.5, it is causing problems. See issue https://github.com/saltstack/salt/issues/65692 and PR https://github.com/saltstack/salt/pull/65738

dmurphy18 avatar Dec 20 '23 21:12 dmurphy18

Is it possible this is still around in the current 3007.1 as well? I just upgraded to 3007.1 and can reproduce this, while downgrading to e.g. 3006.1 works fine.

ixs avatar Jul 01 '24 18:07 ixs

The problem will exist on 3007.1 and 3006.8. I am working on it, but higher-priority issues are taking precedence. I have the changes done but need to write a lot more tests, so that I don't run into the problems I had with the first attempt at fixing this, which resulted in having to revert the change.

dmurphy18 avatar Jul 01 '24 18:07 dmurphy18

PR https://github.com/saltstack/salt/pull/66737 is a rework of the work done in https://github.com/saltstack/salt/pull/65792, which was closed accidentally when I closed the branch the work was being done on.

dmurphy18 avatar Jul 23 '24 15:07 dmurphy18