[BUG] 3002.2 minions only send static grains to master for pillar compilation

Open jgraichen opened this issue 4 years ago • 7 comments

Description

All state.apply commands sent via the master to a newly started minion fail, because the pillars for the minion cannot be compiled if any grains are used. Pillar compilation only starts working after running saltutil.refresh_pillar. This only happens when the minion has static grains configured in the minion configuration, e.g. /etc/salt/minion.

The error is the same as in #59294: a KeyError is raised when grains are accessed. In our case this happens inside a custom ext_pillar:

2021-02-18 23:30:43,655 [salt.pillar      :1153][ERROR   ][26125] Exception caught loading ext_pillar 'tower':
  File "/usr/lib/python3/dist-packages/salt/pillar/__init__.py", line 1145, in ext_pillar
    ext = self._external_pillar_data(pillar, val, key)
  File "/usr/lib/python3/dist-packages/salt/pillar/__init__.py", line 1085, in _external_pillar_data
    ext = self.ext_pillars[key](self.minion_id, pillar, val)
  File "/var/cache/salt/master/extmods/pillar/tower.py", line 56, in ext_pillar
    tower.run(top)
  File "/var/cache/salt/master/extmods/pillar/tower.py", line 119, in run
    self._load_item(base, item)
  File "/var/cache/salt/master/extmods/pillar/tower.py", line 167, in _load_item
    self.load(item, base)
  File "/var/cache/salt/master/extmods/pillar/tower.py", line 195, in load
    self._load_file(file, base)
  File "/var/cache/salt/master/extmods/pillar/tower.py", line 220, in _load_file
    data = self._compile(file, context={"basedir": base})
  File "/var/cache/salt/master/extmods/pillar/tower.py", line 289, in _compile
    raise err
  File "/var/cache/salt/master/extmods/pillar/tower.py", line 284, in _compile
    **kwargs,
  File "/usr/lib/python3/dist-packages/salt/template.py", line 101, in compile_template
    ret = render(input_data, saltenv, sls, **render_kwargs)
  File "/usr/lib/python3/dist-packages/salt/renderers/jinja.py", line 79, in render
    **kws
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 260, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 505, in render_jinja_tmpl
    raise SaltRenderError("Jinja variable {}{}".format(exc, out), buf=tmplstr)

2021-02-18 23:30:43,655 [salt.pillar      :1208][CRITICAL][26125] Pillar render error: Failed to load ext_pillar tower: Jinja variable 'dict object' has no attribute 'fqdn'

Setup

Our minions automatically run state.apply on start, and they started to fail with the above KeyError.

Logging the __grains__ dict that is passed to the pillar showed that only the grains from the minion config are included:

logging.debug(__grains__)
# 2021-02-18 23:34:15,835 [root             :50  ][ERROR   ][23103] {'roles': ['rabbitmq'], 'site_key': 'secret', 'id': 'minion_id'}
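
For anyone debugging the same symptom, a small check along these lines may make the failure easier to spot than a Jinja traceback. It is only a sketch: `REQUIRED_GRAINS`, `missing_grains` and `report_missing_grains` are hypothetical names, not part of Salt, and the list of required grains is an assumption about what the pillar templates need.

```python
import logging

log = logging.getLogger(__name__)

# Assumption: grains the pillar templates rely on; adjust to your setup.
REQUIRED_GRAINS = ("fqdn", "os", "kernel")


def missing_grains(grains, required=REQUIRED_GRAINS):
    """Return the names in `required` that are absent from `grains`."""
    return [name for name in required if name not in grains]


def report_missing_grains(minion_id, grains):
    """Log an error naming the absent grains; return True if all are present."""
    missing = missing_grains(grains)
    if missing:
        log.error(
            "Minion %s sent grains without: %s", minion_id, ", ".join(missing)
        )
    return not missing


# The grains dict logged above only contains the static config values.
static_only = {"roles": ["rabbitmq"], "site_key": "secret", "id": "minion_id"}
print(missing_grains(static_only))
```

Inside an ext_pillar module, the dict to pass in would be the `__grains__` value that Salt injects.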

Our minions are configured with static grains in /etc/salt/minion:

---
grains:
    roles:
    - rabbitmq
    site_key: secret
id: minion_id
master: salt-master
rejected_retry: true

With static grains configured, I was able to reproduce the issue with any master-triggered state.apply run:

# restart salt-minion
$ systemctl restart salt-minion

# run state.apply from salt-master
$ salt minion_id state.apply
minion_id:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Failed to load ext_pillar tower: Jinja variable 'dict object' has no attribute 'fqdn'

After running saltutil.refresh_pillar it started working correctly:

$ salt minion_id saltutil.refresh_pillar
minion_id:
    True
$ salt minion_id state.apply
minion_id:
  [...]
-------------
Succeeded: 44 (changed=4)
Failed:     0
-------------
Total states run:     44
Total run time:    2.338 s

Further tests have shown a very interesting behavior:

A minion without grains defined in the minion config collected all grains and passed them to the master for pillar compilation.

Expected behavior

The minion should collect all grains locally and send them to the master for pillar compilation.

Versions Report

salt --versions-report
Salt Version:
          Salt: 3002.2
 
Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.6.1
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.10
       libgit2: 0.26.0
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 0.5.6
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: 2.6.1
  pycryptodome: 3.4.7
        pygit2: 0.26.2
        Python: 3.6.9 (default, Oct  8 2020, 12:12:24)
  python-gnupg: 0.4.1
        PyYAML: 3.12
         PyZMQ: 17.1.2
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.2.5
 
System Versions:
          dist: ubuntu 18.04 Bionic Beaver
        locale: UTF-8
       machine: x86_64
       release: 4.15.0-1084-kvm
        system: Linux
       version: Ubuntu 18.04 Bionic Beaver

jgraichen avatar Feb 18 '21 22:02 jgraichen

Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey. Please be sure to review our Code of Conduct. Also, check out some of our community resources including:

There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar. If you have additional questions, email us at [email protected]. We’re glad you’ve joined our community and look forward to doing awesome things with you!

welcome[bot] avatar Feb 18 '21 22:02 welcome[bot]

It appears that minions using only /etc/salt/grains for custom grains are not affected either.

We are currently using this as a workaround by adjusting our cloud-config for bootstrapping and by fixing or rebuilding all existing minion machines.

#cloud-config
salt_minion:
    conf:
        master: salt.example.com
    grains: # *not* part of `conf` above
        role:
            - web

jgraichen avatar Feb 19 '21 10:02 jgraichen

I think I am seeing the same issue on 3002.6. Using the Jinja syntax grains['mygrain'] fails until a pillar refresh, but salt['grains.get']('mygrain') finds the grain. In my case, I use salt-cloud to provision to a vCenter and then run state.apply.

vthyng avatar Apr 15 '21 18:04 vthyng

I'm not able to find the bug report now, but I wonder if this is related to another issue we've seen in 3002 where only configured grains are being loaded? If the user runs saltutil.refresh_grains after the minion has been restarted then the rest of the core grains finish loading.

@jgraichen Can you try a command like salt-call saltutil.refresh_grains and see if that helps?

doesitblend avatar May 01 '21 01:05 doesitblend

@doesitblend I do not have any affected minions anymore, but as far as I remember, issuing calls such as saltutil.refresh_grains and saltutil.refresh_pillar did work, at least when done manually via the CLI or with some delay. Some initial tests with saltutil.refresh_pillar invoked directly on minion start (startup commands, reactor) did not work.

jgraichen avatar May 12 '21 12:05 jgraichen

@doesitblend From some plugin and extension development I have done, I would suspect a change that uses __opts__["grains"] as a stateful cache for loaded grain data. That would break when there is a grains key in the config, unless the grains are explicitly refreshed. I haven't investigated anything there, though.
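
To illustrate the kind of failure suspected here, a minimal sketch follows. It is not Salt's actual code: `collect_core_grains`, `load_grains_buggy` and `load_grains_expected` are hypothetical names, and the grain values are made up. It only shows how treating a config-provided `grains` key as an already-filled cache would skip collection of the core grains.

```python
def collect_core_grains():
    """Stand-in for real grain collection (hypothetical values)."""
    return {"fqdn": "minion.example.com", "os": "Ubuntu"}


def load_grains_buggy(opts):
    # Suspected failure mode: a non-empty opts["grains"] is mistaken for an
    # already-populated cache, so core grains are never collected.
    if opts.get("grains"):
        return opts["grains"]
    opts["grains"] = collect_core_grains()
    return opts["grains"]


def load_grains_expected(opts):
    # Expected behavior: always collect core grains, then overlay the
    # static grains from the minion config on top.
    grains = collect_core_grains()
    grains.update(opts.get("grains", {}))
    return grains


static_opts = {"grains": {"roles": ["rabbitmq"], "site_key": "secret"}}
print(sorted(load_grains_buggy(dict(static_opts))))
print(sorted(load_grains_expected(static_opts)))
```

With static grains in the config, the first call returns only `roles` and `site_key`, which matches the symptom reported above. Without a `grains` key, both functions return the core grains.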

jgraichen avatar May 12 '21 12:05 jgraichen

This looks like a possible duplicate of https://github.com/saltstack/salt/issues/60123

Can I close this issue in favor of that one, since it's assigned and scheduled for the 3006 release?

Ch3LL avatar Aug 09 '22 19:08 Ch3LL

Closing this one out in favor of #60123

garethgreenaway avatar Aug 23 '22 17:08 garethgreenaway