salt icon indicating copy to clipboard operation
salt copied to clipboard

[BUG] refreshing beacons leaks inotify handles

Open golmaal opened this issue 5 years ago • 5 comments

Description Setup an inotify beacon and refresh beacons. Every time beacons are refreshed inotify file handles are leaked.

Setup Start a CentOS:8.1.1911 container curl -o bootstrap-salt.sh -L https://bootstrap.saltstack.com ./bootstrap-salt.sh -P -M git v3002.1 pip3 install pyinotify pytz salt-call beacons.add inotify '{"files":["/tmp"],"mask":"create"}'

Steps to Reproduce the behavior salt-call state.highstate OR salt-call saltutil.refresh_beacons

Check open file handles using lsof - lsof | grep -E 'salt-mi.*inotify'

Expected behavior A constant number of inotify handles should remain open.

[root@test2 salt]# lsof | grep -E 'salt-mi.*inotify' salt-mini 2172 root 27r a_inode 0,14 0 13151 inotify salt-mini 2172 2173 salt-mini root 27r a_inode 0,14 0 13151 inotify salt-mini 2172 2206 salt-mini root 27r a_inode 0,14 0 13151 inotify salt-mini 2172 2207 salt-mini root 27r a_inode 0,14 0 13151 inotify

However, after refreshing beacons, it keeps on increasing -

[root@test2 salt]# salt-call saltutil.refresh_beacons local: True [root@test2 salt]# lsof | grep -E 'salt-mi.*inotify' salt-mini 2172 root 27r a_inode 0,14 0 13151 inotify salt-mini 2172 root 28r a_inode 0,14 0 13151 inotify salt-mini 2172 root 29r a_inode 0,14 0 13151 inotify salt-mini 2172 2173 salt-mini root 27r a_inode 0,14 0 13151 inotify salt-mini 2172 2173 salt-mini root 28r a_inode 0,14 0 13151 inotify salt-mini 2172 2173 salt-mini root 29r a_inode 0,14 0 13151 inotify salt-mini 2172 2206 salt-mini root 27r a_inode 0,14 0 13151 inotify salt-mini 2172 2206 salt-mini root 28r a_inode 0,14 0 13151 inotify salt-mini 2172 2206 salt-mini root 29r a_inode 0,14 0 13151 inotify salt-mini 2172 2207 salt-mini root 27r a_inode 0,14 0 13151 inotify salt-mini 2172 2207 salt-mini root 28r a_inode 0,14 0 13151 inotify salt-mini 2172 2207 salt-mini root 29r a_inode 0,14 0 13151 inotify

[root@test2 salt]# salt --versions-report Salt Version: Salt: 3002.1

Dependency Versions: cffi: Not Installed cherrypy: Not Installed dateutil: 2.6.1 docker-py: Not Installed gitdb: Not Installed gitpython: Not Installed Jinja2: 2.11.2 libgit2: Not Installed M2Crypto: Not Installed Mako: Not Installed msgpack-pure: Not Installed msgpack-python: 1.0.0 mysql-python: Not Installed pycparser: Not Installed pycrypto: Not Installed pycryptodome: 3.9.9 pygit2: Not Installed Python: 3.6.8 (default, Apr 16 2020, 01:36:27) python-gnupg: Not Installed PyYAML: 5.3.1 PyZMQ: 19.0.2 smmap: Not Installed timelib: Not Installed Tornado: 4.5.3 ZMQ: 4.3.2

System Versions: dist: centos 8 Core locale: UTF-8 machine: x86_64 release: 4.18.0-193.19.1.el8_2.x86_64 system: Linux version: CentOS Linux 8 Core

golmaal avatar Nov 09 '20 06:11 golmaal

@golmaal Thanks for the report.

garethgreenaway avatar Nov 19 '20 19:11 garethgreenaway

Hello, I got this after upgrading from 3001.8 to 3004. The problem happens when I have configured beacon and run the highstate.

salt-call -V
Salt Version:
          Salt: 3004

Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.5.3
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.9.4
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 0.6.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: Not Installed
  pycryptodome: 3.6.1
        pygit2: Not Installed
        Python: 3.5.3 (default, Sep 27 2018, 17:25:39)
  python-gnupg: Not Installed
        PyYAML: 3.12
         PyZMQ: 17.1.2
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.2.1

System Versions:
          dist: debian 9 stretch
        locale: UTF-8
       machine: x86_64
       release: 4.9.0-12-amd64
        system: Linux
       version: Debian GNU/Linux 9 stretch

My beacon configuration:

beacons:
    inotify:
        - files:
             /etc/shadow:
                 mask:
                   - modify
        - interval: 300
        - coalesce: True 

cheburazavrik avatar Feb 10 '22 17:02 cheburazavrik

This seems to be a bug of pyinotify: https://github.com/seb-m/pyinotify/issues/192

The original project seems to be unmaintained since ~2015 and while there's a little better maintained fork of PyInotify available, pypi distributes the unmaintained original and at least Ubuntu's package is also based on this one.

I don't really know, whether the fork dsoprea/PyInotify fixes the mentioned issue, so this would have to be investigated/tested.

eliasp avatar May 18 '22 13:05 eliasp

Having the same problem with Salt 3002 on Ubuntu:

salt-call -V
Salt Version:
          Salt: 3002.8
 
Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.7.3
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.10.1
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 0.6.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: 2.6.1
  pycryptodome: 3.6.1
        pygit2: Not Installed
        Python: 3.8.10 (default, Mar 15 2022, 12:22:08)
  python-gnupg: 0.4.5
        PyYAML: 5.3.1
         PyZMQ: 18.1.1
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.2
 
System Versions:
          dist: ubuntu 20.04 focal
        locale: utf-8
       machine: x86_64
       release: 5.4.0-104-generic
        system: Linux
       version: Ubuntu 20.04 focal

Configured a beacon like this:

beacons:
  file_renew:
    - beacon_module: inotify
    - disable_during_state_run: True
    - interval: 5
    - files:
        /apps/test/file1:
          mask:
            - close_write
          recurse: False
          auto_add: False
          coalesce: True
        /apps/test/file2:
          mask:
            - close_write
          recurse: False
          auto_add: False
          coalesce: True

After restart of minion i have 1 inotify-handle. Then, over time, this inotify-handles increase.

After 1 day the minion has about 128 inotify-handles - which is the default-maxium of max_user_instances:

cat /proc/sys/fs/inotify/max_user_instances
128

for i in /proc/*/fd/*; do readlink -f $i; done | grep inotify | sort | uniq -c | sort -nr
    128 /proc/3521660/fd/anon_inode:inotify
      3 /proc/3612872/fd/anon_inode:inotify
      3 /proc/1/fd/anon_inode:inotify
      2 /proc/884/fd/anon_inode:inotify
      2 /proc/866/fd/anon_inode:inotify
      1 /proc/916/fd/anon_inode:inotify
      1 /proc/880/fd/anon_inode:inotify
      1 /proc/853/fd/anon_inode:inotify
      1 /proc/848/fd/anon_inode:inotify
      1 /proc/2824542/fd/anon_inode:inotify
      1 /proc/2824437/fd/anon_inode:inotify
      1 /proc/2824294/fd/anon_inode:inotify

Then this messages appear in the logs until i restart the salt-minion:

May 16 03:45:58 minionid salt-minion[2069191]: [CRITICAL] The beacon errored:
May 16 03:45:58 minionid salt-minion[2069191]: Traceback (most recent call last):
May 16 03:45:58 minionid salt-minion[2069191]:   File "/usr/lib/python3/dist-packages/salt/minion.py", line 2878, in handle_beacons
May 16 03:45:58 minionid salt-minion[2069191]:     beacons = self.process_beacons(self.functions)
May 16 03:45:58 minionid salt-minion[2069191]:   File "/usr/lib/python3/dist-packages/salt/minion.py", line 515, in process_beacons
May 16 03:45:58 minionid salt-minion[2069191]:     return self.beacons.process(
May 16 03:45:58 minionid salt-minion[2069191]:   File "/usr/lib/python3/dist-packages/salt/beacons/__init__.py", line 130, in process
May 16 03:45:58 minionid salt-minion[2069191]:     raw = self.beacons[fun_str](b_config[mod])
May 16 03:45:58 minionid salt-minion[2069191]:   File "/usr/lib/python3/dist-packages/salt/beacons/inotify.py", line 270, in beacon
May 16 03:45:58 minionid salt-minion[2069191]:     notifier = _get_notifier(_config)
May 16 03:45:58 minionid salt-minion[2069191]:   File "/usr/lib/python3/dist-packages/salt/beacons/inotify.py", line 78, in _get_notifier
May 16 03:45:58 minionid salt-minion[2069191]:     wm = pyinotify.WatchManager()
May 16 03:45:58 minionid salt-minion[2069191]:   File "/usr/lib/python3/dist-packages/pyinotify.py", line 1764, in __init__
May 16 03:45:58 minionid salt-minion[2069191]:     raise OSError(err % self._inotify_wrapper.str_errno())
May 16 03:45:58 minionid salt-minion[2069191]: OSError: Cannot initialize new instance of inotify, Errno=Too many open files (EMFILE)

installed pyinotify is:

dpkg -l|grep pyinotify
ii  python-pyinotify                      0.9.6-1.2ubuntu1                   all          simple Linux inotify Python bindings
ii  python3-pyinotify                     0.9.6-1.2ubuntu1                   all          simple Linux inotify Python bindings

rolszewo avatar May 20 '22 05:05 rolszewo

Any updates on this issue? This issue makes beacons effectively unusable. Once the open files is maxed out the server will constantly throw errors and services start to fail. I've already had multiple salt-minion services crash because of this issue causing our operations team some sleepless nights.

mrj1m0thy avatar Jul 28 '22 21:07 mrj1m0thy

Any updates on this issue? At this point, I had to find ugly and quick workarounds to replace beacons. Having this fixed, or at least getting a confirmation that this will not get fixed, I can invest resources to properly replace beacons.

cheburazavrik avatar Nov 15 '22 21:11 cheburazavrik

Any updates yet on this issue? This is still hitting us and we need to workaround the ever growing inotify handle needs of the salt minion

BT-dschleich avatar Apr 17 '23 14:04 BT-dschleich