napalm-salt
napalm-salt copied to clipboard
Grains missing/lost when keep_alive is False and connection is closed.
Opening this as I believe it to be a bug/undesirable state.
When refresh grains or any sync activity on the proxy (saltutil.sync_all
) grains populated by Napalm are reset to 'None' when the connection to the device has closed. This occurs when the proxy setting keep_alive:False
is applied.
Instead, what should happen is if a grains refresh is requested, the connection should be restarted and grains re-collected.
proxy.sls:
proxy:
global_delay_factor: 5
proxytype: napalm
driver: ios
always_alive: False
Workflow:
/run # salt -I 'realm:WSC' grains.get model
c40988df-1a86-4ff0-bf8d-e018cc6c55bd:
model:
C891-24X/K9
/run # salt 'c40988df-1a86-4ff0-bf8d-e018cc6c55bd' saltutil.refresh_grains
c40988df-1a86-4ff0-bf8d-e018cc6c55bd:
True
/run # salt 'c40988df-1a86-4ff0-bf8d-e018cc6c55bd' grains.get model
c40988df-1a86-4ff0-bf8d-e018cc6c55bd:
None
Proxy log:
[ERROR ] Cannot execute "cli" on <redacted> as Salt. Reason: Socket is closed!
[ERROR ] Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/napalm/ios/ios.py", line 205, in _send_command
output = self.device.send_command(command)
File "/usr/lib/python3.9/site-packages/netmiko/utilities.py", line 500, in wrapper_decorator
return func(self, *args, **kwargs)
File "/usr/lib/python3.9/site-packages/netmiko/base_connection.py", line 1475, in send_command
prompt = self.find_prompt(delay_factor=delay_factor)
File "/usr/lib/python3.9/site-packages/netmiko/base_connection.py", line 1179, in find_prompt
self.write_channel(self.RETURN)
File "/usr/lib/python3.9/site-packages/netmiko/base_connection.py", line 459, in write_channel
self._write_channel(out_data)
File "/usr/lib/python3.9/site-packages/netmiko/base_connection.py", line 417, in _write_channel
self.remote_conn.sendall(write_bytes(out_data, encoding=self.encoding))
File "/usr/lib/python3.9/site-packages/paramiko/channel.py", line 846, in sendall
sent = self.send(s)
File "/usr/lib/python3.9/site-packages/paramiko/channel.py", line 801, in send
return self._send(s, m)
File "/usr/lib/python3.9/site-packages/paramiko/channel.py", line 1198, in _send
raise socket.error("Socket is closed")
OSError: Socket is closed
Please note I applied the fix for the bug noted here: https://github.com/saltstack/salt/issues/60025
Versions Report:
/run # salt-proxy --versions-report
Salt Version:
Salt: 3003
Dependency Versions:
cffi: 1.14.5
cherrypy: 18.6.1
dateutil: 2.8.1
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 3.0.1
libgit2: 1.1.0
M2Crypto: Not Installed
Mako: Not Installed
msgpack: 1.0.2
msgpack-pure: Not Installed
mysql-python: Not Installed
pycparser: 2.20
pycrypto: 3.10.1
pycryptodome: 3.10.1
pygit2: 1.4.0
Python: 3.9.5 (default, May 12 2021, 20:44:22)
python-gnupg: Not Installed
PyYAML: 5.4.1
PyZMQ: 19.0.2
smmap: Not Installed
timelib: Not Installed
Tornado: 4.5.3
ZMQ: 4.3.4
System Versions:
dist: alpine 3.14.0
locale: utf-8
machine: x86_64
release: 5.4.72-microsoft-standard-WSL2
system: Linux
version: Alpine Linux 3.14.0
Note: This is a lab setup but the same is seen in our production setup with the same version but running on CentOS.
Many thanks!
P.S - the reason for us enabling keep_alive:False
is due to a Cisco CVE where the implementation of SSH Global Requests is causing a FSM exception and crashing devices.
I believe I have found the issue in salt.utilts.napalm package..
I can do a PR to the SaltStack repo if thats the right place for it?
>>> import napalm.base as napalm_base
>>> from napalm_base.exceptions import ConnectionClosedException
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'napalm_base'
>>>
>>>
>>>
>>> from napalm.base.exceptions import ConnectionClosedException
It appears naplam_base is never loaded (correctly?) so the HAS_CONN_CLOSED_EXC_CLASS
which causes this reconnect clause to be ignored in the call()
function:
elif retry and HAS_CONN_CLOSED_EXC_CLASS and isinstance(error, ConnectionClosedException):
Tested my manually importing:
from napalm.base.exceptions import ConnectionClosedException
and its working well now:
/run # salt -I 'realm:WSC' grains.get model
c40988df-1a86-4ff0-bf8d-e018cc6c55bd:
C891-24X/K9
-------------------------------------------
Summary
-------------------------------------------
# of minions targeted: 1
# of minions returned: 1
# of minions that did not return: 0
# of minions with errors: 0
-------------------------------------------
/run # salt -I 'realm:WSC' saltutil.refresh_grains
c40988df-1a86-4ff0-bf8d-e018cc6c55bd:
True
-------------------------------------------
Summary
-------------------------------------------
# of minions targeted: 1
# of minions returned: 1
# of minions that did not return: 0
# of minions with errors: 0
-------------------------------------------
/run # salt -I 'realm:WSC' grains.get model
c40988df-1a86-4ff0-bf8d-e018cc6c55bd:
C891-24X/K9
-------------------------------------------
Summary
-------------------------------------------
# of minions targeted: 1
# of minions returned: 1
# of minions that did not return: 0
# of minions with errors: 0
-------------------------------------------
/run #
I created a related bug in the SaltStack project as I guess that is the right place for it: https://github.com/saltstack/salt/issues/60581 Feel free to close this one if needed otherwise I will on its resolution in the SaltStack repo.
I'm aiming on completing a PR for this.