ansible-junos-stdlib
juniper_junos_software: Unable to Validate that NSSU Works
- Bug Report
- Feature Idea
Module Name
juniper_junos_software
Juniper.Junos role and Python libraries version
$ ansible --version
ansible 2.7.10
config file = /opt/ansible/ansible.cfg
configured module search path = [u'/home/luca/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /opt/ansible/ansible-venv/local/lib/python2.7/site-packages/ansible
executable location = /opt/ansible/ansible-venv/bin/ansible
python version = 2.7.12 (default, Nov 12 2018, 14:36:49) [GCC 5.4.0 20160609]
ansible==2.7.10
ansible-netbox-inventory==1.0.9
asn1crypto==0.24.0
bcrypt==3.1.6
certifi==2019.3.9
cffi==1.12.2
chardet==3.0.4
cryptography==2.6.1
enum34==1.1.6
idna==2.8
ipaddress==1.0.22
Jinja2==2.10.1
junos-eznc==2.2.0
jxmlease==1.0.1
lxml==4.3.3
MarkupSafe==1.1.1
ncclient==0.6.4
netaddr==0.7.19
paramiko==2.4.2
pkg-resources==0.0.0
pyasn1==0.4.5
pycparser==2.19
PyNaCl==1.3.0
pyserial==3.4
PyYAML==5.1
requests==2.21.0
scp==0.13.2
selectors2==2.0.1
six==1.12.0
urllib3==1.24.1
OS / Environment
QFX5100-48S Virtual Chassis 17.4R1 (2 members)
Summary
When running NSSU, the backup RE is upgraded first; mastership is then switched to the newly upgraded member so that the other member can be upgraded. When the RE switchover happens, the NETCONF session is broken (the same happens to an SSH session if you are on the command line).
This means Ansible never receives the message that a reboot has been initiated, such as: "Package /opt/ansible/software/jinstall-host-qfx-5-17.4R1.16-signed.tgz successfully installed. Reboot successfully initiated."
Ansible therefore eventually errors out when the RPC timer expires.
Steps to reproduce
Run any upgrade with NSSU enabled using juniper_junos_software, for example:
- name: Install Junos OS package QFX5K
  juniper_junos_software:
    #version: "17.4R2-S2.3"
    cleanfs: no
    local_package: "/opt/ansible/software/jinstall-host-qfx-5-17.4R1.16-signed.tgz"
    remote_package: "/var/tmp/jinstall-host-qfx-5-17.4R1.16-signed.tgz"
    nssu: yes
    checksum:
    reboot: true
    validate: false
    force_host: yes
    logfile: /opt/ansible/logs/{{ inventory_hostname }}-logs.log
    user: "{{ username }}"
    passwd: "{{ password }}"
  register: sw
- name: Check Status
  debug:
    var: sw
Expected results
I'm not sure whether there is a workaround here. It would be great if the module could reconnect after the RE switchover to confirm that a reboot has been initiated.
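For illustration only, here is a rough sketch of the "wait, then reconnect" idea using just the Python standard library. It is not how the module or PyEZ currently behaves. The helper names `port_is_open` and `wait_until`, the timeout values, and the hostname in the usage note are all hypothetical. The only assumption about the device is that NETCONF over SSH listens on its default port, 830.

```python
import socket
import time


def port_is_open(host, port=830, connect_timeout=5.0):
    """Return True if a TCP connection to host:port can be established.

    Port 830 is the default NETCONF-over-SSH port; adjust it if the
    device is configured differently.
    """
    try:
        sock = socket.create_connection((host, port), timeout=connect_timeout)
    except (socket.error, socket.timeout):
        return False
    sock.close()
    return True


def wait_until(probe, timeout=1800.0, interval=15.0,
               sleep=time.sleep, clock=time.time):
    """Call probe() every `interval` seconds until it returns True.

    Returns True as soon as probe() succeeds, or False once `timeout`
    seconds have elapsed without a success. `sleep` and `clock` are
    injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

After the install task fails with the timeout, something like `wait_until(lambda: port_is_open("qfx-vc.example.net"))` (hypothetical hostname) could block until the new master accepts NETCONF connections again. A fresh session could then compare the running Junos version on each member against the target release. In a playbook, the built-in `wait_for` module with `host` and `port` set plays the same role as `port_is_open` plus the polling loop.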
Actual results
The error is:
TimeoutExpiredError('ncclient timed out while waiting for an rpc reply.')\nncclient.operations.errors.TimeoutExpiredError: ncclient timed out while waiting for an rpc reply.
Once the NETCONF session dies, the module never sees the last member reboot, so it never moves on.
@lucasalvatore1 I have not worked on NSSU. Let me take a look at how NSSU is handled on the device and using PyEZ, then we can discuss further what can be added/modified to make the module better.
Thank you very much, @rsmekala.