cisco.nxos

Unable to respond to interactive prompt command

Open colinet opened this issue 1 year ago • 19 comments

SUMMARY

The playbook below fails:

```
- name: switch_fc | helpers | session_reset - perform reset
  cisco.nxos.nxos_command:
    commands:
      - command: clear zone lock vsan 1014
        prompt: 'Do you want to continue'
        answer: 'y'
```

I get the error below:

```
TASK [ds-role-san_CRUD : switch_fc | helpers | session_reset - perform reset] **************************************************************************************************************
Friday 06 October 2023  14:29:08 +0200 (0:00:00.057)       0:00:03.903 ******** 
failed: [localhost] (item={'name': 'fabric_a', 'switch_fabric': 'xxxxxxxxxxxx', 'vsan_id': 1014}) => {"ansible_loop_var": "fab", "changed": false, "fab": {"name": "fabric_a", "switch_fabric": "xxxxxxxxxxxx", "vsan_id": 1014}, "module_stderr": "command timeout triggered, timeout value is 30 secs.\nSee the timeout setting options in the Network Debug and Troubleshooting Guide.", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error"}
```

##### ISSUE TYPE
I used the same example as shown in the documentation:
https://docs.ansible.com/ansible/latest/collections/cisco/nxos/nxos_command_module.html

##### COMPONENT NAME
nxos

##### ANSIBLE VERSION
```paste below
[xxxxxxxxxx@xxxxxxxxxxxxxxx~]$  ansible --version
ansible [core 2.15.0]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/xxxxxxxx/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.9/site-packages/ansible
  ansible collection location = /home/xxxxxxxxxxxx/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible
  python version = 3.9.16 (main, May 29 2023, 00:00:00) [GCC 11.3.1 20221121 (Red Hat 11.3.1-4)] (/usr/bin/python)
  jinja version = 3.1.2
  libyaml = True
[xxxxxxxxxxxx@xxxxxxxxxxxxxxxxx~]$
```

##### COLLECTION VERSION

```paste below
[xxxxxxxx@xxxxxxxxxxxxxxx~]$  ansible-galaxy collection list |grep pure
purestorage.flasharray        1.21.0 
purestorage.flasharray        1.18.0 
```

##### CONFIGURATION
```paste below
[xxxxxx@xxxxxxxxxxxx~]$  ansible-config dump --only-changed
CACHE_PLUGIN(/etc/ansible/ansible.cfg) = memory
CALLBACKS_ENABLED(/etc/ansible/ansible.cfg) = ['ansible.posix.profile_tasks']
CONFIG_FILE() = /etc/ansible/ansible.cfg
DEFAULT_FORKS(/etc/ansible/ansible.cfg) = 5
DEFAULT_GATHERING(/etc/ansible/ansible.cfg) = implicit
DEFAULT_HOST_LIST(/etc/ansible/ansible.cfg) = ['/etc/ansible/hosts']
DEFAULT_MANAGED_STR(/etc/ansible/ansible.cfg) = # WARNING: This script is managed by Ansible with The Linux Framework. Any manual changes will be lost the next time Ansible runs.
DEFAULT_POLL_INTERVAL(/etc/ansible/ansible.cfg) = 15
DEFAULT_ROLES_PATH(/etc/ansible/ansible.cfg) = ['/home/xxxxxxxx/workspace/ansible/ds-roles']
DEFAULT_TRANSPORT(/etc/ansible/ansible.cfg) = smart
DEFAULT_VAULT_PASSWORD_FILE(/etc/ansible/ansible.cfg) = /home/svc_worker/.vps.txt
DISPLAY_SKIPPED_HOSTS(/etc/ansible/ansible.cfg) = True
HOST_KEY_CHECKING(/etc/ansible/ansible.cfg) = False
PERSISTENT_CONNECT_RETRY_TIMEOUT(/etc/ansible/ansible.cfg) = 30
PERSISTENT_CONNECT_TIMEOUT(/etc/ansible/ansible.cfg) = 60
RETRY_FILES_ENABLED(/etc/ansible/ansible.cfg) = False
[xxxxxxx@xxxxxxxxxxxxx~]$ 
```

##### OS / ENVIRONMENT
Redhat 9.0

##### EXPECTED RESULTS
This should clear the zone lock.

colinet avatar Oct 06 '23 12:10 colinet

@colinet Is the target device Cisco MDS?

NilashishC avatar Oct 06 '23 12:10 NilashishC

Yes, it is for an MDS switch.

I've tried different syntaxes, but none of them work. I wonder whether the example shown in the documentation https://docs.ansible.com/ansible/latest/collections/cisco/nxos/nxos_command_module.html is valid.

colinet avatar Oct 06 '23 13:10 colinet

@colinet As mentioned in the Notes section of the docs, this module has only limited support for Cisco MDS switches and so might not work out of the box the way it does for Nexus.

@srbharadwaj Would you be able to look into this?

NilashishC avatar Oct 09 '23 07:10 NilashishC

@NilashishC Is the option 'prompt' a valid one? I don't see it in the documentation, and I also see it commented out in the code: https://github.com/ansible-collections/cisco.nxos/blob/1fd405b383827716ef8f3c8c7eabe9d2e317d61d/plugins/modules/nxos_command.py#L176

srbharadwaj avatar Oct 09 '23 07:10 srbharadwaj

@srbharadwaj The prompt option is valid. Since commands can take at least two forms, (a) a list of strings (commands to send) or (b) a list of dictionaries (command + prompt + answer combination), its element type is set to raw in the argspec. The prompt handling logic is implemented in the cliconf plugin and in the network_cli connection plugin code.

https://github.com/ansible-collections/cisco.nxos/blob/main/plugins/cliconf/nxos.py#L240-L248 https://github.com/ansible-collections/ansible.netcommon/blob/main/plugins/connection/network_cli.py#L1059
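
To illustrate the two forms described above, here is a minimal sketch. The task name and the `show version` entry are placeholders chosen for illustration; the dictionary entry reuses the command from the original report.

```yaml
- name: Illustrate both accepted forms of commands (hypothetical task)
  cisco.nxos.nxos_command:
    commands:
      # Form (a): a plain string, sent to the device as-is.
      - show version
      # Form (b): a dictionary with the command, the prompt text to
      # expect, and the answer to send when that prompt appears.
      - command: clear zone lock vsan 1014
        prompt: 'Do you want to continue'
        answer: 'y'
```

Because the prompt handling sits in the cliconf and network_cli plugins, form (b) presumably only takes effect when the task runs over a CLI-based connection.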

NilashishC avatar Oct 09 '23 07:10 NilashishC

@colinet Could you please share the device interaction logs for this scenario?

Steps: https://docs.ansible.com/ansible/latest/network/user_guide/network_debug_troubleshooting.html#enabling-networking-device-interaction-logging
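
As a rough sketch of what the linked steps amount to, assuming the `ansible_persistent_log_messages` variable and the `ANSIBLE_LOG_PATH` environment variable described in that guide (the log path and playbook name in the comment are placeholders):

```yaml
# Run with a log file configured, for example:
#   ANSIBLE_LOG_PATH=/tmp/ansible-network.log ansible-playbook site.yml
- name: switch_fc | helpers | session_reset - perform reset
  cisco.nxos.nxos_command:
    commands:
      - command: clear zone lock vsan 1014
        prompt: 'Do you want to continue'
        answer: 'y'
  vars:
    # Writes the messages exchanged with the device to the log file.
    # The log may contain sensitive data, so enable this only while debugging.
    ansible_persistent_log_messages: true
```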

NilashishC avatar Oct 09 '23 07:10 NilashishC

OK, can we know where this was tested?

srbharadwaj avatar Oct 09 '23 07:10 srbharadwaj

@srbharadwaj The following task has been tested and works on a Nexus 9300v (NX-OS 9.3.6):

    - name: Switch to maintenance mode
      cisco.nxos.nxos_command:
        commands:
          - configure terminal
          - command: system mode maintenance
            prompt: Do you want to continue
            answer: y

NilashishC avatar Oct 09 '23 08:10 NilashishC

@colinet You can temporarily turn off cli confirmation prompts before you run the clear command as a workaround. Have you tried that?

    - name: switch_fc | helpers | session_reset - perform reset
      cisco.nxos.nxos_command:
        commands:
          - terminal dont-ask
          - clear zone lock vsan 1014

NilashishC avatar Oct 09 '23 08:10 NilashishC

The workaround works on one fabric but, surprisingly, fails on the second fabric with an unexpected result.

The playbook is now:


    - name: switch_fc | helpers | session_reset - perform reset
      cisco.nxos.nxos_command:
        commands:
          - terminal dont-ask
          - clear device-alias session
          - "clear zone lock vsan {{ fab.vsan_id }}"
      vars:
        ansible_connection: "{{ san_CRUD_switch_fabric_api }}"
        ansible_network_os: "{{ san_CRUD_switch_fabric_os }}"
        ansible_user: "{{ san_CRUD_switch_fabric_svc_user }}"
        ansible_password: "{{ san_CRUD_switch_fabric_svc_password }}"
        ansible_host: "{{ fab.switch_fabric }}"
        ansible_httpapi_port: "{{ san_CRUD_switch_fabric_port }}"
        ansible_httpapi_use_ssl: true
        ansible_httpapi_validate_certs: false
      loop: "{{ reset_data }}"
      loop_control:
        loop_var: fab

The outcome is:

TASK [ds-role-san_CRUD : switch_fc | helpers | session_reset - perform reset] *********************************************************************************************************************************************************************************
Monday 09 October 2023  16:58:10 +0200 (0:00:00.055)       0:00:02.221 ******** 
ok: [localhost] => (item={'name': 'fabric_a', 'switch_fabric': 'switch_001', 'vsan_id': 1014})
failed: [localhost] (item={'name': 'fabric_b', 'switch_fabric': 'switch_002', 'vsan_id': 2014}) => {"ansible_loop_var": "fab", "changed": false, "fab": {"name": "fabric_b", "switch_fabric": "switch_002", "vsan_id": 2014}, "module_stderr": "clear zone lock vsan 2014: CLI execution error: Command will clear lock from the entire fabric only if issued on initiating switch.\nElse lock will be cleared only locally.\nVSAN 2014 is not active\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error"}

So on the second fabric it tells me that VSAN 2014 is not active, which is wrong. If I run the command directly on the switch of the second fabric, it succeeds.

colinet avatar Oct 09 '23 15:10 colinet

Hi Remi, can you check the accounting logs from when the failure occurred?

srbharadwaj avatar Oct 09 '23 16:10 srbharadwaj

@colinet Just for my understanding, which connection type are you using to run the original task with the prompt and answer?

NilashishC avatar Oct 10 '23 05:10 NilashishC

> check the accounting logs when the failure occurred

I'm getting the following accounting log right after running the playbook on the first switch of the second fabric:

Tue Oct 10 12:43:10 2023:type=stop:id=10.80.144.120@pts/0:user=dcnmuser:cmd=shell terminated because the ssh session closed
Tue Oct 10 12:43:10 2023:type=start:id=10.80.144.120@pts/7:user=dcnmuser:cmd=
Tue Oct 10 12:43:10 2023:type=update:id=10.80.144.120@pts/7:user=dcnmuser:cmd=terminal session-timeout 60 (SUCCESS)
Tue Oct 10 12:43:10 2023:type=update:id=10.80.144.120@pts/7:user=dcnmuser:cmd=terminal length 0 (SUCCESS)
Tue Oct 10 12:43:11 2023:type=stop:id=10.80.144.120@pts/7:user=dcnmuser:cmd=shell terminated because the ssh session closed
Tue Oct 10 12:43:11 2023:type=start:id=10.80.144.120@pts/0:user=dcnmuser:cmd=
Tue Oct 10 12:43:11 2023:type=update:id=10.80.144.120@pts/0:user=dcnmuser:cmd=terminal session-timeout 60 (SUCCESS)
Tue Oct 10 12:43:12 2023:type=update:id=10.80.144.120@pts/0:user=dcnmuser:cmd=terminal length 0 (SUCCESS)

colinet avatar Oct 10 '23 12:10 colinet

> @colinet Just for my understanding, which connection type are you using to run the original task with the prompt and answer?

I'm using the API connection type. For the above playbook, I have: san_CRUD_switch_fabric_api: ansible.netcommon.httpapi

colinet avatar Oct 10 '23 12:10 colinet

I ran the playbook with -vvv. The outcome is:

TASK [ds-role-san_CRUD : switch_fc | helpers | session_reset - perform reset] *********************************************************************************************************************************************************************************************
task path: /home/xxxxxxx/workspace/ansible/ds-roles/ds-role-san_CRUD/tasks/switch_fc/helpers/session_reset.yml:20
Tuesday 10 October 2023  14:57:05 +0200 (0:00:00.048)       0:00:02.201 ******* 
redirecting (type: action) cisco.nxos.nxos_command to cisco.nxos.nxos
redirecting (type: action) cisco.nxos.nxos_command to cisco.nxos.nxos
ok: [localhost] => (item={'name': 'fabric_a', 'switch_fabric': 'mhxcissan000sas', 'vsan_id': 1014}) => {
    "ansible_loop_var": "fab",
    "changed": false,
    "fab": {
        "name": "fabric_a",
        "switch_fabric": "mhxcissan000sas",
        "vsan_id": 1014
    },
    "invocation": {
        "module_args": {
            "commands": [
                "terminal dont-ask",
                "clear zone lock vsan 1014"
            ],
            "interval": 1,
            "match": "all",
            "retries": 9,
            "wait_for": null
        }
    },
    "stdout": [
        {},
        "Command will clear lock from the entire fabric only if issued on initiating switch.\nElse lock will be cleared only locally.\nNo pending info found"
    ],
    "stdout_lines": [
        {},
        [
            "Command will clear lock from the entire fabric only if issued on initiating switch.",
            "Else lock will be cleared only locally.",
            "No pending info found"
        ]
    ]
}
redirecting (type: action) cisco.nxos.nxos_command to cisco.nxos.nxos
redirecting (type: action) cisco.nxos.nxos_command to cisco.nxos.nxos
failed: [localhost] (item={'name': 'fabric_b', 'switch_fabric': 'mhxcissan001sas', 'vsan_id': 2014}) => {
    "ansible_loop_var": "fab",
    "changed": false,
    "fab": {
        "name": "fabric_b",
        "switch_fabric": "mhxcissan001sas",
        "vsan_id": 2014
    },
    "module_stderr": "clear zone lock vsan 2014: CLI execution error: Command will clear lock from the entire fabric only if issued on initiating switch.\nElse lock will be cleared only locally.\nVSAN 2014 is not active\n",
    "module_stdout": "",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error"
}

PLAY RECAP ****************************************************************************************************************************************************************************************************************************************************************
localhost                  : ok=22   changed=0    unreachable=0    failed=1    skipped=4    rescued=0    ignored=0  

The message 'VSAN 2014 is not active' on the second fabric is unrelated to the current action, and unexpected.

When I run the command "clear zone lock vsan 2014" manually on the switch mhxcissan001sas, I get:

mhxcissan001sas# clear zone lock vsan 2014
Command will clear lock from the entire fabric only if issued on initiating switch.
Else lock will be cleared only locally.
Do you want to continue? (y/n) [n] y
No pending info found
mhxcissan001sas#

'VSAN 2014 is not active' should not show up when executing the command via the API and Ansible.
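
One way to compare what the switch reports through the API with the manual session is to query the VSAN state with the same loop and connection settings as the reset task. This is only a sketch and has not been tested against MDS: `vsan_state` is a hypothetical variable name, and `output: text` assumes the per-command output key of nxos_command.

```yaml
- name: Show VSAN state through the same connection (diagnostic sketch)
  cisco.nxos.nxos_command:
    commands:
      - command: "show vsan {{ fab.vsan_id }}"
        output: text
  # Add the same vars: block (connection settings) as the reset task here.
  loop: "{{ reset_data }}"
  loop_control:
    loop_var: fab
  register: vsan_state

- name: Print what each switch returned
  ansible.builtin.debug:
    var: vsan_state.results
```

If the API reports a different state for VSAN 2014 than the interactive `show vsan 2014`, that would narrow the problem down to the API path rather than the clear command itself.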

colinet avatar Oct 10 '23 13:10 colinet

> > @colinet Just for my understanding, which connection type are you using to run the original task with the prompt and answer?
>
> I'm using the API connection type. For the above playbook, I have: san_CRUD_switch_fabric_api: ansible.netcommon.httpapi

I don't think prompts will ever work with NX-API due to the very nature of HTTP. Have you tried doing the same thing via the NX-API sandbox? Does it work there?
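
If prompts cannot be answered over NX-API, the prompt/answer form would presumably need a CLI (SSH) connection, since the prompt handling lives in the cliconf and network_cli plugins. Below is a sketch of the reset task switched to network_cli. It is untested on MDS and assumes the switches accept SSH logins with the same service account and that `cisco.nxos.nxos` is the right `ansible_network_os` value for them.

```yaml
- name: switch_fc | helpers | session_reset - perform reset (over SSH)
  cisco.nxos.nxos_command:
    commands:
      - command: "clear zone lock vsan {{ fab.vsan_id }}"
        prompt: 'Do you want to continue'
        answer: 'y'
  vars:
    # CLI connection instead of ansible.netcommon.httpapi
    ansible_connection: ansible.netcommon.network_cli
    ansible_network_os: cisco.nxos.nxos
    ansible_user: "{{ san_CRUD_switch_fabric_svc_user }}"
    ansible_password: "{{ san_CRUD_switch_fabric_svc_password }}"
    ansible_host: "{{ fab.switch_fabric }}"
  loop: "{{ reset_data }}"
  loop_control:
    loop_var: fab
```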

NilashishC avatar Oct 10 '23 13:10 NilashishC

> > > @colinet Just for my understanding, which connection type are you using to run the original task with the prompt and answer?
> >
> > I'm using the API connection type. For the above playbook, I have: san_CRUD_switch_fabric_api: ansible.netcommon.httpapi
>
> I don't think prompts will ever work with NX-API due to the very nature of HTTP. Have you tried doing the same thing via the NX-API sandbox? Does it work there?

I'm fine with using 'terminal dont-ask' as the first command (and forgetting about prompts through NX-API). This works on fabric A. But the command fails on fabric B with "VSAN 2014 is not active" even though this VSAN is active.

colinet avatar Oct 10 '23 13:10 colinet

On fabric B, where the error about VSAN 2014 not being active occurs, the state is:

mhxcissan001sas# show vsan 2014
vsan 2014 information
         name:VSAN2014  state:active
         interoperability mode:default
         loadbalancing:src-id/dst-id/oxid
         operational state:up

mhxcissan001sas#

colinet avatar Oct 10 '23 14:10 colinet

@colinet Does the accounting log on mhxcissan001sas show a failure after running the playbook (show accounting log | i clear)? Also, what are the version and model of the mhxcissan001sas switch?

srbharadwaj avatar Oct 10 '23 17:10 srbharadwaj