community.routeros
Command timeout on RouterOS 7
SUMMARY
Dear Community!
I have a weird issue. I have some MikroTik devices that are backed up once a week by my Ansible backup playbook.
At one site, some of them now fail to do the backup. Some devices there died during a recent storm, while others are the same as before. New, identical devices were installed and the backup was restored onto them.
Since around then, the backup script has not worked on any of the devices at this site.
SSH is working correctly: I can log in from the server to these devices. The API connection is working correctly too, even in Ansible.
I can see that Ansible can log in, the key is accepted, and access is granted. But after that, basically nothing happens.
Login attempt by Ansible:
In Ansible, I can't find anything suspicious. It just times out, as if it were unable to reach the destination:
<10.0.13.1> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-local-2069615kcu5ya9/ansible-tmp-1697089639.2060575-207982-15282461545527/ > /dev/null 2>&1 && sleep 0'
The full traceback is:
File "/tmp/ansible_community.network.routeros_command_payload_yfp95ui9/ansible_community.network.routeros_command_payload.zip/ansible_collections/community/routeros/plugins/module_utils/routeros.py", line 51, in get_capabilities
capabilities = Connection(module._socket_path).get_capabilities()
File "/tmp/ansible_community.network.routeros_command_payload_yfp95ui9/ansible_community.network.routeros_command_payload.zip/ansible/module_utils/connection.py", line 200, in __rpc__
raise ConnectionError(to_text(msg, errors='surrogate_then_replace'), code=code)
fatal: [ayc-sw3]: FAILED! => {
"changed": false,
"invocation": {
"module_args": {
"commands": [
"/system/leds/set 0 type=on"
],
"interval": 1,
"match": "all",
"retries": 10,
"wait_for": null
}
},
"msg": "command timeout triggered, timeout value is 60 secs.\nSee the timeout setting options in the Network Debug and Troubleshooting Guide."
}
The full traceback is:
File "/tmp/ansible_community.network.routeros_command_payload_6mq7m511/ansible_community.network.routeros_command_payload.zip/ansible_collections/community/routeros/plugins/module_utils/routeros.py", line 51, in get_capabilities
capabilities = Connection(module._socket_path).get_capabilities()
File "/tmp/ansible_community.network.routeros_command_payload_6mq7m511/ansible_community.network.routeros_command_payload.zip/ansible/module_utils/connection.py", line 200, in __rpc__
raise ConnectionError(to_text(msg, errors='surrogate_then_replace'), code=code)
fatal: [ayc-gw1]: FAILED! => {
"changed": false,
"invocation": {
"module_args": {
"commands": [
"/system/leds/set 0 type=on"
],
"interval": 1,
"match": "all",
"retries": 10,
"wait_for": null
}
},
"msg": "command timeout triggered, timeout value is 60 secs.\nSee the timeout setting options in the Network Debug and Troubleshooting Guide."
}
ISSUE TYPE
- Bug Report
COMPONENT NAME
community.network.routeros_command
ANSIBLE VERSION
ansible [core 2.14.3]
config file = /etc/ansible/ansible.cfg
configured module search path = ['/home/kristof/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /home/kristof/.local/lib/python3.10/site-packages/ansible
ansible collection location = /home/kristof/.ansible/collections:/usr/share/ansible/collections
executable location = /home/kristof/.local/bin/ansible
python version = 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] (/usr/bin/python3)
jinja version = 3.0.3
libyaml = True
COLLECTION VERSION
community.routeros 2.9.0
I found the issue: I had added parentheses to the identity of every device at this location, so the naming convention was something like: location-sw1-(rack1)
That caused SSH to work improperly. I have not had time to check yet, but I suspect the identity is not escaped, or is parsed incorrectly, when the response is received.
Maybe it's related to prompt detection, or something like that. (For my personal use, I started only using the SSH modules to set up API via HTTPS and then only use that.)
Yes, same for me: 99% of the time I use the API where I can, though there are some cases where something is not implemented in the API yet, or is not possible through the API at all. For the record, this playbook creates an export and a backup of the config and pushes them to my server through FTP.
There are no endpoints for these operations. It might be better to create a script/scheduler on the device for this, but I did not like that when I tried it before; it is easier to manage this centrally (at least for me).
IIRC there is an API endpoint for exporting the config, but you can only write it to a file on the router's filesystem. Then you have to use something like net_get to download the file. (At least that's what I wrote quite a while ago when working on the api_facts module: https://github.com/ansible-collections/community.routeros/pull/88#issuecomment-1121876460 - I don't remember anymore how exactly to use the api module for it.)
I have the same issue. Is there any workaround for this?
Linking this page for reference: How to connect to RouterOS devices with SSH.
It specifies that device names can only use alphanumeric characters, underscores and dashes.
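For anyone who wants to check their identities before running a playbook, here is a minimal Python sketch of that rule. The function names are made up for illustration, and the allowed character set is only my reading of the sentence above (alphanumerics, underscores and dashes), not something taken from the collection's code.

```python
import re

# Assumption: allowed characters as summarized above (letters, digits, "_" and "-").
_SAFE_IDENTITY = re.compile(r"[A-Za-z0-9_-]+")


def is_safe_identity(identity: str) -> bool:
    """Return True if the identity uses only the allowed characters."""
    return _SAFE_IDENTITY.fullmatch(identity) is not None


def unsafe_characters(identity: str) -> list[str]:
    """Return the distinct characters that fall outside the allowed set."""
    return sorted(set(re.findall(r"[^A-Za-z0-9_-]", identity)))


if __name__ == "__main__":
    for name in ("location-sw1-(rack1)", "location-sw1-rack1", "site,a-gw1"):
        print(name, is_safe_identity(name), unsafe_characters(name))
```

The identity from the original report, location-sw1-(rack1), is flagged because of the parentheses, and an identity containing a comma is flagged as well.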
Another big one is the need to add +cet512w to the end of the username (like admin+cet512w). Without this, if your commands are too long, it will produce the same command timeout error.
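If the usernames are generated by a script (for example when building an inventory), the suffix can be added in one place. This is only a sketch: the helper name is hypothetical, and it simply appends the +cet512w suffix mentioned above unless the username already carries a "+" suffix.

```python
def with_console_flags(user: str, flags: str = "cet512w") -> str:
    """Append "+<flags>" to a RouterOS SSH username unless it already has a suffix.

    Hypothetical helper; the default flags are the ones quoted in this thread.
    """
    if "+" in user:
        # The username already has console flags; leave it untouched.
        return user
    return f"{user}+{flags}"


if __name__ == "__main__":
    print(with_console_flags("admin"))
    print(with_console_flags("admin+cet512w"))
```

The resulting value is what would go into the SSH username for the device (for example the ansible_user variable).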
@stasstryukov if you're still having this issue, give this page a glance and see if that resolves it for you.
If there are commas in the identity, the timeout issue occurs as well.