community.routeros icon indicating copy to clipboard operation
community.routeros copied to clipboard

Duplicate data in stdout and facts starting with RouterOS 7.2

Open markwell-ch opened this issue 2 years ago • 26 comments

SUMMARY

Starting with Router OS 7.2 stdout of community.routeros.command contains the output twice. And when running community.routeros.facts against a device with RouterOS 7.2 or newer (tested up to RouterOS 7.6) hostvars[inventory_hostname]['ansible_net_config'] contains the config twice.

ISSUE TYPE
  • Bug Report
COMPONENT NAME

community.routeros.facts community.routeros.command

ANSIBLE VERSION
ansible [core 2.13.6]
  config file = /home/markus/gits/ansible-speedcom-backbone/ansible.cfg
  configured module search path = ['/home/markus/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/markus/gits/ansible-speedcom-backbone/.venv/lib/python3.10/site-packages/ansible
  ansible collection location = /home/markus/gits/ansible-speedcom-backbone/.ansible/collections
  executable location = /home/markus/gits/ansible-speedcom-backbone/.venv/bin/ansible
  python version = 3.10.6 (main, Nov  2 2022, 18:53:38) [GCC 11.3.0]
  jinja version = 3.1.2
  libyaml = True

COLLECTION VERSION
Collection         Version
------------------ -------
community.routeros 2.3.1 
OS / ENVIRONMENT

RouterOS>=7.2

STEPS TO REPRODUCE
    - name: Collect only the config and default facts
      community.routeros.facts:
        gather_subset:
          - "config"

    - name: Save Config Backup
      ansible.builtin.copy:
        content: "{{ hostvars[inventory_hostname]['ansible_net_config'] }}"
        dest: "{{ backup_dir }}/{{ inventory_hostname }}_config.{{ lookup('pipe', 'date +%Y%m%d') }}.txt"
      delegate_to: localhost

    - name: Run command
      community.routeros.command:
        commands: /system/routerboard/print
      register: config

    - name: Save Output to File
      ansible.builtin.copy:
        content: "{{ config.stdout_lines }}"
        dest: "{{ backup_dir }}/{{ inventory_hostname }}_routerboard_print.{{ lookup('pipe', 'date +%Y%m%d') }}.txt"
      delegate_to: localhost
EXPECTED RESULTS

stdout and hostvars[inventory_hostname]['ansible_net_config'] should not contain duplicate data.

ACTUAL RESULTS

stdout and hostvars[inventory_hostname]['ansible_net_config'] contains duplicate data.

markwell-ch avatar Nov 15 '22 22:11 markwell-ch

Reminds me a bit of #62. I unfortunately don't have ROS 7 (I'm still waiting for a ROS 7 device I ordered earlier this year, damn supply chain problems :D ).

You might want to use the API modules instead if possible.

felixfontein avatar Nov 16 '22 05:11 felixfontein

Thanks for your reply. Yes, the supply chain problems suck. In the meantime you could also run a VM with the Cloud Hosted Router image, see: https://help.mikrotik.com/docs/pages/viewpage.action?pageId=18350234 or even some unofficial docker image that runs RouterOS inside a container: https://hub.docker.com/r/evilfreelancer/docker-routeros

markwell-ch avatar Nov 17 '22 13:11 markwell-ch

I noticed the same issue earlier and tried to locate where in the code the problem could be, but I haven't been successful so far.

I also noticed that the behavior isn't persistent. It's radnom - mostly it is duplicated, but occasionally it is not.

What's even worse is that the duplicated output is sometime merged into one line and if a more complex script is passed as the command with new line characters in it, the output of the command is randomly mixed together with the terminal prompt and the command itself and all is duplicated.

The only thing I made sure so far is, that the command is not executed twice. It's duplicate output of the one command execution.

therfert avatar Nov 17 '22 19:11 therfert

I'm not sure anymore where exactly all the parsing magic happens, but I think it was here or in one of the callers: https://github.com/ansible-collections/ansible.netcommon/blob/main/plugins/connection/network_cli.py#L1176

Last time I've tried debugging something like that I used q (https://github.com/zestyping/q) to dump all kind of stuff in the potentially interesting functions to try to figure out what happens.

felixfontein avatar Nov 17 '22 20:11 felixfontein

For me the duplication does not happen when ssh_type = paramiko instead of ssh_type = sshlib is used. Important is adding this options to the username ansible_user: username+cte1000w1000h. Maybe smaller numbers will also work. See: https://help.mikrotik.com/docs/display/ROS/Command+Line+Interface#CommandLineInterface-LoginOptions

markwell-ch avatar Dec 01 '22 20:12 markwell-ch

You might want to use the API modules instead if possible.

Good time of day. Does RouterOS v7 work correctly with the api modules?

izzzhoga avatar Dec 12 '22 05:12 izzzhoga

I have not heard anything to the contrary.

felixfontein avatar Dec 12 '22 05:12 felixfontein

I have exactly the same issue (routeros 7.5 and 7.6). I can also reproduce that the duplicate output just happens with ssh_type = libssh setting, ssh_type = paramiko looks "normal". I hopefully can try the api implementation soon, at the moment my whole setup is based on ssh. Thanks for any help.

dmqgit avatar Dec 20 '22 22:12 dmqgit

Hi there, I think I can reproduce that behavior with community.routeros.command. I have that behavior with one or two command. Target: RouterOS 7.6

    - name: Configure wireguard interface
      community.routeros.command:
        commands:                           
          - /interface/list/print where name=ADM
          - /interface/list/member/print where interface={{ admin_network_config.wireguard_ifName }} and list=ADM
      register: tutu
      tags:                     
        - toto
               
    - debug:                                                                                                                         
        var: tutu                                                                 
      tags: toto

output:

❯ ansible-playbook -i inventory.ini -t toto mikrotik_wg.yml

PLAY [Install certificates] ************************************************************

TASK [Configure wireguard interface] ***************************************************
changed: [toto]

TASK [debug] ***************************************************************************
ok: [pvc.tiersbox.org] => {
    "tutu": {
        "changed": true,
        "failed": false,
        "stdout": [
            "Columns: NAME\n# NAME\n0 ADMColumns: NAME\n# NAME\n0 ADM",
            "/interface/list/member/print where interface=wg-admin and list=AD\n<nt where interface=wg-admin and list=ADM                                      \n<nt where interface=wg-admin and list=ADM\n<nt where interface=wg-admin and list=ADM\n\nColumns: LIST, INTERFACE\n# LIST  INTERFACE\n0 ADM   wg-admin/interface/list/member/print where interface=wg-admin and list=AD\n<nt where interface=wg-admin and list=ADM                                      \n<nt where interface=wg-admin and list=ADM\n<nt where interface=wg-admin and list=ADM\n\nColumns: LIST, INTERFACE\n# LIST  INTERFACE\n0 ADM   wg-admin"
        ],
        "stdout_lines": [
            [
                "Columns: NAME",
                "# NAME",
                "0 ADMColumns: NAME",
                "# NAME",
                "0 ADM"
            ],
            [
                "/interface/list/member/print where interface=wg-admin and list=AD",
                "<nt where interface=wg-admin and list=ADM                                      ",
                "<nt where interface=wg-admin and list=ADM",
                "<nt where interface=wg-admin and list=ADM",
                "",
                "Columns: LIST, INTERFACE",
                "# LIST  INTERFACE",
                "0 ADM   wg-admin/interface/list/member/print where interface=wg-admin and list=AD",
                "<nt where interface=wg-admin and list=ADM                                      ",
                "<nt where interface=wg-admin and list=ADM",
                "<nt where interface=wg-admin and list=ADM",
                "",
                "Columns: LIST, INTERFACE",
                "# LIST  INTERFACE",
                "0 ADM   wg-admin"
            ]
        ]
    }
}

PLAY RECAP *****************************************************************************
toto           : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

AleksNovak avatar Jan 14 '23 22:01 AleksNovak

I unfortunately don't have ROS 7 (I'm still waiting for a ROS 7 device I ordered earlier this year, damn supply chain problems :D ).

You can run ROS 7 for free using Vagrant. The Vagrant box I'm using is cheretbe/routeros7. The doubling issue is occurring there so it should be possible to diagnose it without any additional hardware or cost. Happy to assist if you are unfamiliar with Vagrant.

Vagrantfile:

Vagrant.configure("2") do |config|

  config.vm.define "routeros7" do |routeros|
    routeros.vm.box = "cheretbe/routeros7"
    routeros.vm.hostname = "routeros7.test"
    routeros.vm.network :forwarded_port, guest: 22, host: 2827, id: "ssh"
  end

end
ok: [routeros7.hq.example.com] => {
    "changed": false,
    "invocation": {
        "module_args": {
            "commands": [
                "/put [/system clock get time]"
            ],
            "interval": 1,
            "match": "all",
            "retries": 10,
            "wait_for": null
        }
    },
    "stdout": [
        "00:54:2600:54:26"
    ],
    "stdout_lines": [
        [
            "00:54:2600:54:26"
        ]
    ]
}

o0-o avatar Feb 16 '23 01:02 o0-o

On RouterOS 7.6 I can reproduce this with the community.routeros.command module as well:

---
- name: RouterOS test with network_cli connection
  hosts: routers[0]
  gather_facts: false
  tasks:

  - name: Gather statistics
    community.routeros.command:
      commands:
        - "/interface print stats from=wlan1,wlan2"
    register: print_stats

  - name: Show system resources
    debug:
      var: print_stats

  - name: Create log directory if it does not exist
    file:
      path: ./logs
      state: directory
      mode: '0755'

  - local_action: lineinfile line={{ print_stats | to_nice_yaml}} create=true path="./logs/{{ inventory_hostname }}.log"
TASK [Show system resources] ****************************************************************************
ok: [192.168.179.231] => 
  msg:
    ansible_facts:
      discovered_interpreter_python: /usr/bin/python3
    changed: true
    failed: false
    stdout:
    - |-
      Flags: D - dynamic; X - disabled, R - running; S - slave; P - passthrough
       #      NAME              RX-BYTE           TX-BYTE     RX-PACKET     TX-PACKET
       0  RS  ;;; managed by CAPsMAN
              ;;; channel: 2462/20-eC/gn(14dBm), SSID: X_24, local forwarding
              wlan1          46 654 072    12 532 162 795       198 471    24 568 945
       1  RS  ;;; managed by CAPsMAN
              ;;; channel: 5180/20-Ceee/ac(11dBm), SSID: X, local forwarding
              wlan2       1 093 050 444    16 045 504 342     2 364 491    28 825 963Flags: D - dynamic; X - disabled, R - running; S - slave; P - passthrough
       #      NAME              RX-BYTE           TX-BYTE     RX-PACKET     TX-PACKET
       0  RS  ;;; managed by CAPsMAN
              ;;; channel: 2462/20-eC/gn(14dBm), SSID: X_24, local forwarding
              wlan1          46 654 072    12 532 162 795       198 471    24 568 945
       1  RS  ;;; managed by CAPsMAN
              ;;; channel: 5180/20-Ceee/ac(11dBm), SSID: X, local forwarding
              wlan2       1 093 050 444    16 045 504 342     2 364 491    28 825 963
    stdout_lines:
    - - 'Flags: D - dynamic; X - disabled, R - running; S - slave; P - passthrough '
      - ' #      NAME              RX-BYTE           TX-BYTE     RX-PACKET     TX-PACKET'
      - ' 0  RS  ;;; managed by CAPsMAN'
      - '        ;;; channel: 2462/20-eC/gn(14dBm), SSID: X_24, local forwarding'
      - '        wlan1          46 654 072    12 532 162 795       198 471    24 568 945'
      - ' 1  RS  ;;; managed by CAPsMAN'
      - '        ;;; channel: 5180/20-Ceee/ac(11dBm), SSID: X, local forwarding'
      - '        wlan2       1 093 050 444    16 045 504 342     2 364 491    28 825 963Flags: D - dynamic; X - disabled, R - running; S - slave; P - passthrough '
      - ' #      NAME              RX-BYTE           TX-BYTE     RX-PACKET     TX-PACKET'
      - ' 0  RS  ;;; managed by CAPsMAN'
      - '        ;;; channel: 2462/20-eC/gn(14dBm), SSID: X_24, local forwarding'
      - '        wlan1          46 654 072    12 532 162 795       198 471    24 568 945'
      - ' 1  RS  ;;; managed by CAPsMAN'
      - '        ;;; channel: 5180/20-Ceee/ac(11dBm), SSID: X, local forwarding'
      - '        wlan2       1 093 050 444    16 045 504 342     2 364 491    28 825 963'

kar0v avatar Feb 27 '23 14:02 kar0v

I am pretty confident that it is an upstream issue. It definitely relates to libssh although I suppose the bug could be in network_cli or elsewhere. I don't have a good understanding of how network modules like this work. It will likely be up to one of the maintainers here to investigate and file the appropriate issue upstream.

I have switched to using paramiko and have had no problems. However, my use of Ansible with Mikrotik devices is minimal so your mileage may vary.

o0-o avatar Feb 27 '23 17:02 o0-o

I've finally got my ROS7 device, I'll see whether I can debug this a bit next weekend or so...

felixfontein avatar Feb 27 '23 21:02 felixfontein

I think I managed to figure out what goes wrong; it seems to be a bug in the ansible.netcommon.network_cli connection plugin: https://github.com/ansible-collections/ansible.netcommon/pull/517

When applying that change the duplicate output no longer happens for me.

felixfontein avatar Feb 28 '23 22:02 felixfontein

I took a look through everything, and while the fix does work, I don't think https://github.com/ansible-collections/ansible.netcommon/pull/517 is quite the right answer. Can someone test https://github.com/ansible-collections/ansible.netcommon/pull/521 and let me know if that works as well?

Qalthos avatar Mar 01 '23 21:03 Qalthos

@Qalthos I tested https://github.com/ansible-collections/ansible.netcommon/pull/521, it also fixes the problem for me!

felixfontein avatar Mar 01 '23 21:03 felixfontein

Cool. Thanks for looking into it. I've closed 517 in favor of 521 and will merge 521 when I get a chance.

Qalthos avatar Mar 01 '23 21:03 Qalthos

@Qalthos cool, thanks a lot! Will it be possible to do a new 4.x.y bugfix release once that is merged?

felixfontein avatar Mar 02 '23 05:03 felixfontein

We won't be able to do that, unfortunately.

Qalthos avatar Mar 02 '23 16:03 Qalthos

I have exactly the same problem with duplicating data, only on RouterOS 7.8

Haven't fixed the bug yet?

izzzhoga avatar Mar 09 '23 09:03 izzzhoga

I dare to assume that because of this problem, the collection of facts on RouterOS 7.2-7.8 is broken, the example below.

- name: Collect all facts from the device
  community.routeros.facts:
    gather_subset: all
  register: all_facts

The error I'm getting:

fatal: [mikrotik]: FAILED! => {
    "changed": false,
    "module_stderr": "Traceback (most recent call last):\n  File \"/home/user/.ansible/tmp/ansible-local-3253737c37dt00n/ansible-tmp-1678356214.2582262-3254449-220312769211413/AnsiballZ_facts.py\", line 107, in <module>\n    _ansiballz_main()\n  File \"/home/user/.ansible/tmp/ansible-local-3253737c37dt00n/ansible-tmp-1678356214.2582262-3254449-220312769211413/AnsiballZ_facts.py\", line 99, in _ansiballz_main\n    invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n  File \"/home/user/.ansible/tmp/ansible-local-3253737c37dt00n/ansible-tmp-1678356214.2582262-3254449-220312769211413/AnsiballZ_facts.py\", line 47, in invoke_module\n    runpy.run_module(mod_name='ansible_collections.community.routeros.plugins.modules.facts', init_globals=dict(_module_fqn='ansible_collections.community.routeros.plugins.modules.facts', _modlib_path=modlib_path),\n  File \"/usr/lib/python3.9/runpy.py\", line 210, in run_module\n    return _run_module_code(code, init_globals, run_name, mod_spec)\n  File \"/usr/lib/python3.9/runpy.py\", line 97, in _run_module_code\n    _run_code(code, mod_globals, init_globals,\n  File \"/usr/lib/python3.9/runpy.py\", line 87, in _run_code\n    exec(code, run_globals)\n  File \"/tmp/ansible_community.routeros.facts_payload_n6n95orf/ansible_community.routeros.facts_payload.zip/ansible_collections/community/routeros/plugins/modules/facts.py\", line 663, in <module>\n  File \"/tmp/ansible_community.routeros.facts_payload_n6n95orf/ansible_community.routeros.facts_payload.zip/ansible_collections/community/routeros/plugins/modules/facts.py\", line 651, in main\n  File \"/tmp/ansible_community.routeros.facts_payload_n6n95orf/ansible_community.routeros.facts_payload.zip/ansible_collections/community/routeros/plugins/modules/facts.py\", line 455, in populate\n  File \"/tmp/ansible_community.routeros.facts_payload_n6n95orf/ansible_community.routeros.facts_payload.zip/ansible_collections/community/routeros/plugins/modules/facts.py\", line 532, in parse_route\n  File \"/tmp/ansible_community.routeros.facts_payload_n6n95orf/ansible_community.routeros.facts_payload.zip/ansible_collections/community/routeros/plugins/modules/facts.py\", line 474, in preprocess\nIndexError: list index out of range\n",
    "module_stdout": "",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 1
}

Is it necessary to open a new bug report for this?

izzzhoga avatar Mar 09 '23 10:03 izzzhoga

@izzzhoga while https://github.com/ansible-collections/ansible.netcommon/pull/521 has been merged, no new release of ansible.netcommon has been made. Once a new release is out and you installed that one, the problem should be gone. If you want to have it fixed earlier, you need to patch your local ansible.netcommon install accordingly.

The problem you have with facts gathering could also be caused by this. If the problem still happens with that fix applied, opening a new issue would help (then it's a problem in this collection).

felixfontein avatar Mar 09 '23 11:03 felixfontein

@felixfontein Please tell me when the new version of the module is planned to be released? :)

izzzhoga avatar Mar 09 '23 11:03 izzzhoga

@izzzhoga I do not know when the ansible.netcommon collection will have another release.

felixfontein avatar Mar 09 '23 11:03 felixfontein

ansible.netcommon 5.1.0 has been released (https://github.com/ansible-collections/ansible.netcommon/tree/5.1.0), it should contain that fix. (The changelog does mention it: https://github.com/ansible-collections/ansible.netcommon/blob/main/CHANGELOG.rst#v510)

felixfontein avatar Apr 05 '23 17:04 felixfontein

(If you are using Ansible (and not ansible-core with your own set of collections), you will have to wait for Ansible 8.0.0 though since it is a newer major version than the version included in 7.x.y.)

felixfontein avatar Apr 05 '23 17:04 felixfontein