apt: AttributeError: module '__main__' has no attribute '_module_fqn'

Open anxstj opened this issue 2 years ago • 28 comments

  • Which version of Ansible are you running?
ansible [core 2.11.2] 
  config file = ~/.ansible.cfg
  configured module search path = ['~/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/venv/lib/python3.9/site-packages/ansible
  ansible collection location = ~/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/venv/bin/ansible
  python version = 3.9.6 (default, Jul 16 2021, 00:00:00) [GCC 11.1.1 20210531 (Red Hat 11.1.1-3)]
  jinja version = 3.0.1
  libyaml = True
  • Is your version of Ansible patched in any way? no
  • Are you running with any custom modules, or module_utils loaded? no
  • Have you tried the latest master version from Git? yes
  • Mention your host and target OS and versions
    • host: Fedora 34
    • target: Debian 10 docker image
  • Mention your host and target Python versions
    • host: Python 3.9
    • target: Python 3.7

Example Playbook:

---
- name: Converge
  hosts: all

  tasks:
    - name: install vim
      ansible.builtin.apt:
        name: vim
        state: present

Error:

The full traceback is:
Traceback (most recent call last):
  File "master:/usr/local/share/mitogen/ansible_mitogen/runner.py", line 975, in _run
    self._run_code(code, mod)
  File "master:/usr/local/share/mitogen/ansible_mitogen/runner.py", line 939, in _run_code
    exec(code, vars(mod))
  File "master:/usr/local/venv/lib/python3.9/site-packages/ansible/modules/apt.py", line 1310, in <module>
  File "master:/usr/local/venv/lib/python3.9/site-packages/ansible/modules/apt.py", line 1114, in main
  File "master:/usr/local/venv/lib/python3.9/site-packages/ansible/module_utils/common/respawn.py", line 39, in respawn_module
    payload = _create_payload()
  File "master:/usr/local/venv/lib/python3.9/site-packages/ansible/module_utils/common/respawn.py", line 76, in _create_payload
    module_fqn = sys.modules['__main__']._module_fqn
AttributeError: module '__main__' has no attribute '_module_fqn'
fatal: [linux-debian-10]: FAILED! => {
    "ansible_facts": {},
    "changed": false,
    "module_stderr": "Traceback (most recent call last):\n  File \"master:/usr/local/share/mitogen/ansible_mitogen/runner.py\", line 975, in _run\n    self._run_code(code, mod)\n  File \"master:/usr/local/share/mitogen/ansible_mitogen/runner.py\", line 939, in _run_code\n    exec(code, vars(mod))\n  File \"master:/usr/local/venv/lib/python3.9/site-packages/ansible/modules/apt.py\", line 1310, in <module>\n  File \"master:/usr/local/venv/lib/python3.9/site-packages/ansible/modules/apt.py\", line 1114, in main\n  File \"master:/usr/local/venv/lib/python3.9/site-packages/ansible/module_utils/common/respawn.py\", line 39, in respawn_module\n    payload = _create_payload()\n  File \"master:/usr/local/venv/lib/python3.9/site-packages/ansible/module_utils/common/respawn.py\", line 76, in _create_payload\n    module_fqn = sys.modules['__main__']._module_fqn\nAttributeError: module '__main__' has no attribute '_module_fqn'\n",
    "module_stdout": "",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 1
}

Verbose output: verbose.log.txt

Workarounds like:

  1. Adding vars: { mitogen_task_isolation: fork } to the task
  2. Adding ansible.builtin.apt to ALWAYS_FORK_MODULES in ansible_mitogen/planner.py

don't work.

anxstj avatar Aug 31 '21 08:08 anxstj

I'm trying to look into this. The issue seems to be caused by the introduction of the module_respawn API in ansible in commit 4c5ce5a1a9e79a845aff4978cfeb72a0d4ecf7d6, and in particular by the change on line 197 of module_common.py, which changes the invocation of ansible's run_module() from

runpy.run_module(mod_name='%(module_fqn)s', init_globals=None, run_name='__main__', alter_sys=True)

to

runpy.run_module(mod_name='%(module_fqn)s', init_globals=dict(_module_fqn='%(module_fqn)s', _modlib_path=modlib_path),
                         run_name='__main__', alter_sys=True)

In particular, it seems that ansible_mitogen doesn't insert _module_fqn into the target module's globals.
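To illustrate the mechanism described above, here is a minimal sketch of how init_globals makes _module_fqn visible on sys.modules['__main__'] when runpy is called with run_name='__main__' and alter_sys=True. The module name demo_mod and the helper run_demo are made up for the demonstration; only runpy.run_module and its parameters come from the snippet above.

```python
import os
import runpy
import sys
import tempfile
import textwrap

# Hypothetical stand-in for an Ansible module. It reads the same attribute
# that respawn.py's _create_payload() reads in the traceback.
MODULE_SOURCE = textwrap.dedent("""
    import sys
    found_fqn = getattr(sys.modules['__main__'], '_module_fqn', None)
""")


def run_demo(init_globals):
    """Run the stand-in module through runpy and return what it saw."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "demo_mod.py"), "w") as fh:
            fh.write(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        try:
            # alter_sys=True temporarily installs a fresh '__main__' module
            # whose namespace is pre-populated from init_globals.
            result = runpy.run_module(
                mod_name="demo_mod",
                init_globals=init_globals,
                run_name="__main__",
                alter_sys=True,
            )
        finally:
            sys.path.remove(tmp)
    return result["found_fqn"]


without_globals = run_demo(None)
with_globals = run_demo({"_module_fqn": "ansible.modules.apt"})
print(without_globals, with_globals)
```

Without init_globals the attribute is simply absent, which matches the AttributeError in the report.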

My guess is that that attribute should be inserted into the target module somewhere around here https://github.com/mitogen-hq/mitogen/blob/master/ansible_mitogen/runner.py#L970 but simply adding

mod._module_fqn = self.py_module_name

there has no effect.

I lack the knowledge of Python internals needed to get mitogen to insert this _module_fqn attribute (and _modlib_path, which is used in the next line) into the __main__ module of the executed script.
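One general Python behaviour that may be relevant here (an assumption, not a confirmed diagnosis of mitogen's runner): an attribute set on a freshly created module object is only visible through sys.modules['__main__'] if that object is actually registered under that key. The sketch below uses a hypothetical helper, run_in_module, to show the difference.

```python
import sys
import types

# Code that looks up the attribute the same way respawn.py does.
CODE = compile(
    "import sys\n"
    "seen = getattr(sys.modules['__main__'], '_module_fqn', None)\n",
    "<demo>",
    "exec",
)


def run_in_module(register):
    """Exec CODE in a new module, optionally registered as '__main__'."""
    mod = types.ModuleType("__main__")
    mod._module_fqn = "ansible.modules.apt"
    saved = sys.modules.get("__main__")
    if register:
        sys.modules["__main__"] = mod
    try:
        exec(CODE, vars(mod))
    finally:
        # Always restore the real __main__ so the demo has no side effects.
        if saved is None:
            sys.modules.pop("__main__", None)
        else:
            sys.modules["__main__"] = saved
    return mod.seen


print(run_in_module(register=False), run_in_module(register=True))
```

If the runner executes the module code in an object that is not the one stored in sys.modules['__main__'], setting the attribute on that object alone would not help.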

baszoetekouw avatar Oct 28 '21 19:10 baszoetekouw

Can confirm the latest master still has this problem.

Links2004 avatar Nov 23 '21 09:11 Links2004

Happens on a Fedora 35 controller node w/ OracleLinux 8.5 managed nodes using the dnf module as well. The funny thing is: it happens randomly, not every time.

philfry avatar Nov 29 '21 14:11 philfry

My guess is that that attribute should be inserted into the target module somewhere around here https://github.com/mitogen-hq/mitogen/blob/master/ansible_mitogen/runner.py#L970 but simply adding

mod._module_fqn = self.py_module_name

there has no effect.

I put it here. Seems to have addressed the issue. https://github.com/mitogen-hq/mitogen/blob/a564d8a268afe89ab73423f23fb9191c786f511f/ansible_mitogen/runner.py#L526-L528

phemmer avatar Dec 07 '21 19:12 phemmer

@phemmer What exactly have you inserted there? This code doesn't even seem to be called when running a module such as apt (at least, putting a generic exception there had no effect whatsoever).

baszoetekouw avatar Dec 07 '21 21:12 baszoetekouw

I think this issue is triggered when the ansible_python_interpreter being used on the target is not the system python. The ansible modules that deal with packages require python libraries that are not installable with pip, and can't reasonably be streamed out by mitogen because of that. In the ansible module code, there's a breakout where it will respawn into the system python in order to find these libraries: https://github.com/ansible/ansible/blob/6e57c8c0844c44a102dc081b93d5299cae9619e2/lib/ansible/module_utils/common/respawn.py#L18 And I suspect something about that causes the problem. For anybody else having this issue: if you 1) make sure ansible is using the system python on the far end, and 2) perhaps make sure that you don't have the python module locally, so that mitogen doesn't try to send it, does the issue go away?

mhenniges avatar Jan 06 '22 18:01 mhenniges

@mhenniges I think the situation you describe is correct, but I think it should still be supported by mitogen. I don't think it is hard to solve (you need to insert two globals into the target module's main namespace); I just don't know enough about the mitogen internals to determine how/where to do that.

baszoetekouw avatar Jan 10 '22 10:01 baszoetekouw

@mhenniges Explicitly setting ansible_python_interpreter=/usr/bin/python (or auto_legacy) for the Debian 10 container circumvents the problem.

Setting it to /usr/bin/python3 (or auto) results in the same error.

I'm testing this with molecule and it sets interpreter_python = auto_silent per default, so that's probably the reason why it is breaking in my case.

anxstj avatar Jan 21 '22 20:01 anxstj

I have INTERPRETER_PYTHON set to auto on the OEL 8.x managed node. Even though there is only one python version installed, there are lots of symlinks that need to be followed to reach the final binary:

/usr/bin/python3 → 
/etc/alternatives/python3 →
/usr/bin/python3.6 →
/usr/libexec/platform-python3.6

Ansible uses /usr/libexec/platform-python though, which is a symlink to /usr/libexec/platform-python3.6, because of the distro map:

INTERPRETER_PYTHON_DISTRO_MAP(default) = {
  'centos': {
    '6': '/usr/bin/python',
    '8': '/usr/libexec/platform-python',
    '9': '/usr/bin/python3'
  },
  'debian': {
    '8': '/usr/bin/python',
    '10': '/usr/bin/python3'
  },
  'fedora': {
    '23': '/usr/bin/python3'
  },
  'oracle': {
    '6': '/usr/bin/python',
    '8': '/usr/libexec/platform-python',
    '9': '/usr/bin/python3'
  },
  'redhat': {
    '6': '/usr/bin/python',
    '8': '/usr/libexec/platform-python',
    '9': '/usr/bin/python3'
  },
  'rhel': {
    '6': '/usr/bin/python',
    '8': '/usr/libexec/platform-python',
    '9': '/usr/bin/python3'
  },
  'ubuntu': {
    '14': '/usr/bin/python',
    '16': '/usr/bin/python3'
  }
}
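
As a rough illustration of how a map like this resolves to an interpreter path, here is a minimal sketch. It assumes the lookup falls back to the nearest lower version key when there is no exact match, which is my understanding of Ansible's interpreter discovery but is not shown in this thread. The function name pick_interpreter and the trimmed map are made up for the example.

```python
import bisect

# Subset of the INTERPRETER_PYTHON_DISTRO_MAP shown above.
DISTRO_MAP = {
    "debian": {"8": "/usr/bin/python", "10": "/usr/bin/python3"},
    "oracle": {
        "6": "/usr/bin/python",
        "8": "/usr/libexec/platform-python",
        "9": "/usr/bin/python3",
    },
    "ubuntu": {"14": "/usr/bin/python", "16": "/usr/bin/python3"},
}


def _as_tuple(version):
    """Turn '8.5' into (8, 5) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_interpreter(distro, version, distro_map=DISTRO_MAP):
    """Return the mapped interpreter, or None for an unknown distro."""
    version_map = distro_map.get(distro)
    if not version_map:
        return None
    if version in version_map:
        return version_map[version]
    keys = sorted(version_map, key=_as_tuple)
    # Assumed rule: use the nearest key at or below the requested version,
    # or the oldest key if the version predates every entry.
    pos = bisect.bisect([_as_tuple(k) for k in keys], _as_tuple(version))
    return version_map[keys[max(pos - 1, 0)]]


print(pick_interpreter("oracle", "8.5"))
```

Under that assumption, OracleLinux 8.5 maps to /usr/libexec/platform-python, consistent with what is described above.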

As already mentioned, for me, the error only occurs in the dnf module. And only from time to time.

philfry avatar Jan 27 '22 16:01 philfry

I think you are missing the point of this bug; sure, it'd be nice to have a workaround by setting the python interpreter to a different binary, but that still wouldn't fix this bug.

The problem is that the ansible API has changed, and this breaks specific mitogen workflows. The solution should thus be to fix mitogen, not to avoid the specific circumstances in which this workflow is triggered.

baszoetekouw avatar Jan 31 '22 08:01 baszoetekouw

I'm not sure what has happened, but everything now Just Works with ansible-core 2.12.2 and mitogen 0.3.2.

baszoetekouw avatar Mar 30 '22 09:03 baszoetekouw

I'm not sure what has happened, but everything now Just Works with ansible-core 2.12.2 and mitogen 0.3.2.

Worked for a few runs, and then started to spit out the same error. Also, some other similar modules, like debconf or dnf, fail.

ghost avatar Apr 11 '22 08:04 ghost

I have experienced the same bug when connecting from a Debian unstable system (Debian mitogen 0.3.1-3) to a Debian 11 system. I fixed it by installing on the target these packages (I did not check which one was actually relevant): python3-chardet python3-decorator python3-pkg-resources python3-requests.

rfc1036 avatar May 17 '22 13:05 rfc1036

Same here with ansible 4.x and 5.x, mitogen 3.2. Works with ansible 3.x.

I noticed that it happens only on the first apt module invocation on a fresh, clean destination system, where the python3-apt package gets installed automatically before the apt module actually installs other packages. It always works on the second and later runs because python3-apt is already installed. So I think the task fails because apt-related files changed during the first run (module reuse or something, idk).

Confirmed workaround - install python3-apt package on destination system manually or via shell module before first apt module invocation:

- name: debian.yml | https://github.com/mitogen-hq/mitogen/issues/849 workaround
  shell:
    cmd: apt update && apt install -y python3-apt
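
To check up front whether a given interpreter on the target can already import the bindings (for example apt on Debian), something like the following sketch could be run there. The helper name interpreter_can_import and the probe one-liner are illustrative and not part of Ansible or mitogen. The probe relies on importlib.util, so a Python 2 interpreter will simply report False.

```python
import subprocess
import sys

# One-liner run inside the probed interpreter: exit 0 if the module
# can be located, 1 otherwise.
PROBE = (
    "import importlib.util, sys; "
    "sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)"
)


def interpreter_can_import(interpreter, module_name):
    """Return True if `interpreter` can locate `module_name`."""
    proc = subprocess.run(
        [interpreter, "-c", PROBE, module_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return proc.returncode == 0


if __name__ == "__main__":
    # Example: does the current interpreter see the apt bindings?
    print(interpreter_can_import(sys.executable, "apt"))
```

If this prints False for the interpreter Ansible is using, the apt module would have to install python3-apt or respawn, which is the path that fails in this issue.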

zigmund avatar Jun 08 '22 05:06 zigmund

Hello guys,

any news here? Will that be fixed?

Thanks.

llabusch93 avatar Jun 16 '22 09:06 llabusch93

Having same issue with Ansible + Mitogen :-(

shurrman avatar Jul 14 '22 12:07 shurrman

just one consideration: are you checking that the facts are not cached between one experiment and the next?

dberardo-com avatar Jul 21 '22 15:07 dberardo-com

Notes to self from reproduction attempts.

Attempt 1: macOS -> Debian 11.4, python3-apt installed, rev 8cda5f55375e3c5caa77d900d1bfa90c5fea0094, Ansible 5, negative

Playbook

- name: Module with FQN
  hosts: rpi1
  become: true
  strategy: mitogen_linear
  tasks:
    - name: install vim
      ansible.builtin.apt:
        name: vim
        state: present

- name: Module without FQN
  hosts: rpi1
  become: true
  strategy: mitogen_linear
  tasks:
    - name: install vim
      apt:
        name: vim
        state: present
➜  mitogen git:(master) ✗ ANSIBLE_STRATEGY_PLUGINS=ansible_mitogen/plugins/strategy .tox/py310-mode_ansible-ansible4/bin/ansible-playbook issue849.yml
/Users/alex/src/mitogen/.tox/py310-mode_ansible-ansible4/lib/python3.10/site-packages/paramiko/transport.py:169: CryptographyDeprecationWarning: Blowfish has been deprecated
  'class': algorithms.Blowfish,

PLAY [Module with FQN] *************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************
ok: [rpi1]

TASK [install vim] *****************************************************************************************************
changed: [rpi1]

PLAY [Module without FQN] **********************************************************************************************

TASK [Gathering Facts] *************************************************************************************************
ok: [rpi1]

TASK [install vim] *****************************************************************************************************
ok: [rpi1]

PLAY RECAP *************************************************************************************************************
rpi1                       : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

➜  mitogen git:(master) ✗ ANSIBLE_STRATEGY_PLUGINS=ansible_mitogen/plugins/strategy .tox/py310-mode_ansible-ansible4/bin/ansible-playbook --version
/Users/alex/src/mitogen/.tox/py310-mode_ansible-ansible4/lib/python3.10/site-packages/paramiko/transport.py:169: CryptographyDeprecationWarning: Blowfish has been deprecated
  'class': algorithms.Blowfish,
ansible-playbook [core 2.11.12] 
  config file = /Users/alex/.ansible.cfg
  configured module search path = ['/Users/alex/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /Users/alex/src/mitogen/.tox/py310-mode_ansible-ansible4/lib/python3.10/site-packages/ansible
  ansible collection location = /Users/alex/.ansible/collections:/usr/share/ansible/collections
  executable location = .tox/py310-mode_ansible-ansible4/bin/ansible-playbook
  python version = 3.10.5 (main, Jun 23 2022, 17:14:57) [Clang 13.1.6 (clang-1316.0.21.2.5)]
  jinja version = 3.1.2
  libyaml = False

Attempt 2: macOS -> Debian 11.4, python3-apt removed, rev 8cda5f55375e3c5caa77d900d1bfa90c5fea0094, Ansible 5, reproduced

pi@rpi1:~ $ sudo apt remove python3-apt
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages were automatically installed and are no longer required:
  python-apt-common python3-debconf python3-distro-info
Use 'sudo apt autoremove' to remove them.
The following packages will be REMOVED:
  apt-listchanges python3-apt unattended-upgrades
0 upgraded, 0 newly installed, 3 to remove and 41 not upgraded.
After this operation, 1,488 kB disk space will be freed.
Do you want to continue? [Y/n] y
(Reading database ... 126219 files and directories currently installed.)
Removing apt-listchanges (3.24) ...
Removing unattended-upgrades (2.8) ...
Removing python3-apt (2.2.1) ...
Processing triggers for man-db (2.9.4-2) ...
pi@rpi1:~ $ sudo apt autoremove 
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages will be REMOVED:
  python-apt-common python3-debconf python3-distro-info
0 upgraded, 0 newly installed, 3 to remove and 41 not upgraded.
After this operation, 624 kB disk space will be freed.
Do you want to continue? [Y/n] 
(Reading database ... 126102 files and directories currently installed.)
Removing python-apt-common (2.2.1) ...
Removing python3-debconf (1.5.77) ...
Removing python3-distro-info (1.0) ...
pi@rpi1:~ $ ls /usr/lib/python2.7/dist-packages/
lsb_release.py
pi@rpi1:~ $ apt search python2-apt
Sorting... Done
Full Text Search... Done
➜  mitogen git:(master) ✗ ANSIBLE_STRATEGY_PLUGINS=ansible_mitogen/plugins/strategy .tox/py310-mode_ansible-ansible4/bin/ansible-playbook issue849.yml
/Users/alex/src/mitogen/.tox/py310-mode_ansible-ansible4/lib/python3.10/site-packages/paramiko/transport.py:169: CryptographyDeprecationWarning: Blowfish has been deprecated
  'class': algorithms.Blowfish,

PLAY [Module with FQN] *************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************
ok: [rpi1]

TASK [install vim] *****************************************************************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AttributeError: module '__main__' has no attribute '_module_fqn'
fatal: [rpi1]: FAILED! => changed=false 
  module_stderr: |-
    Traceback (most recent call last):
      File "master:/Users/alex/src/mitogen/ansible_mitogen/runner.py", line 978, in _run
        self._run_code(code, mod)
      File "master:/Users/alex/src/mitogen/ansible_mitogen/runner.py", line 942, in _run_code
        exec(code, vars(mod))
      File "master:/Users/alex/src/mitogen/.tox/py310-mode_ansible-ansible4/lib/python3.10/site-packages/ansible/modules/apt.py", line 1310, in <module>
      File "master:/Users/alex/src/mitogen/.tox/py310-mode_ansible-ansible4/lib/python3.10/site-packages/ansible/modules/apt.py", line 1139, in main
      File "master:/Users/alex/src/mitogen/.tox/py310-mode_ansible-ansible4/lib/python3.10/site-packages/ansible/module_utils/common/respawn.py", line 39, in respawn_module
        payload = _create_payload()
      File "master:/Users/alex/src/mitogen/.tox/py310-mode_ansible-ansible4/lib/python3.10/site-packages/ansible/module_utils/common/respawn.py", line 76, in _create_payload
        module_fqn = sys.modules['__main__']._module_fqn
    AttributeError: module '__main__' has no attribute '_module_fqn'
  module_stdout: ''
  msg: |-
    MODULE FAILURE
    See stdout/stderr for the exact error
  rc: 1

NO MORE HOSTS LEFT *****************************************************************************************************

PLAY RECAP *************************************************************************************************************
rpi1                       : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   

moreati avatar Jul 24 '22 11:07 moreati

In my case, only an upgrade to mitogen 0.3.3 helped when provisioning a Debian 11.4 (5.10.120-1) target system running python 3.9.2. Before that, I had the same problems as described in this thread.

current stack of the performing machine

  • ansible [core 2.12.7]
  • Python 3.9.2

paterik avatar Jul 28 '22 15:07 paterik

Same error when running on Ubuntu 18.04.6 LTS. The system has Ubuntu’s Python 3.8 installed and ansible_python_interpreter set to /usr/bin/python3.8. Even when no other system is involved (Ansible task delegated to localhost), I see the error mentioned above:

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AttributeError: module '__main__' has no attribute '_module_fqn'
fatal: [n0210]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n  File \"master:/usr/local/lib/python3.8/dist-packages/ansible_mitogen/runner.py\", line 978, in _run\n    self._run_code(code, mod)\n  File \"master:/usr/local/lib/python3.8/dist-packages/ansible_mitogen/runner.py\", line 942, in _run_code\n    exec(code, vars(mod))\n  File \"master:/usr/local/lib/python3.8/dist-packages/ansible/modules/apt.py\", line 1387, in <module>\n  File \"master:/usr/local/lib/python3.8/dist-packages/ansible/modules/apt.py\", line 1155, in main\n  File \"master:/usr/local/lib/python3.8/dist-packages/ansible/module_utils/common/respawn.py\", line 39, in respawn_module\n    payload = _create_payload()\n  File \"master:/usr/local/lib/python3.8/dist-packages/ansible/module_utils/common/respawn.py\", line 76, in _create_payload\n    module_fqn = sys.modules['__main__']._module_fqn\nAttributeError: module '__main__' has no attribute '_module_fqn'\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}

Ansible Core is v2.12.9, mitogen is v0.3.3.

stephan2012 avatar Sep 30 '22 14:09 stephan2012

Any update regarding this?

"module_stderr": "Traceback (most recent call last):\n File "master:/runner/project/mitogen-0.3.3/ansible_mitogen/runner.py", line 978, in _run\n self._run_code(code, mod)\n File "master:/runner/project/mitogen-0.3.3/ansible_mitogen/runner.py", line 942, in _run_code\n exec(code, vars(mod))\n File "master:/usr/local/lib/python3.8/site-packages/ansible/modules/dnf.py", line 1427, in <module>\n File "master:/usr/local/lib/python3.8/site-packages/ansible/modules/dnf.py", line 1414, in main\n File "master:/usr/local/lib/python3.8/site-packages/ansible/modules/dnf.py", line 385, in __init__\n File "master:/usr/local/lib/python3.8/site-packages/ansible/modules/dnf.py", line 578, in _ensure_dnf\n File "master:/usr/local/lib/python3.8/site-packages/ansible/module_utils/common/respawn.py", line 39, in respawn_module\n payload = _create_payload()\n File "master:/usr/local/lib/python3.8/site-packages/ansible/module_utils/common/respawn.py", line 76, in _create_payload\n module_fqn = sys.modules['__main__']._module_fqn\nAttributeError: module '__main__' has no attribute '_module_fqn'\n", "exception": "Traceback (most recent call last):\n File "master:/runner/project/mitogen-0.3.3/ansible_mitogen/runner.py", line 978, in _run\n self._run_code(code, mod)\n File "master:/runner/project/mitogen-0.3.3/ansible_mitogen/runner.py", line 942, in _run_code\n exec(code, vars(mod))\n File "master:/usr/local/lib/python3.8/site-packages/ansible/modules/dnf.py", line 1427, in <module>\n File "master:/usr/local/lib/python3.8/site-packages/ansible/modules/dnf.py", line 1414, in main\n File "master:/usr/local/lib/python3.8/site-packages/ansible/modules/dnf.py", line 385, in __init__\n File "master:/usr/local/lib/python3.8/site-packages/ansible/modules/dnf.py", line 578, in _ensure_dnf\n File "master:/usr/local/lib/python3.8/site-packages/ansible/module_utils/common/respawn.py", line 39, in respawn_module\n payload = _create_payload()\n File "master:/usr/local/lib/python3.8/site-packages/ansible/module_utils/common/respawn.py", line 76, in _create_payload\n module_fqn = sys.modules['__main__']._module_fqn\nAttributeError: module '__main__' has no attribute '_module_fqn'\n",

Ansible core 2.12.5.post0 mitogen v0.3.3 running on rhel 8.6

It only happens if dnf or yum are part of the playbook.

bitsky6 avatar Oct 28 '22 17:10 bitsky6

Same here with ansible 4.x and 5.x, mitogen 3.2. Works with ansible 3.x.

I noticed that it happens only on the first apt module invocation on a fresh, clean destination system, where the python3-apt package gets installed automatically before the apt module actually installs other packages. It always works on the second and later runs because python3-apt is already installed. So I think the task fails because apt-related files changed during the first run (module reuse or something, idk).

Confirmed workaround - install python3-apt package on destination system manually or via shell module before first apt module invocation:

- name: debian.yml | https://github.com/mitogen-hq/mitogen/issues/849 workaround
  shell:
    cmd: apt update && apt install -y python3-apt

In the case of Debian, manually installing python3-apt as described above, before any package installation, prevents the error from occurring.

Perhaps there is a similar analog for CentOS distros as well?

autolyticus avatar Jan 13 '23 16:01 autolyticus

Same here on ubuntu:20.04 with ansible=5.10.0-1ppa~focal (ansible-core 2.12.10) from ppa:ansible/ansible, mitogen 0.3.3. We use roles with an alternative python interpreter:

- include: docker.yml
  vars:
    ansible_python_interpreter: "{{ docker_python_interpreter }}"
  tags: [deploy]

output

MODULE_STDERR:

Traceback (most recent call last):
  File "master:/home/gitlab/mitogen-0.3.3/ansible_mitogen/runner.py", line 978, in _run
    self._run_code(code, mod)
  File "master:/home/gitlab/mitogen-0.3.3/ansible_mitogen/runner.py", line 944, in _run_code
    exec('exec code in vars(mod)')
  File "<string>", line 1, in <module>
  File "master:/usr/lib/python3/dist-packages/ansible/modules/apt.py", line 1387, in <module>
  File "master:/usr/lib/python3/dist-packages/ansible/modules/apt.py", line 1155, in main
  File "master:/usr/lib/python3/dist-packages/ansible/module_utils/common/respawn.py", line 39, in respawn_module
    payload = _create_payload()
  File "master:/usr/lib/python3/dist-packages/ansible/module_utils/common/respawn.py", line 76, in _create_payload
    module_fqn = sys.modules['__main__']._module_fqn
AttributeError: 'module' object has no attribute '_module_fqn'

ixvick avatar Feb 22 '23 16:02 ixvick

I played around a bit to dive deeper into why this _module_fqn error happens.

Traceback (most recent call last):
  File "master:.../mitogen/ansible_mitogen/runner.py", line 975, in _run
    self._run_code(code, mod)
  File "master:.../mitogen/ansible_mitogen/runner.py", line 937, in _run_code
    exec(code, vars(mod))
  File "master:/usr/lib/python3.11/site-packages/ansible/modules/dnf.py", line 1468, in <module>
  File "master:/usr/lib/python3.11/site-packages/ansible/modules/dnf.py", line 1455, in main
  File "master:/usr/lib/python3.11/site-packages/ansible/modules/dnf.py", line 401, in __init__
  File "master:/usr/lib/python3.11/site-packages/ansible/modules/dnf.py", line 593, in _ensure_dnf
  File "master:/usr/lib/python3.11/site-packages/ansible/module_utils/common/respawn.py", line 39, in respawn_module
    payload = _create_payload()
  File "master:/usr/lib/python3.11/site-packages/ansible/module_utils/common/respawn.py", line 76, in _create_payload
    module_fqn = sys.modules['__main__']._module_fqn
AttributeError: module '__main__' has no attribute '_module_fqn'

We see that ansible wants to respawn the dnf module using respawn_module(interpreter), which calls _create_payload(), which tries to access sys.modules['__main__']._module_fqn, which is undefined. Right, we already knew that.

So why would ansible want to respawn this module at all? Looking at the code:

def _ensure_dnf(self):
  […]
  global dnf
  try:
    import dnf
    import dnf.cli
    import dnf.const
    import dnf.exceptions
    import dnf.subject
    import dnf.util
    HAS_DNF = True 
  except ImportError:
    HAS_DNF = False

  if HAS_DNF:
    return
  […]
  respawn_module(interpreter)

Obviously, importing dnf at this (delayed) point fails. But why? dnf was available before and worked fine in earlier dnf: calls in that playbook. Since this error, at least for me, only occurred when doing package updates with ansible, some library probably changed while mitogen/ansible was still running. That was indeed the case.

I can reproduce the import error easily:

  • in a shell run python3 to enter interactive mode
  • in another shell upgrade the glibc (dnf -y upgrade glibc)
  • return to the first shell where python is still running and type import dnf
  • see it fail with ImportError: /lib64/librt.so.1: undefined symbol: __pthread_attr_copy, version GLIBC_PRIVATE

Anyway. This helps to understand what triggers this error, but not how to fix it.

Looking at the ansible code, we see that the only place where _module_fqn and _modlib_path are defined is in init_globals, which is passed to runpy by an Ansiballz template. So since mitogen is not using Ansiballz (or runpy), I think it's rather hard to fix this in mitogen. Even if we could inject _module_fqn and _modlib_path in mitogen, the mechanism of respawning is built for Ansiballz.

In mitogen this error could be fixed by catching exceptions related to _module_fqn and restarting the mitogen server process. Sounds spooky.
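
Purely as a sketch of what "catching exceptions related to _module_fqn" could mean on the controller side: the check below inspects a failed task result of the shape shown in the original report. The function is_respawn_failure is hypothetical and not an existing mitogen or Ansible API; it only shows how the failure could be recognised, not how the server process would be restarted.

```python
def is_respawn_failure(result):
    """Return True if a failed result carries the _module_fqn AttributeError."""
    stderr = result.get("module_stderr") or ""
    return "AttributeError" in stderr and "_module_fqn" in stderr


# Trimmed-down result dict modelled on the failure output in this issue.
sample_failure = {
    "changed": False,
    "module_stderr": (
        "Traceback (most recent call last):\n"
        "    module_fqn = sys.modules['__main__']._module_fqn\n"
        "AttributeError: module '__main__' has no attribute '_module_fqn'\n"
    ),
    "module_stdout": "",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 1,
}

print(is_respawn_failure(sample_failure))
```

A string match like this is brittle, which is part of why the approach sounds spooky.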

As meta: reset_connection also terminates the mitogen server on the managed host, this could be used to circumvent this error.

So while this fails at the third dnf call (assuming glibc gets updated):

tasks:
  - dnf: name=foobar state=absent
  - dnf: name=glibc state=latest
  - dnf: name=vim

this works (for me, ymmv):

tasks:
  - dnf: name=foobar state=absent
  - dnf: name=glibc state=latest
  - meta: reset_connection
  - dnf: name=vim

philfry avatar Mar 23 '23 12:03 philfry

I noticed that it happens only on first apt module invocation on fresh clean destination system.

I can confirm this.

azmeuk avatar Apr 04 '23 14:04 azmeuk

I confirm this bug still exists.

cocoonkid avatar Jun 28 '23 18:06 cocoonkid

Same bug with mitogen-0.3.7

adpavlov avatar Apr 09 '24 10:04 adpavlov