mitogen
Ansible raw fails randomly
Quack,
I'm using master commit 876a82f and Ansible 2.5.5.
This task:
- name: Install Python
  raw: test -e /usr/bin/python || (apt -qqy update && apt install -qqy python-minimal)
  register: output
  changed_when: output.stdout != ""
sometimes, randomly, fails with:
11280 1529065165.21643: _execute() done
11280 1529065165.21658: dumping result to json
11280 1529065165.21672: done dumping result, returning
11280 1529065165.21751: done running TaskExecutor() for Jinta/TASK: Install Python [5c514f2f-680e-8e01-2ae9-00000000000d]
11280 1529065165.21782: sending task result for task 5c514f2f-680e-8e01-2ae9-00000000000d
11280 1529065165.21838: done sending task result for task 5c514f2f-680e-8e01-2ae9-00000000000d
11280 1529065165.21851: WORKER PROCESS EXITING
fatal: [Jinta]: UNREACHABLE! => {"changed": false, "msg": "Connection timed out.", "unreachable": true}
Regards. _o<
I've had the same issue with this.
Thanks for reporting! It's almost certainly to do with the new async tasks implementation landed last week -- looking at this now
Hi there,
Can you both please provide a little more info:
- The output of -vvv up to the point of a hang (please attach as a separate file :) )
- Number of targets in the run and the target OS
- OS of the host machine
Presumably this raw task appears at the start of the run? Note that presently Mitogen has no 'escape hatch' for the raw: action, it is expected that Python is always available on the target machines. I have a TODO to fix this, it's quite straightforward just not done yet.
Presumably this raw task appears at the start of the run?
Yes, I'm using it to install python in my case.
I'll get back with the -vvv results.
I used Ansible's default 'linear' strategy (it's mentioned in the changelog).
Note the raw module currently requires Python to be installed. Fixing that is basically blocked on #419 to avoid any more spaghetti code.
But that doesn't seem to explain your situation. What kind of machine is it connecting to and how many machines are in a typical run?
We had some super scary bugs fixed over the past 6 months -- including several where FDs could be closed at random, this might explain it. At one point bootstrap could fail if the machine was low on RAM /and/ user was SSHing into an unprivileged container.
Another avenue is a difference in how Mitogen interprets 'connected' compared to Ansible -- it requires everything to happen up to and including the remote interpreter saying hello before it is marked connected. Is there some chance that the 'python' command is failing on that machine, and somehow the SSH connection is otherwise being held open? For example, a crazy bashrc that backgrounds some process might do this, as could an SSH config with certain ControlMaster settings, where exiting cannot complete because other ControlMaster clients are using the connection.
I'm going to have a play with raw specifically, but just in case you can tickle it again, this strace wrapper technique is very effective at revealing startup problems that aren't so easy to capture in a log.
I can confirm this is an issue when Mitogen is installed on both Ubuntu 16.04 and Ubuntu 18.04 and running tasks on both Ubuntu 16.04 and 18.04 where they are fresh deployments (Digital Ocean droplets for example).
- hosts: all
  gather_facts: False
  tasks:
    - name: install python 2
      raw: test -e /usr/bin/python || (apt -y update && apt install -y python-minimal)
This simply does not work with the latest Mitogen release (I'm using Ansible 2.7.2).
Commenting out Mitogen in the ansible.cfg file makes the task run fine.
@dsgnr I think that's a separate issue to ducks'.. Mitogen does not have any raw support right now (it's coming soon)
You can achieve the same effect (if convenient) by switching that one task to the linear strategy within the same file. See the third bullet point on https://mitogen.readthedocs.io/en/latest/changelog.html#known-issues
Thanks for that @dw. I hadn’t spotted that as a known issue!
Raw is a real easy feature to do, but it basically requires cutting a new dev branch, that's why it hasn't been done yet. I'm trying to find a new way to do all this in 0.2 without making a mess, might be better for long term stability than risking a dev branch where all hell breaks loose ;)
Still no news on this?
It is very unlikely to be implemented. It cannot be handled by Mitogen. Install Python by other means (e.g. cloud-init, Cobbler, etc.). On most distros Python is already part of the minimal installation, so you do not even need that. I am pretty sure Python 3 is installed on Ubuntu by default, so you can just skip that.
There is also workaround already mentioned (change strategy to linear for a specific task), so ...
The problem is in air-gapped environments: the Ansible hosts I am controlling have no internet access, so Python can't be easily installed.
Is there any simple workaround to patch a Mitogen module and make that work? I guess not, right?
I had to let go of Mitogen at some point but I'm still interested. I certainly won't bother you if you prefer to close this ticket but I think in this case the documentation should make this limitation clear (with the workaround).
That said I'm still using the snippet above, updated for Python 3. I remember that with CentOS/RHEL having a system-python separate from the usual python package I wanted to make sure I got the latter. Also if you have no access to VM preseeding (to install the package at this stage), that can be useful; you're not always in charge of all the pieces of infra.
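The Python 3 version of the snippet isn't quoted in the thread. A rough sketch of what it would presumably look like on Debian/Ubuntu, assuming the python3-minimal package and the /usr/bin/python3 path (both are assumptions, adjust for your distro):

```yaml
# Sketch only: same task as at the top of the thread, with the path and
# package name swapped for their Python 3 equivalents (assumed, not from the thread).
- name: Install Python 3
  raw: test -e /usr/bin/python3 || (apt -qqy update && apt install -qqy python3-minimal)
  register: output
  # raw prints nothing when /usr/bin/python3 already exists, so empty stdout means "no change"
  changed_when: output.stdout != ""
```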
Also: how can the strategy be switched to linear for just a single task? Does it require a new play?
@dberardo-com Check the second point here https://mitogen.networkgenomics.com/ansible_detailed.html#noteworthy-differences
I never tried it myself tho.
@dberardo-com I see, yeah, I am not sure if this can be applied to an individual task.
Ansible strategy can only be changed per play. Making it per task isn't something Mitogen could do.
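For anyone landing here for the workaround: because strategy is a play-level keyword, the bootstrap task has to go in its own play, which can still live in the same playbook file. A minimal sketch, assuming Mitogen is already set up in ansible.cfg and that mitogen_linear is the strategy name in use; the host pattern and the task in the second play are placeholders:

```yaml
# Play 1: bootstrap Python using the stock strategy, so Mitogen is not
# involved and no Python is needed on the target yet.
- hosts: all
  strategy: linear        # stock Ansible strategy, for this play only
  gather_facts: false     # fact gathering needs Python on the target
  tasks:
    - name: Install Python
      raw: test -e /usr/bin/python || (apt -qqy update && apt install -qqy python-minimal)
      register: output
      changed_when: output.stdout != ""

# Play 2: the rest of the run goes through Mitogen.
- hosts: all
  strategy: mitogen_linear
  tasks:
    - name: Placeholder task (hypothetical)
      ping:
```

This only helps when the raw command itself can succeed. On air-gapped hosts the apt step would still need a reachable package mirror.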