Templating broken when constructing value for `ansible_ssh_common_args`
Hi, I'm on ansible-core 2.12.2 (thanks for all the work in getting that done) and mitogen v0.3.2.
We have some basic jinja inside one of our vars files:
---
# Use the correct jump host
ansible_ssh_common_args: >-
-o ProxyCommand='ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -W %h:%p admin@{{ hostvars.jumphost.public_ip_address }}'
This causes errors:
TASK [Waiting for connection] *********************************************************************************************************************************************************
task path: /Users/dick.visser/git/deploy_dick/data/acc/site.yml:417
[WARNING]: Unhandled error in Python interpreter discovery for host acc_proxy1: EOF on stream; last 100 lines received: ssh: Could not resolve hostname {{: nodename nor servname
provided, or not known kex_exchange_identification: Connection closed by remote host
If I hardcode it like this:
---
# Use the correct jump host
ansible_ssh_common_args: >-
-o ProxyCommand='ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -W %h:%p [email protected]'
then things work....
The jinja inside the inventory works fine with ansible v3.4.0 (ansible-base 2.10.x)
Any thoughts?
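The error message itself hints at the mechanism: the variable is being shlex-split and handed to ssh without being rendered by Jinja2 first, so the literal `{{` ends up parsed as a hostname. A minimal plain-Python sketch of that failure mode (hypothetical values, not Mitogen code):

```python
import shlex

# Hypothetical: the value of ansible_ssh_common_args *before* Jinja2 rendering.
raw = ("-o ProxyCommand='ssh -W %h:%p "
       "admin@{{ hostvars.jumphost.public_ip_address }}'")

# First split: the connection plugin turns the option string into ssh arguments.
args = shlex.split(raw)
proxy_cmd = args[1].split('=', 1)[1]

# Second split: the shell later word-splits the ProxyCommand itself, so the
# unrendered template breaks apart at its internal spaces.
inner = shlex.split(proxy_cmd)
dest = next(a for a in inner if '@' in a)

# The "hostname" the inner ssh tries to resolve is the literal braces.
print(dest.split('@', 1)[1])  # → {{
```

This matches the observed `ssh: Could not resolve hostname {{` error exactly.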
I have the same issue with ansible core 2.11.8, mitogen 0.3.2
same issue here, Ansible Core 2.10.17, mitogen 0.3.2 & 0.3.1
Occurred here too with Ansible 2.11.9 and mitogen 0.3.2 (well, master, more specifically) due to this fella https://github.com/kubernetes-sigs/kubespray/blob/master/roles/kubespray-defaults/defaults/main.yaml#L4
With a small test inventory, Ansible 2.10, and git bisect, I found commit c61c063b4f9b2b63dcaa86443631a268c9f72870 to be the cause of this bug. Unfortunately this commit is quite big, so I'll try to manually find the exact cause and maybe a bugfix/workaround …
It may be ansible_mitogen/transport_config.py
You're right; no other change in this commit affects the outcome of my tests. But it may also be an internal change in Ansible that (also) causes this bug; I'm not quite sure:
While trying to find the smallest partial revert of commit c61c063b4f9b2b63dcaa86443631a268c9f72870, I detected a difference in the result of my small ping test depending on the version of Ansible used.
Beginning from tag v0.3.2, after applying the diff at the end (which reverts the commit partially), running ansible -m ping host
with a small test inventory works for Ansible 2.10 as expected but stops working for Ansible 5.4.0 (core 2.12.3) with the same error message:
host | UNREACHABLE! => {
"changed": false,
"msg": "EOF on stream; last 100 lines received:\nssh: Could not resolve hostname {%: Name or service not known\r",
"unreachable": true
}
So partially reverting this change does work for older Ansible versions (~ 2.10) but not for newer ones (~ 5.4.0 / 2.12.3).
This is the diff mentioned above:
diff --git a/ansible_mitogen/transport_config.py b/ansible_mitogen/transport_config.py
index 4babbde3..344c3d84 100644
--- a/ansible_mitogen/transport_config.py
+++ b/ansible_mitogen/transport_config.py
@@ -467,9 +467,9 @@ class PlayContextSpec(Spec):
return [
mitogen.core.to_text(term)
for s in (
- C.config.get_config_value("ssh_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})),
- C.config.get_config_value("ssh_common_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})),
- C.config.get_config_value("ssh_extra_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {}))
+ getattr(self._play_context, 'ssh_args', ''),
+ getattr(self._play_context, 'ssh_common_args', ''),
+ getattr(self._play_context, 'ssh_extra_args', '')
)
for term in ansible.utils.shlex.shlex_split(s or '')
]
@@ -696,9 +696,22 @@ class MitogenViaSpec(Spec):
return [
mitogen.core.to_text(term)
for s in (
- C.config.get_config_value("ssh_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})),
- C.config.get_config_value("ssh_common_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {})),
- C.config.get_config_value("ssh_extra_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("vars", {}))
+ (
+ self._host_vars.get('ansible_ssh_args') or
+ getattr(C, 'ANSIBLE_SSH_ARGS', None) or
+ os.environ.get('ANSIBLE_SSH_ARGS')
+ # TODO: ini entry. older versions.
+ ),
+ (
+ self._host_vars.get('ansible_ssh_common_args') or
+ os.environ.get('ANSIBLE_SSH_COMMON_ARGS')
+ # TODO: ini entry.
+ ),
+ (
+ self._host_vars.get('ansible_ssh_extra_args') or
+ os.environ.get('ANSIBLE_SSH_EXTRA_ARGS')
+ # TODO: ini entry.
+ ),
)
for term in ansible.utils.shlex.shlex_split(s)
if s
Encountering the same issue
ansible 2.10.17
mitogen-0.3.2
[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=3m -o ForwardAgent=yes
control_path = ~/.ssh/ansible-%%C
ansible_ssh_jumphost: "{{ hostvars[groups['jumphost_servers'][0]]['ansible_host'] }}"
ansible_ssh_common_args: '-o ProxyCommand="ssh -W %h:%p -q {{ ansible_ssh_user }}@{{ ansible_ssh_jumphost }}"'
kex_exchange_identification: Connection closed by remote host
@dnmvisser Have you found a working solution for this issue? mitogen is a very useful part of our toolset, we'd love to hear if there's a way to make this work. Thank you.
Nope, I ended up creating files with hard-coded IP addresses etc.
We finally got this to work with this combination of versions:
ansible --version
ansible 2.10.17
mitogen-0.3.0rc1
I took the time to inspect further and found a difference in how C.config.get_config_value is called by Ansible vs. Mitogen.
To get the configuration of ssh_common_args, Mitogen calls:
https://github.com/mitogen-hq/mitogen/blob/89c0cc94d16218e2647bb8bb32b011231def0fd7/ansible_mitogen/transport_config.py#L478
Ansible plugins (here: ssh) use a helper, AnsiblePlugin.get_option, which does (if GitHub does not render the code, click the links):
https://github.com/ansible/ansible/blob/b104478f171a4030c0cd96ef4d99db65bf04dceb/lib/ansible/plugins/connection/ssh.py#L743-L744
https://github.com/ansible/ansible/blob/b104478f171a4030c0cd96ef4d99db65bf04dceb/lib/ansible/plugins/init.py#L55-L62
Intercepting these calls to get_config_value reveals that the calls from the official ssh plugin set the variables argument to a dict containing all host variables already resolved (i.e. no longer in their template form, after Jinja2). Mitogen's connection plugin, however, sets the argument to a dict containing (probably) the task variables unresolved (i.e. still in their template form, before Jinja2).
Meaning in practice: Given these example host vars:
ansible_ssh_common_args: "{{ other_var }}"
other_var: "--my-option"
Then the variables argument of get_config_value looks like:
- {…, "ansible_ssh_common_args": "--my-option", …} if called from Ansible's ssh plugin
- {…, "ansible_ssh_common_args": "{{ other_var }}", …} if called from Mitogen's connection plugin
I do not know Ansible's Python code well enough to fix this (probably by resolving the variables properly before passing them to get_config_value), but maybe this helps someone else.
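The resolved-vs-unresolved difference described above can be illustrated with a toy model (plain Python, not Ansible code; the naive `render` helper stands in for Jinja2):

```python
import re

def render(value, variables):
    # Naive stand-in for Jinja2 templating: replace {{ name }} with its value.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}",
                  lambda m: str(variables.get(m.group(1), m.group(0))),
                  value)

host_vars = {
    "ansible_ssh_common_args": "{{ other_var }}",
    "other_var": "--my-option",
}

# What Ansible's ssh plugin passes: values already rendered.
resolved = {k: render(v, host_vars) for k, v in host_vars.items()}

# What Mitogen's connection plugin passes: raw vars, templates intact.
unresolved = dict(host_vars)

print(resolved["ansible_ssh_common_args"])    # → --my-option
print(unresolved["ansible_ssh_common_args"])  # → {{ other_var }}
```

Whichever dict reaches get_config_value as `variables` is what ssh ultimately sees, hence the literal `{{` in the error output.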
Any updates?
@moreati could you please look into this issue?
Any workarounds maybe? I use this with kubespray.
Same issues for while installing kubespray with mitogen 0.3.3.
After playing a little with the Python script and the responsible file (thanks @Zocker1999NET), I found a way to fix it. However, I haven't yet taken the time to check whether the change can have side effects or cause issues; my guess is that hostvars is the per-host view of vars. Hope this is right!
The fix is to replace self._task_vars.get("vars", {})
with self._task_vars.get("hostvars", {}).get(self._inventory_name, {})
in PlayContextSpec, around line 483 (method ssh_args).
Result looks like:
def ssh_args(self):
return [
mitogen.core.to_text(term)
for s in (
C.config.get_config_value("ssh_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("hostvars", {}).get(self._inventory_name, {})),
C.config.get_config_value("ssh_common_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("hostvars", {}).get(self._inventory_name, {})),
C.config.get_config_value("ssh_extra_args", plugin_type="connection", plugin_name="ssh", variables=self._task_vars.get("hostvars", {}).get(self._inventory_name, {}))
)
for term in ansible.utils.shlex.shlex_split(s or '')
]
I won't be able to verify the fix until August, but if someone can play with it, let's share the result! Edit: meaning I was not able to run the playbook to the end to be sure it works, but it definitely fixes the blocking task.
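A toy model of why this patch helps, assuming (as the diagnosis above suggests) that task_vars["hostvars"][inventory_name] holds per-host values already rendered, while task_vars["vars"] may still hold raw templates. The data below is hypothetical, not Ansible internals:

```python
# Hypothetical task_vars structure for one host.
task_vars = {
    "vars": {  # may still contain unrendered templates
        "ansible_ssh_common_args":
            "-o ProxyCommand='ssh -W %h:%p admin@{{ jump_ip }}'",
    },
    "hostvars": {  # per-host view: templates resolved for each host
        "acc_proxy1": {
            "ansible_ssh_common_args":
                "-o ProxyCommand='ssh -W %h:%p admin@192.0.2.10'",
        },
    },
}

inventory_name = "acc_proxy1"

# Before the patch: the raw task vars are handed to get_config_value.
before = task_vars.get("vars", {})
# After the patch: the per-host, already-rendered view is handed over.
after = task_vars.get("hostvars", {}).get(inventory_name, {})

print("{{" in before["ansible_ssh_common_args"])  # → True (template leaks through)
print("{{" in after["ansible_ssh_common_args"])   # → False (rendered value is used)
```

The `.get(..., {})` chaining also keeps the lookup safe when either key is absent.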
@momiji I think this can be a valid fix for this issue. I applied the change to both ssh_args methods in mitogen/ansible_mitogen/transport_config.py
and ran a relatively huge Ansible repo I maintain in check mode and everything seemed fine. It could connect to all hosts expected even with templates in ansible_ssh_common_args and did not report any new diffs or errors. Can you create a PR with this patch so it might be reviewed?
I tested @momiji's patch (much appreciated!) with a simple test and a more complex real-world playbook today, and everything worked as expected.
We will probably be using this patched mitogen for our playbooks until an official fix comes out, so I'll report back here if we do run into any regressions or issues that may be related.
Hello,
Thanks for the patch @momiji. It works for a bastion host with ansible_ssh_common_args in a template.
Unfortunately, after applying the patch to both ssh_args methods in mitogen/ansible_mitogen/transport_config.py, it introduces another issue with the ansible.posix.synchronize module (ansible.posix collection 1.2.0). When using use_ssh_args: true to rsync a folder, templating doesn't seem to work for synchronize. https://docs.ansible.com/ansible/latest/collections/ansible/posix/synchronize_module.html
I have playbook tasks:
tasks:
- name: Sync scripts
ansible.posix.synchronize:
src: ../roles/my_server/files/opt/scripts/
dest: /opt/scripts/
recursive: true
use_ssh_args: true
archive: false
rsync_opts:
- '--chmod=0750'
- '-o'
- '-g'
- '--chown=root:mycustomgroup'
Playbook run error:
{
"rc": 255,
"cmd": "sshpass -d18 /usr/bin/rsync --delay-updates -F --compress --recursive --rsh=/usr/bin/ssh -S none -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ProxyCommand=\"ssh -W %h:%p {{ bastion_user }}@{{ bastion_hostname }} -i $BASTION_SSH_PRIVATE_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null\" --rsync-path=sudo rsync --chmod=0750 -o -g --chown=root:mycustomgroup --out-format=<<CHANGED>>%i %n%L /runner/project/roles/my_server/files/opt/scripts/ ansible@myserver:/opt/scripts/",
"msg": "ssh: Could not resolve hostname {{: Name or service not known\r\nkex_exchange_identification: Connection closed by remote host\r\nrsync: connection unexpectedly closed (0 bytes received so far) [sender]\nrsync error: unexplained error (code 255) at io.c(226) [sender=3.1.3]\n",
"invocation": {
"module_args": {
"src": "/runner/project/roles/my_server/files/opt/scripts/",
"dest": "ansible@myserver:/opt/scripts/",
"recursive": true,
"archive": false,
"rsync_opts": [
"--chmod=0750",
"-o",
"-g",
"--chown=root:mycustomgroup"
],
"_local_rsync_path": "rsync",
"_local_rsync_password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
"private_key": null,
"rsync_path": "sudo rsync",
"ssh_args": "-o ProxyCommand=\"ssh -W %h:%p {{ bastion_user }}@{{ bastion_hostname }} -i $BASTION_SSH_PRIVATE_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null\"",
"delete": false,
"_substitute_controller": false,
"checksum": false,
"compress": true,
"existing_only": false,
"dirs": false,
"copy_links": false,
"set_remote_user": true,
"rsync_timeout": 0,
"ssh_connection_multiplexing": false,
"partial": false,
"verify_host": false,
"mode": "push",
"dest_port": null,
"links": null,
"perms": null,
"times": null,
"owner": null,
"group": null,
"link_dest": null
}
},
"_ansible_no_log": false,
"changed": false
}
I tried this out with ansible 5.10.0 and mitogen 0.3.3, but for me this does not work. I still get the same error (Could not resolve hostname {{:).
Hi all, after reading your remarks, I've been able to make some tests on my side:
- applied patch to both ssh_args instead of only to the first one, as suggested
- run kubespray 1.18.1 installation (using ansible 5.4.0) without any issues
I don't think it's using the bastion feature, so I can't help on the remaining issues.
Hi, also achieved migration to kubespray 1.19.0 (using ansible 5.7.1) with no issues. Next is testing with another playbook (with a higher version of ansible), and if it goes well I'll prepare a PR for this.
Hello, PR #956 sent.
Just to update that:
- with ansible.posix 1.4.0
- applying patch https://github.com/mitogen-hq/mitogen/pull/956 on latest commit https://github.com/mitogen-hq/mitogen/commit/572636a9d3c5a4ac4e8591c42f29763cb56fe602
the ansible.posix.synchronize error above is gone.
Any way to get this merged? I also need to apply the patch to get my setup working...
I've tried once again today and still no luck with:
- ansible v5 (several combinations tried, using ansible-core 2.12.x)
- any mitogen
- Jinja code in ansible_ssh_common_args
If there are people out there who do have a working setup using the above, please post which versions you use:
- ansible --version
- ansible-galaxy collection list
- which mitogen version/commit, and what patch on top of that
If anyone is interested, we use a shell wrapper for ansible-playbook, which allowed me to work around this issue slightly more elegantly than just hardcoding the jumphost IP. We first fetch the IP using the aws CLI and then use that in the ssh-common-args command line argument:
#
export JUMP_IP=$(aws ec2 describe-instances \
--region ${AWS_DEFAULT_REGION} \
--filters "Name=tag:Name,Values=jumphost" \
--query 'Reservations[0].Instances[]' | \
jq -r 'sort_by(.LaunchTime)|reverse|.[0].PublicIpAddress' )
#
ansible-playbook \
--ssh-common-args="-o ProxyCommand='ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -W %h:%p admin@${JUMP_IP}'" \
playbook.yml
I see something very similar with templates used in ansible_host, ansible-core 2.4.14, mitogen master branch:
- hosts: rproxy0
vars:
subnet: "10.10.10.0"
ansible_host: "{{ subnet | ansible.utils.ipmath(100) }}"
tasks:
- debug: var=ansible_host
PLAY [rproxy0] ********************************************************************************************************************************************************************************************
TASK [Gathering Facts] ************************************************************************************************************************************************************************************
Tuesday 11 April 2023 13:44:10 +0100 (0:00:00.150) 0:00:00.150 *********
fatal: [rproxy0]: UNREACHABLE! => {"changed": false, "msg": "EOF on stream; last 100 lines received:\nssh: Could not resolve hostname {{ subnet | ansible.utils.ipmath(100) }}: nodename nor servname provided, or not known\r", "unreachable": true}
I experience a similar issue with a Jinja2 expression in ansible_ssh_user.
Related to, or a duplicate of, #599.
Having taken the time to upgrade our Ansible codebase, it seems we're still blocked by anything above mitogen-0.3.0rc1.
When trying:
ansible==5.7.1
ansible-core==2.12.5
and
mitogen-0.3.2 or
mitogen-0.3.3 or
mitogen-0.3.4
Result:
EOF on stream; last 100 lines received:
kex_exchange_identification: Connection closed by remote host
I tried applying the patch suggested above to ansible_mitogen/transport_config.py, but no go.
If anyone sub'ed here has found a way forward, please do share.