image-builder
Building Flatcar SIG images on Azure with OpenSSH 9.0 fails
What steps did you take and what happened:
Running FLATCAR_VERSION=current make build-azure-sig-flatcar
on version a09b089b1b344d75275cf741cafdbd877050f660 currently fails with the following error:
sig-flatcar: Setting up proxy adapter for Ansible....
==> sig-flatcar: Executing Ansible: ansible-playbook -e packer_build_name="sig-flatcar" -e packer_builder_type=azure-arm --ssh-extra-args '-o IdentitiesOnly=yes' --extra-vars containerd_url=https://github.com/containerd/containerd/releases/download/v1.6.1/cri-containerd-cni-1.6.1-linux-amd64.tar.gz containerd_sha256=e01da1ad4a41a71e0fef52b1f0ed08980b808f1d7c904c9956c24afb8236d6f0 pause_image=k8s.gcr.io/pause:3.6 containerd_additional_settings= containerd_cri_socket=/var/run/containerd/containerd.sock containerd_version=1.6.1 crictl_url=https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.23.0/crictl-v1.23.0-linux-amd64.tar.gz crictl_sha256=b754f83c80acdc75f93aba191ff269da6be45d0fc2d3f4079704e7d1424f1ca8 crictl_source_type=http custom_role= custom_role_names="" disable_public_repos=false extra_debs= extra_repos= extra_rpms= http_proxy= https_proxy= kubeadm_template=etc/kubeadm.yml kubernetes_cni_http_source=https://github.com/containernetworking/plugins/releases/download kubernetes_cni_http_checksum=sha256:https://storage.googleapis.com/k8s-artifacts-cni/release/v0.8.7/cni-plugins-linux-amd64-v0.8.7.tgz.sha256 kubernetes_http_source=https://dl.k8s.io/release kubernetes_container_registry=k8s.gcr.io kubernetes_rpm_repo=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64 kubernetes_rpm_gpg_key="https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg" kubernetes_rpm_gpg_check=True kubernetes_deb_repo="https://apt.kubernetes.io/ kubernetes-xenial" kubernetes_deb_gpg_key=https://packages.cloud.google.com/apt/doc/apt-key.gpg kubernetes_cni_deb_version=0.8.7-00 kubernetes_cni_rpm_version=0.8.7-0 kubernetes_cni_semver=v0.8.7 kubernetes_cni_source_type=http kubernetes_semver=v1.21.10 kubernetes_source_type=http kubernetes_load_additional_imgs=false kubernetes_deb_version=1.21.10-00 kubernetes_rpm_version=1.21.10-0 no_proxy= pip_conf_file= python_path=/opt/pypy/site-packages 
redhat_epel_rpm=https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm epel_rpm_gpg_key= reenable_public_repos=true remove_extra_repos=false systemd_prefix=/etc/systemd sysusr_prefix=/opt sysusrlocal_prefix=/opt load_additional_components=false additional_registry_images=false additional_registry_images_list= additional_url_images=false additional_url_images_list= additional_executables=false additional_executables_list= additional_executables_destination_path= build_target=virt --extra-vars ansible_python_interpreter=/opt/pypy/bin/pypy --extra-vars -e ansible_ssh_private_key_file=/tmp/ansible-key1464663093 -i /tmp/packer-provisioner-ansible1359111898 /home/invidian/data/workspaces/clusterapi-flatcar/image-builder/images/capi/ansible/node.yml
sig-flatcar:
sig-flatcar: PLAY [all] *********************************************************************
sig-flatcar:
sig-flatcar: TASK [Gathering Facts] *********************************************************
sig-flatcar: fatal: [default]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via scp: Warning: Permanently added '[127.0.0.1]:40711' (RSA) to the list of known hosts.\r\nbash: line 1: /usr/lib/sftp-server: No such file or directory\nscp: Connection closed\r\n", "unreachable": true}
sig-flatcar:
sig-flatcar: PLAY RECAP *********************************************************************
sig-flatcar: default : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
<omitted>
Build 'sig-flatcar' errored after 5 minutes 873 milliseconds: Error executing Ansible: Non-zero exit status: exit status 4
What did you expect to happen:
The build to succeed.
Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]
Environment:
Project (Image Builder for Cluster API):
Additional info for Image Builder for Cluster API related issues:
- OS (e.g. from /etc/os-release, or cmd /c ver): Arch Linux
- Packer Version: 1.8.0
- Packer Provider:
- Ansible Version: core 2.11.5
- Cluster-api version (if using):
- Kubernetes version: (use kubectl version):
/kind bug [One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]
Ooh, wait. It might be an issue with my local machine: I don't have the sftp-server binary either after the latest OpenSSH update to version 9.0p1-1
...
So downgrading my local openssh to 8.9p1-1 makes it work. I'll keep investigating.
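For anyone who wants to check whether their local client is affected: OpenSSH 9.0 switched scp to the SFTP protocol by default, which matches the 8.9p1 vs 9.0p1 behaviour described above. Below is a minimal sketch of such a check. The function name scp_defaults_to_sftp is made up for illustration, and it assumes the usual "OpenSSH_X.Y..." banner format printed by ssh -V; other banner formats are simply treated as "not affected".

```shell
#!/bin/sh
# Hypothetical helper (not part of image-builder): succeeds if the given
# `ssh -V` banner reports OpenSSH 9.0 or newer, where scp uses SFTP by default.
scp_defaults_to_sftp() {
    banner=$1
    # Extract the major version, e.g. "OpenSSH_9.0p1, ..." -> "9".
    # Banners that do not start with "OpenSSH_<digits>." yield an empty string.
    major=$(printf '%s\n' "$banner" | sed -n 's/^OpenSSH_\([0-9][0-9]*\)\..*/\1/p')
    [ -n "$major" ] && [ "$major" -ge 9 ]
}

# Usage: `ssh -V` prints its banner on stderr, so redirect it to stdout.
if command -v ssh >/dev/null 2>&1; then
    if scp_defaults_to_sftp "$(ssh -V 2>&1)"; then
        echo "local scp uses SFTP by default; the legacy protocol needs -O"
    else
        echo "local scp uses the legacy SCP protocol by default"
    fi
fi
```

If the check reports SFTP by default, the build will hit the error above whenever the remote side has no sftp-server at the expected path.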
Based on: https://github.com/hashicorp/packer/issues/11783#issuecomment-1137052770
Replacing:
"extra_arguments": [
"--extra-vars",
"{{user `ansible_common_vars`}}",
"--extra-vars",
"{{user `ansible_extra_vars`}}"
],
with
"extra_arguments": [
"--scp-extra-args", "'-O'",
"--extra-vars",
"{{user `ansible_common_vars`}}",
"--extra-vars",
"{{user `ansible_extra_vars`}}"
],
did the trick for me.
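To double-check that the template Packer actually reads contains the change, something like the following could be used. It is only a sketch: the function name is made up, the template path is left for you to fill in, and it does a plain text search rather than parsing the JSON.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given Packer template mentions
# "--scp-extra-args" (as a quoted JSON string) anywhere in the file.
template_sets_scp_extra_args() {
    template=$1
    [ -f "$template" ] && grep -q -e '"--scp-extra-args"' "$template"
}

# Example (the path is an assumption; point it at the template you build from):
# template_sets_scp_extra_args path/to/packer.json && echo "workaround present"
```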
So I think the root cause lies in the Ansible provisioner for Packer: https://github.com/hashicorp/packer-plugin-ansible/issues/100.
As a workaround, we could try disabling the proxy for the provisioner, but I guess it may break some other scenarios. Alternatively, we could use the workaround proposed by @kopiczko above.
Hello guys,
Did you manage to fix this or find a solution? I'm facing the same issue with VMware vSphere templates:
vsphere-clone.MGlobal: Setting up proxy adapter for Ansible....
==> vsphere-clone.MGlobal: Executing Ansible: ansible-playbook -e packer_build_name="MGlobal" -e packer_builder_type=vsphere-clone -e packer_http_addr=192.168.100.253:0 --ssh-extra-args '-o IdentitiesOnly=yes' -v -e ansible_ssh_private_key_file=/tmp/ansible-key3249663531 -i /tmp/packer-provisioner-ansible3844623631 /home/gitlab-runner/builds/NrNECaSf/0/manuh/vmug-demo-packer/default-config.yml
vsphere-clone.MGlobal: Using /etc/ansible/ansible.cfg as config file
vsphere-clone.MGlobal:
vsphere-clone.MGlobal: PLAY [all] *********************************************************************
vsphere-clone.MGlobal:
vsphere-clone.MGlobal: TASK [Create groups] ***********************************************************
vsphere-clone.MGlobal: failed: [default] (item={'name': 'local'}) => {"ansible_loop_var": "item", "item": {"name": "local"}, "msg": "Failed to connect to the host via scp: bash: /usr/lib/sftp-server: No such file or directory\nscp: Connection closed\r\n", "unreachable": true}
vsphere-clone.MGlobal: failed: [default] (item={'name': 'admins'}) => {"ansible_loop_var": "item", "item": {"name": "admins"}, "msg": "Failed to connect to the host via scp: bash: /usr/lib/sftp-server: No such file or directory\nscp: Connection closed\r\n", "unreachable": true}
vsphere-clone.MGlobal: fatal: [default]: UNREACHABLE! => {"changed": false, "msg": "All items completed", "results": [{"ansible_loop_var": "item", "item": {"name": "local"}, "msg": "Failed to connect to the host via scp: bash: /usr/lib/sftp-server: No such file or directory\nscp: Connection closed\r\n", "unreachable": true}, {"ansible_loop_var": "item", "item": {"name": "admins"}, "msg": "Failed to connect to the host via scp: bash: /usr/lib/sftp-server: No such file or directory\nscp: Connection closed\r\n", "unreachable": true}]}
vsphere-clone.MGlobal:
vsphere-clone.MGlobal: PLAY RECAP *********************************************************************
vsphere-clone.MGlobal: default : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
@manuh-L, please give the changes from https://github.com/kubernetes-sigs/image-builder/pull/907 a try; we would appreciate your feedback :)
It looks like the newest scp versions use SFTP under the hood:
-O Use the legacy SCP protocol for file transfers instead of the
SFTP protocol. Forcing the use of the SCP protocol may be
necessary for servers that do not implement SFTP, for
backwards-compatibility for particular filename wildcard
patterns and for expanding paths with a ‘~’ prefix for older
SFTP servers.
It would be nice to get SFTP to work with Flatcar instead. I think that would be the ultimate solution for this issue.
Thanks
@manuh-L give a try changes from #907, we will appreciate feedback :)
Thanks, I had a demo to present, so for me the quick fix at the time was to downgrade. I'll try it ASAP.