
Building Flatcar SIG images on Azure with OpenSSH 9.0 fails

Open invidian opened this issue 2 years ago • 8 comments

What steps did you take and what happened:

Running FLATCAR_VERSION=current make build-azure-sig-flatcar on version a09b089b1b344d75275cf741cafdbd877050f660 currently fails with the following error:

    sig-flatcar: Setting up proxy adapter for Ansible....
==> sig-flatcar: Executing Ansible: ansible-playbook -e packer_build_name="sig-flatcar" -e packer_builder_type=azure-arm --ssh-extra-args '-o IdentitiesOnly=yes' --extra-vars containerd_url=https://github.com/containerd/containerd/releases/download/v1.6.1/cri-containerd-cni-1.6.1-linux-amd64.tar.gz containerd_sha256=e01da1ad4a41a71e0fef52b1f0ed08980b808f1d7c904c9956c24afb8236d6f0 pause_image=k8s.gcr.io/pause:3.6 containerd_additional_settings= containerd_cri_socket=/var/run/containerd/containerd.sock containerd_version=1.6.1 crictl_url=https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.23.0/crictl-v1.23.0-linux-amd64.tar.gz crictl_sha256=b754f83c80acdc75f93aba191ff269da6be45d0fc2d3f4079704e7d1424f1ca8 crictl_source_type=http custom_role= custom_role_names="" disable_public_repos=false extra_debs= extra_repos= extra_rpms= http_proxy= https_proxy= kubeadm_template=etc/kubeadm.yml kubernetes_cni_http_source=https://github.com/containernetworking/plugins/releases/download kubernetes_cni_http_checksum=sha256:https://storage.googleapis.com/k8s-artifacts-cni/release/v0.8.7/cni-plugins-linux-amd64-v0.8.7.tgz.sha256 kubernetes_http_source=https://dl.k8s.io/release kubernetes_container_registry=k8s.gcr.io kubernetes_rpm_repo=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64 kubernetes_rpm_gpg_key="https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg" kubernetes_rpm_gpg_check=True kubernetes_deb_repo="https://apt.kubernetes.io/ kubernetes-xenial" kubernetes_deb_gpg_key=https://packages.cloud.google.com/apt/doc/apt-key.gpg kubernetes_cni_deb_version=0.8.7-00 kubernetes_cni_rpm_version=0.8.7-0 kubernetes_cni_semver=v0.8.7 kubernetes_cni_source_type=http kubernetes_semver=v1.21.10 kubernetes_source_type=http kubernetes_load_additional_imgs=false kubernetes_deb_version=1.21.10-00 kubernetes_rpm_version=1.21.10-0 no_proxy= pip_conf_file= python_path=/opt/pypy/site-packages 
redhat_epel_rpm=https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm epel_rpm_gpg_key= reenable_public_repos=true remove_extra_repos=false systemd_prefix=/etc/systemd sysusr_prefix=/opt sysusrlocal_prefix=/opt load_additional_components=false additional_registry_images=false additional_registry_images_list= additional_url_images=false additional_url_images_list= additional_executables=false additional_executables_list= additional_executables_destination_path= build_target=virt --extra-vars ansible_python_interpreter=/opt/pypy/bin/pypy --extra-vars  -e ansible_ssh_private_key_file=/tmp/ansible-key1464663093 -i /tmp/packer-provisioner-ansible1359111898 /home/invidian/data/workspaces/clusterapi-flatcar/image-builder/images/capi/ansible/node.yml
    sig-flatcar:
    sig-flatcar: PLAY [all] *********************************************************************
    sig-flatcar:
    sig-flatcar: TASK [Gathering Facts] *********************************************************
    sig-flatcar: fatal: [default]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via scp: Warning: Permanently added '[127.0.0.1]:40711' (RSA) to the list of known hosts.\r\nbash: line 1: /usr/lib/sftp-server: No such file or directory\nscp: Connection closed\r\n", "unreachable": true}
    sig-flatcar:
    sig-flatcar: PLAY RECAP *********************************************************************
    sig-flatcar: default                    : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0
<omitted>
Build 'sig-flatcar' errored after 5 minutes 873 milliseconds: Error executing Ansible: Non-zero exit status: exit status 4

What did you expect to happen:

Build to succeed.

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

Environment:

Project (Image Builder for Cluster API):

Additional info for Image Builder for Cluster API related issues:

  • OS (e.g. from /etc/os-release, or cmd /c ver): Arch Linux
  • Packer Version: 1.8.0
  • Packer Provider:
  • Ansible Version: core 2.11.5
  • Cluster-api version (if using):
  • Kubernetes version: (use kubectl version):

/kind bug [One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]

invidian avatar Apr 13 '22 16:04 invidian

Ooh, wait. It might be an issue with my local machine: I don't have the sftp-server binary either after the latest OpenSSH update to version 9.0p1-1...

invidian avatar Apr 13 '22 17:04 invidian

So downgrading my local openssh to 8.9p1-1 makes it work. I'll keep investigating.
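
For anyone wanting to check whether their local client is in the affected range, here is a minimal sketch. It assumes the `ssh -V` banner has the usual `OpenSSH_9.0p1, ...` form; the function name `scp_defaults_to_sftp` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the given `ssh -V` banner
# reports OpenSSH 9.0 or newer, where scp uses the SFTP protocol by default.
scp_defaults_to_sftp() {
    banner=$1
    # Extract the major version, e.g. "OpenSSH_9.0p1, ..." -> "9".
    major=$(printf '%s\n' "$banner" | sed -n 's/^OpenSSH_\([0-9][0-9]*\)\..*/\1/p')
    # An unparsable banner leaves $major empty and counts as "not affected".
    [ -n "$major" ] && [ "$major" -ge 9 ]
}

# `ssh -V` writes its banner to stderr, hence the redirection.
if scp_defaults_to_sftp "$(ssh -V 2>&1)"; then
    echo "OpenSSH 9.0+ detected: scp defaults to SFTP, legacy mode (-O) may be needed"
else
    echo "OpenSSH 9.0+ not detected"
fi
```

The check only looks at the major version, which is enough to separate 8.9p1 from 9.0p1 as in the downgrade above.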

invidian avatar Apr 14 '22 08:04 invidian

Based on: https://github.com/hashicorp/packer/issues/11783#issuecomment-1137052770

Replacing:

      "extra_arguments": [
        "--extra-vars",
        "{{user `ansible_common_vars`}}",
        "--extra-vars",
        "{{user `ansible_extra_vars`}}"
      ],

with

      "extra_arguments": [
        "--scp-extra-args", "'-O'",
        "--extra-vars",
        "{{user `ansible_common_vars`}}",
        "--extra-vars",
        "{{user `ansible_extra_vars`}}"
      ],

did the trick for me.
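
An alternative that avoids editing the Packer template might be to set the same option through Ansible's configuration. The sketch below is untested under Packer and rests on two assumptions: that the `ansible-playbook` process started by Packer inherits `ANSIBLE_CONFIG` from the environment, and that a throwaway config directory is acceptable. `scp_extra_args` in the `[ssh_connection]` section is the config-file counterpart of `--scp-extra-args`.

```shell
#!/bin/sh
# Sketch: write a throwaway ansible.cfg that sets scp_extra_args, as an
# alternative to passing --scp-extra-args through Packer's extra_arguments.
# The temporary directory is an arbitrary choice for illustration.
cfg_dir=$(mktemp -d)
cat > "$cfg_dir/ansible.cfg" <<'EOF'
[ssh_connection]
# Ask scp to use the legacy SCP protocol instead of SFTP.
scp_extra_args = -O
EOF

# Point Ansible at the generated file for commands run from this shell.
export ANSIBLE_CONFIG="$cfg_dir/ansible.cfg"
echo "ANSIBLE_CONFIG=$ANSIBLE_CONFIG"
```

Ansible loads only one configuration file, so any settings from an existing ansible.cfg would need to be copied into the generated one.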

kopiczko avatar May 27 '22 18:05 kopiczko

So I think the root cause lies in the Ansible provisioner for Packer: https://github.com/hashicorp/packer-plugin-ansible/issues/100.

As a workaround, we could try disabling the proxy for the provisioner, but I guess that may break some other scenarios. Alternatively, we could use the workaround proposed by @kopiczko above.

invidian avatar May 30 '22 16:05 invidian

Hello guys,

Did you manage to fix this or find a solution? I'm facing the same issue with VMware vSphere templates.

    vsphere-clone.MGlobal: Setting up proxy adapter for Ansible....
==> vsphere-clone.MGlobal: Executing Ansible: ansible-playbook -e packer_build_name="MGlobal" -e packer_builder_type=vsphere-clone -e packer_http_addr=192.168.100.253:0 --ssh-extra-args '-o IdentitiesOnly=yes' -v -e ansible_ssh_private_key_file=/tmp/ansible-key3249663531 -i /tmp/packer-provisioner-ansible3844623631 /home/gitlab-runner/builds/NrNECaSf/0/manuh/vmug-demo-packer/default-config.yml
    vsphere-clone.MGlobal: Using /etc/ansible/ansible.cfg as config file
    vsphere-clone.MGlobal:
    vsphere-clone.MGlobal: PLAY [all] *********************************************************************
    vsphere-clone.MGlobal:
    vsphere-clone.MGlobal: TASK [Create groups] ***********************************************************
    vsphere-clone.MGlobal: failed: [default] (item={'name': 'local'}) => {"ansible_loop_var": "item", "item": {"name": "local"}, "msg": "Failed to connect to the host via scp: bash: /usr/lib/sftp-server: No such file or directory\nscp: Connection closed\r\n", "unreachable": true}
    vsphere-clone.MGlobal: failed: [default] (item={'name': 'admins'}) => {"ansible_loop_var": "item", "item": {"name": "admins"}, "msg": "Failed to connect to the host via scp: bash: /usr/lib/sftp-server: No such file or directory\nscp: Connection closed\r\n", "unreachable": true}
    vsphere-clone.MGlobal: fatal: [default]: UNREACHABLE! => {"changed": false, "msg": "All items completed", "results": [{"ansible_loop_var": "item", "item": {"name": "local"}, "msg": "Failed to connect to the host via scp: bash: /usr/lib/sftp-server: No such file or directory\nscp: Connection closed\r\n", "unreachable": true}, {"ansible_loop_var": "item", "item": {"name": "admins"}, "msg": "Failed to connect to the host via scp: bash: /usr/lib/sftp-server: No such file or directory\nscp: Connection closed\r\n", "unreachable": true}]}
    vsphere-clone.MGlobal:
    vsphere-clone.MGlobal: PLAY RECAP *********************************************************************
    vsphere-clone.MGlobal: default : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0

manuh-L avatar Jul 11 '22 06:07 manuh-L

@manuh-L, please give the changes from https://github.com/kubernetes-sigs/image-builder/pull/907 a try. We would appreciate your feedback :)

invidian avatar Jul 11 '22 08:07 invidian

It looks like scp in OpenSSH 9.0 and newer uses the SFTP protocol by default. From the scp man page:

        -O Use the legacy SCP protocol for file transfers instead of the
           SFTP protocol. Forcing the use of the SCP protocol may be
           necessary for servers that do not implement SFTP, for
           backwards-compatibility for particular filename wildcard
           patterns and for expanding paths with a ‘~’ prefix for older
           SFTP servers.

It would be nice to get SFTP to work with Flatcar instead. I think that would be the ultimate solution for this issue.

kopiczko avatar Jul 13 '22 18:07 kopiczko

Thanks

@manuh-L give a try changes from #907, we will appreciate feedback :)

Thanks, I had a demo to present, so the quick fix for me at the time was to downgrade. I'll try it ASAP.

manuh-L avatar Jul 18 '22 19:07 manuh-L

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Oct 16 '22 20:10 k8s-triage-robot

/remove-lifecycle stale

invidian avatar Oct 16 '22 22:10 invidian