
kubespray fails on weave preparation

Open Talangor opened this issue 3 years ago • 6 comments

Kubespray is failing with the following error:

fatal: [node2]: FAILED! => {"attempts": 4, "changed": true, "cmd": ["/usr/local/bin/nerdctl", "-n", "k8s.io", "pull", "--quiet", "docker.io/weaveworks/weave-npc:2.8.1"], "delta": "0:00:07.330437", "end": "2022-05-27 14:59:01.814763", "msg": "non-zero return code", "rc": 1, "start": "2022-05-27 14:58:54.484326", "stderr": "time=\"2022-05-27T14:59:01Z\" level=fatal msg=\"failed to prepare extraction snapshot \\\"extract-755736618-ZU8W sha256:fba37c1f6ac535cb64fc10ede232434d735c7cf1e61bb89d7b1db79c56c28c91\\\": copying of parent failed: unsupported mode prw-r--r--: %!w(<nil>): unknown\"", "stderr_lines": ["time=\"2022-05-27T14:59:01Z\" level=fatal msg=\"failed to prepare extraction snapshot \\\"extract-755736618-ZU8W sha256:fba37c1f6ac535cb64fc10ede232434d735c7cf1e61bb89d7b1db79c56c28c91\\\": copying of parent failed: unsupported mode prw-r--r--: %!w(<nil>): unknown\""], "stdout": "", "stdout_lines": []}
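As an aside, the failing step can be reproduced directly on the affected node, outside of Ansible, by re-running the pull that the task wraps (command lifted from the error output above; --quiet dropped so the full error is printed, and sudo assumed since nerdctl needs access to containerd's k8s.io namespace):

# re-run the exact pull the kubespray task performs, without --quiet
sudo /usr/local/bin/nerdctl -n k8s.io pull docker.io/weaveworks/weave-npc:2.8.1

If this reproduces the "failed to prepare extraction snapshot ... unsupported mode prw-r--r--" message, the problem sits between nerdctl/containerd and the node's storage rather than in the playbook itself.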

Environment:

  • Cloud provider or hardware configuration:

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"): Linux 5.4.0-66-generic x86_64 NAME="Ubuntu" VERSION="20.04.3 LTS (Focal Fossa)" ID=ubuntu ID_LIKE=debian PRETTY_NAME="Ubuntu 20.04.3 LTS" VERSION_ID="20.04" HOME_URL="https://www.ubuntu.com/" SUPPORT_URL="https://help.ubuntu.com/" BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" VERSION_CODENAME=focal UBUNTU_CODENAME=focal

  • Version of Ansible (ansible --version): ansible [core 2.12.5] config file = /home/ubuntu/kubespray-v2.18.1/ansible.cfg configured module search path = ['/home/ubuntu/kubespray-v2.18.1/library'] ansible python module location = /usr/local/lib/python3.8/dist-packages/ansible ansible collection location = /home/ubuntu/.ansible/collections:/usr/share/ansible/collections executable location = /usr/local/bin/ansible python version = 3.8.10 (default, Mar 15 2022, 12:22:08) [GCC 9.4.0] jinja version = 2.11.3 libyaml = True

  • Version of Python (python --version): Python 3.8.10

Kubespray version (commit) (git rev-parse --short HEAD): 73fc70db

Network plugin used: weave

Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):

https://gist.github.com/Talangor/34ae3e45e8bb55a52e3b500310815043#file-inv-log

Command used to invoke ansible: ansible-playbook -i inventory/pre-production/hosts.yaml --become -u sadmin -K cluster.yml

Output of ansible run:

https://gist.github.com/Talangor/34ae3e45e8bb55a52e3b500310815043#file-ansible-log

Anything else we need to know:

Talangor avatar May 27 '22 15:05 Talangor

This looks like an issue with nerdctl failing to write a file to the local filesystem. Are you using an exotic filesystem on your ansible host?

cristicalin avatar May 29 '22 19:05 cristicalin
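For reference, one way to answer that question is to check which filesystem actually backs containerd's image store on the failing node. A minimal sketch follows; the /var/lib/containerd path is containerd's default root and is an assumption here, since kubespray can relocate it:

# filesystem type backing containerd's data directory
df -T /var/lib/containerd
stat -f -c %T /var/lib/containerd
# overall block device and filesystem layout
lsblk -f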

Same issue here. No exotic file system, just a software RAID 1.

node1 ~ # lsblk -f
NAME    FSTYPE          LABEL    UUID                                 MOUNTPOINT
sda                                                                   
├─sda1  linux_raid_memb rescue:0 55c5d9d2-5e39-6b25-1080-12b97a0e8740 
│ └─md0 swap                     77ce6d25-e2e6-482f-ab41-eb00317ebb49 
├─sda2  linux_raid_memb rescue:1 19cd9761-c4ca-aec1-0072-992b4fa10f79 
│ └─md1 ext3                     8dbecd76-1824-4f77-8630-f613ab0f9372 /boot
└─sda3  linux_raid_memb rescue:2 861de6b2-13ee-c983-4b27-a0b07aecf772 
  └─md2 ext4                     d0f3cabb-a808-403b-bdba-d86485f6b6bc /
sdb                                                                   
├─sdb1  linux_raid_memb rescue:0 55c5d9d2-5e39-6b25-1080-12b97a0e8740 
│ └─md0 swap                     77ce6d25-e2e6-482f-ab41-eb00317ebb49 
├─sdb2  linux_raid_memb rescue:1 19cd9761-c4ca-aec1-0072-992b4fa10f79 
│ └─md1 ext3                     8dbecd76-1824-4f77-8630-f613ab0f9372 /boot
└─sdb3  linux_raid_memb rescue:2 861de6b2-13ee-c983-4b27-a0b07aecf772 
  └─md2 ext4                     d0f3cabb-a808-403b-bdba-d86485f6b6bc /

Citrullin avatar Jun 09 '22 06:06 Citrullin
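Ext4 on top of md-raid is an ordinary setup, so another angle worth checking (a hypothesis only, not confirmed anywhere in this thread) is which snapshotter containerd is actually using: the "copying of parent failed" wording suggests a copy-based snapshotter such as native rather than overlayfs. A quick way to inspect this on the node:

# list containerd plugins and their status; the snapshotter plugins appear here
sudo ctr plugins ls
# see whether a snapshotter is pinned in the containerd config (file may not exist on a default install)
grep -n snapshotter /etc/containerd/config.toml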

@Talangor Since you haven't replied in a while, how did you get around this?

Citrullin avatar Jun 09 '22 07:06 Citrullin

Opened an issue on the nerdctl repo.

Thanks, just saw this one in the CI; will keep an eye on this issue.

floryut avatar Jun 14 '22 11:06 floryut

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Sep 12 '22 12:09 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Oct 12 '22 12:10 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Nov 11 '22 12:11 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Nov 11 '22 12:11 k8s-ci-robot