Control Plane Fails to Start on Fedora 34 Silverblue

Open adambkaplan opened this issue 4 years ago • 10 comments

What happened:

When using the podman provider, the KinD control plane fails to come up on Fedora 34 Silverblue.

What you expected to happen:

Control plane to start using rootless Podman

How to reproduce it (as minimally and precisely as possible):

  1. Install Fedora 34 Silverblue
  2. Install KinD

Anything else we need to know?:

Environment:

  • kind version: (use kind version):

kind v0.11.0 go1.16.4 linux/amd64

  • Kubernetes version: (use kubectl version):

N/A

  • Docker version: (use docker info):
podman info
host:
  arch: amd64
  buildahVersion: 1.22.3
  cgroupControllers: []
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.29-2.fc34.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.29, commit: '
  cpus: 12
  distribution:
    distribution: fedora
    version: "34"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.14.9-200.fc34.x86_64
  linkmode: dynamic
  memFree: 2201493504
  memTotal: 33377628160
  ociRuntime:
    name: crun
    package: crun-1.1-1.fc34.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.1
      commit: 5b341a145c4f515f96f55e3e7760d1c79ec3cf1f
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.12-2.fc34.x86_64
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 5126479872
  swapTotal: 8589930496
  uptime: 294h 53m 38.04s (Approximately 12.25 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /var/home/adkaplan/.config/containers/storage.conf
  containerStore:
    number: 26
    paused: 0
    running: 1
    stopped: 25
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.7.1-2.fc34.x86_64
      Version: |-
        fusermount3 version: 3.10.4
        fuse-overlayfs: version 1.7.1
        FUSE library version 3.10.4
        using FUSE kernel interface version 7.31
  graphRoot: /var/home/adkaplan/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 398
  runRoot: /run/user/1000/containers
  volumePath: /var/home/adkaplan/.local/share/containers/storage/volumes
version:
  APIVersion: 3.3.1
  Built: 1630356396
  BuiltTime: Mon Aug 30 16:46:36 2021
  GitCommit: ""
  GoVersion: go1.16.6
  OsArch: linux/amd64
  Version: 3.3.1

  • OS (e.g. from /etc/os-release):

NAME=Fedora
VERSION="34.20211006.0 (Silverblue)"
ID=fedora
VERSION_ID=34
VERSION_CODENAME=""
PLATFORM_ID="platform:f34"
PRETTY_NAME="Fedora 34.20211006.0 (Silverblue)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:34"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora-silverblue/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=34
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=34
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Silverblue"
VARIANT_ID=silverblue
OSTREE_VERSION='34.20211006.0'

adambkaplan avatar Nov 02 '21 23:11 adambkaplan

I see fuse-overlayfs, is this rootless podman? https://kind.sigs.k8s.io/docs/user/rootless/

BenTheElder avatar Nov 02 '21 23:11 BenTheElder

Can you run kind create cluster --retain (which prevents cleanup on failure) and then capture / share kind export logs? You can then clean up with kind delete cluster.
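
The flow requested above can be sketched as a small shell script. This is only an illustration: the `run` helper, the `DRY_RUN` switch, and the `/tmp/kind-logs` directory are assumptions added here and are not part of kind. The three kind subcommands are the ones named in the comment.

```shell
# Sketch of the debugging flow requested above.
# Set DRY_RUN=1 to print the commands instead of running them
# (handy when kind is not installed on the machine at hand).
# If kind does not pick podman on its own, exporting
# KIND_EXPERIMENTAL_PROVIDER=podman selects the podman provider.

# run: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# collect_kind_logs LOG_DIR: create a cluster without cleanup on failure,
# export the node logs to LOG_DIR, then delete the cluster.
collect_kind_logs() {
  log_dir="${1:-./kind-logs}"   # hypothetical default output directory
  # --retain keeps the node container around when creation fails,
  # so a failed create must not stop the log export.
  run kind create cluster --retain || true
  run kind export logs "$log_dir"
  run kind delete cluster
}

# Example: show what would be executed, without touching any cluster.
DRY_RUN=1 collect_kind_logs /tmp/kind-logs
```

Running the export before the delete matters here: once the cluster is deleted, the retained node container and its logs are gone.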

BenTheElder avatar Nov 02 '21 23:11 BenTheElder

> I see fuse-overlayfs, is this rootless podman? https://kind.sigs.k8s.io/docs/user/rootless/

This is rootless podman, and I believe I configured the delegate.conf and iptables.conf files in /etc per the instructions.
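
One way to sanity-check whether cgroup delegation took effect is to look at the cgroupControllers field of podman info. In the output posted in the issue description it is an empty list, while the working Fedora 35 output later in the thread lists memory and pids. The sketch below reads podman info YAML on stdin and reports what it finds. The function names are made up for this example, and treating an empty list as "delegation not in effect" is an assumption, not a confirmed diagnosis for this issue.

```shell
# list_cgroup_controllers: read `podman info` YAML on stdin and print one
# cgroup controller per line (prints nothing for an empty list).
list_cgroup_controllers() {
  awk '
    /^[ \t]*cgroupControllers:/ { in_list = 1; next }
    in_list && /^[ \t]*-[ \t]/  { sub(/^[ \t]*-[ \t]*/, ""); print; next }
    in_list                     { exit }
  '
}

# check_delegation: succeed if at least one controller is listed.
check_delegation() {
  controllers="$(list_cgroup_controllers)"
  if [ -z "$controllers" ]; then
    echo "no cgroup controllers delegated"
    return 1
  fi
  echo "delegated controllers: $(printf '%s' "$controllers" | tr '\n' ' ')"
}

# Example with the relevant lines from the issue description.
# On a live system this would be: podman info | check_delegation
check_delegation <<'EOF' || true
host:
  arch: amd64
  cgroupControllers: []
  cgroupManager: systemd
EOF
```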

> capture / share kind export logs

I have posted them in this gist: https://gist.github.com/adambkaplan/13e99a24c4d1f98475739a672a7ae019. I see several filesystem mount errors, and I wonder if fuse-overlayfs is the culprit.

adambkaplan avatar Nov 03 '21 15:11 adambkaplan

Also added results of the systemctl hints from the main container process to the gist.

adambkaplan avatar Nov 03 '21 15:11 adambkaplan

I think we've had some other recent issues with fuse-overlayfs

BenTheElder avatar Nov 03 '21 16:11 BenTheElder

I would try kind from master first; this was merged recently: https://github.com/kubernetes-sigs/kind/pull/2278

aojea avatar Nov 03 '21 21:11 aojea

If it helps in any way, I am possibly experiencing the same issue as the OP on Fedora 34 (also with rootless podman). Trying with kind installed from latest main (at the time of writing, commit c8b759f2e1444c0175259943899d768853e17241) results in the same error as before. Here's the log export: https://gist.github.com/mrWinston/d1414bfe3e9c553631d9c4731efc4773

Let me know if I can help with debugging in any way.

mrWinston avatar Nov 18 '21 08:11 mrWinston

Nov 18 08:23:01 kind-control-plane kubelet[3981]: W1118 08:23:01.649183 3981 fs.go:588] stat failed on /dev/mapper/luks-0d827a3f-2fc6-46cc-9f14-1c151674f80f with error: no such file or directory

@mrWinston you are hitting this problem: https://github.com/kubernetes-sigs/kind/issues/2411. That is not really something KIND can fix :/
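
The kubelet warning quoted above names a device path that does not exist inside the node container. A rough sketch for pulling such paths out of exported logs follows. The function name and the `./kind-logs` directory in the usage comment are hypothetical, and the pattern only matches the exact message wording shown in the quoted line.

```shell
# stat_failed_devices: read kubelet log lines on stdin and print each
# device path that kubelet failed to stat, de-duplicated.
stat_failed_devices() {
  sed -n 's/.*stat failed on \([^ ]*\) with error: no such file or directory.*/\1/p' | sort -u
}

# Example with the log line quoted in the thread.
# Against a log export this could be:
#   grep -rh 'stat failed on' ./kind-logs | stat_failed_devices
stat_failed_devices <<'EOF'
Nov 18 08:23:01 kind-control-plane kubelet[3981]: W1118 08:23:01.649183 3981 fs.go:588] stat failed on /dev/mapper/luks-0d827a3f-2fc6-46cc-9f14-1c151674f80f with error: no such file or directory
EOF
```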

aojea avatar Nov 18 '21 09:11 aojea

@aojea oh, yes, you're right. Applying the workaround from the linked issue does indeed solve it for me. Thanks for the quick reply!

mrWinston avatar Nov 18 '21 12:11 mrWinston

This worked for me given this configuration (Fedora 35 with rootless Podman):

$ cat /etc/os-release                                                                
NAME="Fedora Linux"
VERSION="35 (Workstation Edition)"
ID=fedora
VERSION_ID=35
VERSION_CODENAME=""
PLATFORM_ID="platform:f35"
PRETTY_NAME="Fedora Linux 35 (Workstation Edition)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:35"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f35/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=35
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=35
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Workstation Edition"
VARIANT_ID=workstation
$ podman version          
Version:      3.4.4
API Version:  3.4.4
Go Version:   go1.16.8
Built:        Wed Dec  8 22:45:07 2021
OS/Arch:      linux/amd64
$ podman info             
host:
  arch: amd64
  buildahVersion: 1.23.1
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.0-2.fc35.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: '
  cpus: 32
  distribution:
    distribution: fedora
    variant: workstation
    version: "35"
  eventLogger: journald
  hostname: mbana-pc3
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.16.7-200.fc35.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 40270614528
  memTotal: 67404099584
  ociRuntime:
    name: crun
    package: crun-1.4.2-1.fc35.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.4.2
      commit: f6fbc8f840df1a414f31a60953ae514fa497c748
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.12-2.fc35.x86_64
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 8589930496
  swapTotal: 8589930496
  uptime: 8h 14m 58.74s (Approximately 0.33 days)
plugins:
  log:
  - k8s-file
  - none
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /home/mbana/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: btrfs
  graphOptions: {}
  graphRoot: /home/mbana/.local/share/containers/storage
  graphStatus:
    Build Version: 'Btrfs v5.15.1 '
    Library Version: "102"
  imageStore:
    number: 2
  runRoot: /run/user/1000/containers
  volumePath: /home/mbana/.local/share/containers/storage/volumes
version:
  APIVersion: 3.4.4
  Built: 1638999907
  BuiltTime: Wed Dec  8 22:45:07 2021
  GitCommit: ""
  GoVersion: go1.16.8
  OsArch: linux/amd64
  Version: 3.4.4

$ kind --version     
kind version 0.11.1

Then

tee ~/.config/containers/storage.conf <<'EOF'
[storage]
driver = "btrfs"
EOF

systemctl restart --now podman
podman system reset

tee config.yml <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- extraMounts:
  - hostPath: /dev/nvme0n1p1
    containerPath: /dev/nvme0n1p1
EOF
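
The hostPath in the config above is specific to the poster's machine, so it may be worth checking that each hostPath exists on your own host before creating the cluster. The sketch below does that for a kind config read on stdin. The function names are invented for this example. The comment does not show the create command itself, so the one in the usage comment is an assumption: the cluster name test-1 is inferred from the kind-test-1 context used later in the comment.

```shell
# host_paths: print every hostPath value from a kind config read on stdin.
host_paths() {
  awk '$1 == "hostPath:" || ($1 == "-" && $2 == "hostPath:") { print $NF }'
}

# check_host_paths: report hostPath entries that do not exist on this host.
# Returns non-zero if any path is missing. Paths containing spaces are not
# handled, which is fine for device nodes such as /dev/nvme0n1p1.
check_host_paths() {
  missing=0
  for path in $(host_paths); do
    if [ ! -e "$path" ]; then
      echo "missing on host: $path"
      missing=1
    fi
  done
  return "$missing"
}

# Example with the config from the comment above. The result depends on
# whether this machine has the same NVMe partition.
check_host_paths <<'EOF' || true
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- extraMounts:
  - hostPath: /dev/nvme0n1p1
    containerPath: /dev/nvme0n1p1
EOF

# Assumed follow-up step (not shown in the original comment):
#   check_host_paths < config.yml && \
#     KIND_EXPERIMENTAL_PROVIDER=podman kind create cluster --name test-1 --config config.yml
```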

Run:

kubectl cluster-info --context kind-test-1
Kubernetes control plane is running at https://127.0.0.1:41285
CoreDNS is running at https://127.0.0.1:41285/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

mbana avatar Feb 12 '22 01:02 mbana

Possibly addressed by https://github.com/kubernetes-sigs/kind/pull/2584, and certainly at least for this subthread: https://github.com/kubernetes-sigs/kind/issues/2521#issuecomment-1036937681

BenTheElder avatar Apr 18 '23 05:04 BenTheElder

This has gotten dated; we can revisit it with contemporary Fedora/kind versions if issues remain.

BenTheElder avatar Apr 18 '23 05:04 BenTheElder

Oh wow this was old. I can confirm that kind + rootless podman was working just fine on Fedora 37 Silverblue.

adambkaplan avatar Apr 19 '23 19:04 adambkaplan