storage
storage copied to clipboard
`podman load` unreliable with ZFS driver
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
Occasionally podman fails to load image with error:
Error: payload does not match any of the supported image formats (oci, oci-archive, dir, docker-archive)
Let's say I have a few images to load from archives:
root in /tmp/tmp.BcVHmxQCba/images
❯ ls -1
base.tar
micro.tar
minio.tar
nexus.tar
nginx1.18.tar
openjdk8-jre.tar
step-ca.tar
systemd.tar
At some point, ZFS driver just fails unexpectedly:
root in /tmp/tmp.BcVHmxQCba/images
❯ for IMG in $(ls -1); do podman load -i $IMG; done
Getting image source signatures
Copying blob fbde709b7a5b done
Copying blob af53933dffc5 done
Copying config 24890d2f24 done
Writing manifest to image destination
Storing signatures
Loaded image(s): registry.lab.skarbek.name/bootstrap/base:latest
Getting image source signatures
Copying blob fbde709b7a5b skipped: already exists
Copying config 9b325ce58f done
Writing manifest to image destination
Storing signatures
Loaded image(s): registry.lab.skarbek.name/bootstrap/micro:latest
Getting image source signatures
Copying blob fbde709b7a5b skipped: already exists
Copying blob fbde709b7a5b skipped: already exists
Copying blob af53933dffc5 skipped: already exists
Copying blob c6ccedf52c3b done
Error: payload does not match any of the supported image formats (oci, oci-archive, dir, docker-archive)
Getting image source signatures
Copying blob fbde709b7a5b skipped: already exists
Copying blob fbde709b7a5b skipped: already exists
Copying blob af53933dffc5 skipped: already exists
Copying blob b14c4226ddb2 done
Copying blob d548657cfebd done
Error: payload does not match any of the supported image formats (oci, oci-archive, dir, docker-archive)
Getting image source signatures
Copying blob fbde709b7a5b skipped: already exists
Copying blob fbde709b7a5b skipped: already exists
Copying blob af53933dffc5 skipped: already exists
Copying blob 69ee9abc0771 done
Error: payload does not match any of the supported image formats (oci, oci-archive, dir, docker-archive)
Getting image source signatures
Copying blob fbde709b7a5b skipped: already exists
Copying blob af53933dffc5 skipped: already exists
Copying blob b14c4226ddb2 skipped: already exists
ERRO[0000] Error unmounting /var/lib/containers/storage/zfs/graph/dce1852f14fde9fa55fb420d688640efc747b7be931c00c11fe444aabfde63b0: invalid argument storage-driver=zfs
Error: payload does not match any of the supported image formats (oci, oci-archive, dir, docker-archive)
Getting image source signatures
Copying blob fbde709b7a5b skipped: already exists
Copying blob fbde709b7a5b skipped: already exists
Copying blob af53933dffc5 skipped: already exists
Copying blob 7f16323b4395 done
Error: payload does not match any of the supported image formats (oci, oci-archive, dir, docker-archive)
Getting image source signatures
Copying blob fbde709b7a5b skipped: already exists
Copying blob af53933dffc5 skipped: already exists
ERRO[0000] Error unmounting /var/lib/containers/storage/zfs/graph/dce1852f14fde9fa55fb420d688640efc747b7be931c00c11fe444aabfde63b0: invalid argument storage-driver=zfs
Error: payload does not match any of the supported image formats (oci, oci-archive, dir, docker-archive)
Debug log from a single occurrence of the error:
root in /tmp/tmp.BcVHmxQCba/images
❯ podman --log-level debug load -i systemd.tar
INFO[0000] podman filtering at log level debug
DEBU[0000] Called load.PersistentPreRunE(podman --log-level debug load -i systemd.tar)
DEBU[0000] Merged system config "/usr/share/containers/containers.conf"
DEBU[0000] Using conmon: "/usr/bin/conmon"
DEBU[0000] Initializing boltdb state at /var/lib/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver zfs
DEBU[0000] Using graph root /var/lib/containers/storage
DEBU[0000] Using run root /run/containers/storage
DEBU[0000] Using static dir /var/lib/containers/storage/libpod
DEBU[0000] Using tmp dir /run/libpod
DEBU[0000] Using volume path /var/lib/containers/storage/volumes
DEBU[0000] Set libpod namespace to ""
DEBU[0000] [graphdriver] trying provided driver "zfs"
DEBU[0000] ID:b9c3bd02-6a39-4868-a495-257322cb4cf8 START zfs list -rHp -t filesystem -o name,origin,used,available,mountpoint,compression,type,volsize,quota,referenced,written,logicalused,usedbydataset b51d7037-4cc1-4ad9-9e50-b533fd87b24e/datafs/var/lib/containers storage-driver=zfs
DEBU[0000] ID:b9c3bd02-6a39-4868-a495-257322cb4cf8 FINISH storage-driver=zfs
DEBU[0000] Initializing event backend journald
DEBU[0000] configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
INFO[0000] Found CNI network cni0 (type=bridge) at /etc/cni/net.d/10-cni0.conflist
INFO[0000] Found CNI network podman (type=bridge) at /etc/cni/net.d/87-podman.conflist
DEBU[0000] Default CNI network name podman is unchangeable
INFO[0000] Setting parallel job count to 13
DEBU[0000] Loading image from "systemd.tar"
DEBU[0000] -> Attempting to load "systemd.tar" as an OCI directory
DEBU[0000] parsed reference into "[zfs@/var/lib/containers/storage+/run/containers/storage]localhost/systemd.tar:latest"
DEBU[0000] Copying source image systemd.tar: to destination image [zfs@/var/lib/containers/storage+/run/containers/storage]localhost/systemd.tar:latest
DEBU[0000] Error loading systemd.tar: initializing source oci:systemd.tar:: open systemd.tar/index.json: not a directory
DEBU[0000] -> Attempting to load "systemd.tar" as an OCI archive
DEBU[0001] Error deleting temporary directory: <nil>
DEBU[0001] Error loading systemd.tar: loading index: open /var/tmp/oci993717541/index.json: no such file or directory
DEBU[0001] -> Attempting to load "systemd.tar" as a Docker dir
DEBU[0001] parsed reference into "[zfs@/var/lib/containers/storage+/run/containers/storage]localhost/systemd.tar:latest"
DEBU[0001] Copying source image systemd.tar to destination image [zfs@/var/lib/containers/storage+/run/containers/storage]localhost/systemd.tar:latest
DEBU[0001] Using blob info cache at /var/lib/containers/cache/blob-info-cache-v1.boltdb
DEBU[0001] Error loading systemd.tar: determining manifest MIME type for dir:systemd.tar: open systemd.tar/manifest.json: not a directory
DEBU[0001] -> Attempting to load "systemd.tar" as a Docker archive
DEBU[0001] No compression detected
DEBU[0001] parsed reference into "[zfs@/var/lib/containers/storage+/run/containers/storage]registry.lab.skarbek.name/bootstrap/systemd:latest"
DEBU[0001] Copying source image systemd.tar:registry.lab.skarbek.name/bootstrap/systemd:latest to destination image [zfs@/var/lib/containers/storage+/run/containers/storage]registry.lab.skarbek.name/bootstrap/systemd:latest
DEBU[0001] Using blob info cache at /var/lib/containers/cache/blob-info-cache-v1.boltdb
DEBU[0001] No compression detected
DEBU[0001] No compression detected
DEBU[0001] No compression detected
DEBU[0001] IsRunningImageAllowed for image docker-archive:
DEBU[0001] Using default policy section
DEBU[0001] Requirement 0: allowed
DEBU[0001] Overall: allowed
Getting image source signatures
DEBU[0001] Manifest has MIME type application/vnd.docker.distribution.manifest.v2+json, ordered candidate list [application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.v1+prettyjws, application/vnd.oci.image.manifest.v1+json, application/vnd.docker.distribution.manifest.v1+json]
DEBU[0001] ... will first try using the original manifest unmodified
DEBU[0001] Skipping blob sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef (already present):
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 5f70bf18a086 skipped: already exists
Copying blob 0af45d730ac3 skipped: already exists
DEBU[0001] mount("b51d7037-4cc1-4ad9-9e50-b533fd87b24e/datafs/var/lib/containers/dce1852f14fde9fa55fb420d688640efc747b7be931c00c11fe444aabfde63b0", "/var/lib/containers/storage/zfs/graph/dce1852f14fde9fa55fb420d688640efc747b7be931c00c11fe444aabfde63b0", "") storage-driver=zfs
ERRO[0001] Error unmounting /var/lib/containers/storage/zfs/graph/dce1852f14fde9fa55fb420d688640efc747b7be931c00c11fe444aabfde63b0: invalid argument storage-driver=zfs
DEBU[0001] Error loading systemd.tar: trying to reuse blob sha256:5576786dd32999d193c0a8e82b1235611131c8128404af9999195884a9d3be9e at destination: reading layer "dce1852f14fde9fa55fb420d688640efc747b7be931c00c11fe444aabfde63b0" for blob "sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef": error creating zfs mount: mount b51d7037-4cc1-4ad9-9e50-b533fd87b24e/datafs/var/lib/containers/dce1852f14fde9fa55fb420d688640efc747b7be931c00c11fe444aabfde63b0:/var/lib/containers/storage/zfs/graph/dce1852f14fde9fa55fb420d688640efc747b7be931c00c11fe444aabfde63b0: no such file or directory
DEBU[0001] Error loading systemd.tar: payload does not match any of the supported image formats (oci, oci-archive, dir, docker-archive)
Error: payload does not match any of the supported image formats (oci, oci-archive, dir, docker-archive)
Working solution for now is to remove podmanstorage db/configuration by either destroying each container/image dataset and rm -rf everything inside /var/lib/containers or simply destroying the whole dataset mounted as /var/lib/containers and create fresh empty one.
podman can work weeks, months without issue on one host, this is sporadic and hard to reliably reproduce. I was looking for a good trigger of this behavior but unsuccessfully for now, reporting this for feature reference.
Steps to reproduce the issue:
-
Use
podmanfor some time -
Try to load image from archive
Describe the results you received:
podman fails to load image from archive
Describe the results you expected:
podman loads image from archive without issues
Additional information you deem important (e.g. issue happens only occasionally):
Output of podman version:
Version: 3.4.1
API Version: 3.4.1
Go Version: go1.16.8
Built: Wed Oct 20 16:35:28 2021
OS/Arch: linux/amd64
Output of podman info --debug:
host:
arch: amd64
buildahVersion: 1.23.1
cgroupControllers:
- cpuset
- cpu
- io
- memory
- hugetlb
- pids
- misc
cgroupManager: systemd
cgroupVersion: v2
conmon:
package: conmon-2.0.30-2.fc34.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.0.30, commit: '
cpus: 4
distribution:
distribution: fedora
variant: workstation
version: "34"
eventLogger: journald
hostname: ishikawa
idMappings:
gidmap: null
uidmap: null
kernel: 5.14.13-200.fc34.x86_64
linkmode: dynamic
logDriver: k8s-file
memFree: 1632423936
memTotal: 16494628864
ociRuntime:
name: crun
package: crun-1.2-1.fc34.x86_64
path: /usr/bin/crun
version: |-
crun version 1.2
commit: 4f6c8e0583c679bfee6a899c05ac6b916022561b
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
os: linux
remoteSocket:
path: /run/podman/podman.sock
security:
apparmorEnabled: false
capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: false
seccompEnabled: true
seccompProfilePath: /usr/share/containers/seccomp.json
selinuxEnabled: true
serviceIsRemote: false
slirp4netns:
executable: /usr/bin/slirp4netns
package: slirp4netns-1.1.12-2.fc34.x86_64
version: |-
slirp4netns version 1.1.12
commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
libslirp: 4.4.0
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.0
swapFree: 4891500544
swapTotal: 8589930496
uptime: 99h 0m 28.02s (Approximately 4.12 days)
plugins:
log:
- k8s-file
- none
- journald
network:
- bridge
- macvlan
volume:
- local
registries:
10.8.0.254:8082:
Blocked: false
Insecure: true
Location: 10.8.0.254:8082
MirrorByDigestOnly: false
Mirrors: null
Prefix: 10.8.0.254:8082
10.88.0.249:8082:
Blocked: false
Insecure: true
Location: 10.88.0.249:8082
MirrorByDigestOnly: false
Mirrors: null
Prefix: 10.88.0.249:8082
10.88.0.252:8082:
Blocked: false
Insecure: true
Location: 10.88.0.252:8082
MirrorByDigestOnly: false
Mirrors: null
Prefix: 10.88.0.252:8082
search:
- registry.fedoraproject.org
- registry.access.redhat.com
- docker.io
- quay.io
store:
configFile: /etc/containers/storage.conf
containerStore:
number: 0
paused: 0
running: 0
stopped: 0
graphDriverName: zfs
graphOptions: {}
graphRoot: /var/lib/containers/storage
graphStatus:
Compression: lz4
Parent Dataset: b51d7037-4cc1-4ad9-9e50-b533fd87b24e/datafs/var/lib/containers
Parent Quota: "no"
Space Available: "44588359680"
Space Used By Parent: "493617152"
Zpool: b51d7037-4cc1-4ad9-9e50-b533fd87b24e
Zpool Health: ONLINE
imageStore:
number: 8
runRoot: /run/containers/storage
volumePath: /var/lib/containers/storage/volumes
version:
APIVersion: 3.4.1
Built: 1634740528
BuiltTime: Wed Oct 20 16:35:28 2021
GitCommit: ""
GoVersion: go1.16.8
OsArch: linux/amd64
Version: 3.4.1
Package info (e.g. output of rpm -q podman or apt list podman):
podman-3.4.1-1.fc34.x86_64
Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)
Yes
Additional environment details (AWS, VirtualBox, physical, etc.): Independent of host type, observed on bare metal, KVM, VMWare, AWS. Observed on Fedora (32, 33, 34) and RHEL 8. OpenZFS 2.0.x and 2.1.x.
Error's coming out of c/common image library, but I suspect it's hiding a real error from c/storage. @vrothberg PTAL
trying to reuse blob sha256:5576786dd32999d193c0a8e82b1235611131c8128404af9999195884a9d3be9e at destination: reading layer "dce1852f14fde9fa55fb420d688640efc747b7be931c00c11fe444aabfde63b0" for blob "sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef": error creating zfs mount: mount b51d7037-4cc1-4ad9-9e50-b533fd87b24e/datafs/var/lib/containers/dce1852f14fde9fa55fb420d688640efc747b7be931c00c11fe444aabfde63b0:/var/lib/containers/storage/zfs/graph/dce1852f14fde9fa55fb420d688640efc747b7be931c00c11fe444aabfde63b0: no such file or directory
Probably ZFS related which I have absolutely no experience in. @giuseppe, do you have advice?
We have no clue and little experience with zfs, I would recommend you switch to Overlay,