nydus
[GLCC 2024] content not found: switch containerd to use nydus snapshotter
Steps to reproduce:
1. Run pods from OCI images in a k8s cluster (about 30 pods).
2. Deploy nydus directly following the official documentation, then convert the source OCI image to a Nydus-format image.
3. Apply the resource manifest to upgrade from the OCI image to the nydus image; this reports an error.
The errors are as follows:
1. nydus log: level=info msg="Prepares active snapshot k8s.io/346/extract-762681717-aHxR sha256:1cb555415fd3286cbb792ee44a0e6b10f7c08f5a581d7342cbd2ceec7e3aa4af, nydusd should start afterwards" key="k8s.io/346/extract-762681717-aHxR sha256:1cb555415fd3286cbb792ee44a0e6b10f7c08f5a581d7342cbd2ceec7e3aa4af" parent=
2. describe pod reports: Failed to create pod sandbox: rpc error: code = NotFound desc = failed to create containerd container: error unpacking image: failed to extract layer sha256:1cb555415fd3286cbb792ee44a0e6b10f7c08f5a581d7342cbd2ceec7e3aa4af: failed to get reader from content store: content digest sha256:7582c2cc65ef30105b84c1c6812f71c8012663c6352b01fe2f483238313ab0ed: not found
Feedback from the people I consulted: the snapshotter only accounts for running both OCI and nydus images, but the metadata of the pre-existing OCI layers was never migrated into the snapshotter's db.
It seems related to https://github.com/dragonflyoss/image-service/issues/878. Can you retry with the discard_unpacked_layers = true option?
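Before retrying, it may help to confirm which values containerd is actually configured with. The following is a minimal sketch, not an official tool: show_cri_snapshot_settings is a hypothetical helper name, and the default path /etc/containerd/config.toml is an assumption that may not match every installation.

```shell
#!/bin/sh
# Hypothetical helper: print the snapshotter-related lines of a containerd config file.
# The default path is an assumption; pass a different path as the first argument.
show_cri_snapshot_settings() {
    config="${1:-/etc/containerd/config.toml}"
    # Match only the two keys discussed in this thread, allowing leading whitespace.
    grep -E '^[[:space:]]*(snapshotter|discard_unpacked_layers)[[:space:]]*=' "$config"
}
```

Calling show_cri_snapshot_settings with no argument reads the assumed default path. containerd only picks up config changes after a restart.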
I will try again and give you feedback if possible
I tried it and it still does not work; the error is the same as before.
From the offline discussion: when switching to the nydus snapshotter, the bolt db of the nydus snapshotter has no records of the OCI images, whereas containerd does. Consequently, containerd attempts to read the image's blobs from the content store and unpack them, and then fails (it appears that containerd had removed the image from the content store at some point?). As a workaround, the command ctr -n k8s.io content fetch $lost_image can fix it.
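When several images are affected, the same workaround can be looped over a list of references. This is only a sketch under stated assumptions: refetch_images, NAMESPACE and DRY_RUN are hypothetical names introduced here, and the image references in the usage note are placeholders. It defaults to a dry run that prints the commands instead of running them.

```shell
#!/bin/sh
# Namespace used by the Kubernetes CRI plugin, as in the command above.
NAMESPACE="${NAMESPACE:-k8s.io}"
# DRY_RUN=1 (the default in this sketch) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: re-fetch every image reference passed as an argument.
refetch_images() {
    for image in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "ctr -n $NAMESPACE content fetch $image"
        else
            # Keep going if one image fails, but report it on stderr.
            ctr -n "$NAMESPACE" content fetch "$image" || echo "failed to fetch $image" >&2
        fi
    done
}
```

For example, refetch_images registry.example.com/app:v1 registry.example.com/db:v2 prints one ctr command per image; review the output, then rerun with DRY_RUN=0 on the affected node.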
Regarding "ctr content fetch $lost_image can fix it": maybe we need to add the -n k8s.io argument.
Indeed, under the guidance of Yan Song, I tried this and the problem is solved. ctr -n k8s.io content fetch should be used to deal with this error. I think this bug should be fixed to make things easier for existing environments.
Will keep tracking this. We'd better handle the compatibility automatically in nydus snapshotter.
Hi @xiangshen123, what containerd version are you using?
I ran the same OCI image with nerdctl, first with the overlayfs snapshotter, then with the nydus snapshotter, and everything went fine for me:
nerdctl run -it java:latest bash
# It succeeded
nerdctl --snapshotter nydus run -it java:latest bash
# It succeeded too
I used containerd 1.6.9, with crictl as the client.