When using Kubernetes and nydus, the container fails to start with the error “target snapshot already exists”

Open cl0udee opened this issue 1 year ago • 7 comments

Additional Information

The following information is very important in order to help us to help you. Omission of the following details may delay your support request or receive no attention at all.

Version of nydus being used (nydusd --version)

Version: 	v2.1.1
Git Commit: 	2fd7070bf7c08ba8667a375ecf5ab4ca3963a184
Build Time: 	2022-11-06T11:14:20.450697142Z
Profile: 	release
Rustc: 		rustc 1.61.0 (fe5b13d68 2022-05-18)

Version of nydus-snapshotter being used (containerd-nydus-grpc --version)

Version:     v0.4.0
Revision:    1e18acbf9d39588d39d0276a423e33ebeeb3462b
Go version:  go1.18.6
Build time:  2022-11-30T11:40:06

Kernel information (uname -r)

4.19.90-2102.2.0.0062.ctl2.x86_64

GNU/Linux Distribution, if applicable (cat /etc/os-release)

command result: cat /etc/os-release

containerd-nydus-grpc command line used, if applicable (ps aux | grep containerd-nydus-grpc)

/usr/bin/containerd-nydus-grpc --config-path /etc/nydus/nydusd-config.json --address /run/containerd/containerd-nydus-grpc.sock --nydusd-path /usr/bin/nydusd --nydusimg-path /usr/bin/nydus-image --log-to-stdout

client command line used, if applicable (such as: nerdctl, docker, kubectl, ctr)

kubectl apply -f test-pod.yaml

Screenshots (if applicable)

Details about issue

When I use nerdctl, such as

nerdctl --snapshotter nydus run --rm -it centos:v1 bash

the container can be successfully created and run normally. However, when I switch to Kubernetes, it gives me the following error:

Warning  FailedCreatePodSandBox  <invalid>  kubelet  Failed to create pod sandbox: rpc error: code = AlreadyExists desc = failed to get sandbox image "xxx/pause:3.6": failed to pull image "xxx/pause:3.6": failed to pull and unpack image "xxx/pause:3.6": unable to prepare extraction snapshot: target snapshot "sha256:xxx": already exists

I suspect this issue is related to the pause image because nerdctl, which doesn't involve the pause image, can start successfully. I have also tried using

ctr -n k8s.io content fetch $pause-image-name

but it doesn't work.

This is very confusing, especially because I was able to launch pods with Kubernetes a while ago. After some time had passed, pods could no longer start.

cl0udee avatar Dec 21 '23 10:12 cl0udee

Could you try ctr images delete xxx/pause:3.6 --sync ?

imeoer avatar Dec 21 '23 10:12 imeoer

Need to ensure the pod using the image has been deleted first.

imeoer avatar Dec 21 '23 10:12 imeoer

I have already deleted all the pause images, but I still receive the same error: target snapshot "sha256:xxx": already exists

cl0udee avatar Dec 21 '23 10:12 cl0udee

Is this a work in progress (WIP)? I'm experiencing the same issue in kata 3.2.0. @imeoer

kinderyj avatar Dec 26 '23 03:12 kinderyj

This appears to be an inconsistency in containerd snapshot metadata. Try the following commands:

ctr -n k8s.io content ls | grep sha256:xxx
ctr -n k8s.io content rm $blob_id

But we still don't have a way to reproduce it.
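
The two commands above can be wrapped in a small script, sketched here as one possible way to do it rather than an official procedure. The helper names match_digests and remove_stale_blobs and the DRY_RUN switch are hypothetical and not part of ctr or nydus. The sketch assumes that the first column of ctr content ls is the blob digest and that sha256:xxx stands for the digest reported in the kubelet error.

```shell
#!/bin/sh
# Sketch only: the helper names and DRY_RUN are assumptions for illustration.
NAMESPACE="k8s.io"

# Read `ctr content ls` output on stdin and print the digest (first column)
# of every row whose digest starts with the prefix passed as $1.
# The header row is skipped because "DIGEST" never starts with the prefix.
match_digests() {
    awk -v prefix="$1" 'index($1, prefix) == 1 { print $1 }'
}

# Remove every content blob whose digest starts with the given prefix.
# By default (DRY_RUN=1) the removal commands are only printed, so they can
# be reviewed before anything is deleted.
remove_stale_blobs() {
    prefix="$1"
    ctr -n "$NAMESPACE" content ls | match_digests "$prefix" |
    while read -r blob_id; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ctr -n $NAMESPACE content rm $blob_id"
        else
            ctr -n "$NAMESPACE" content rm "$blob_id"
        fi
    done
}
```

After sourcing the script, running remove_stale_blobs sha256:xxx prints the removal commands that would run, and DRY_RUN=0 remove_stale_blobs sha256:xxx executes them. Previewing first is a precaution, since removing a blob that is still referenced by a running pod's image may cause further pull errors.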

imeoer avatar Jun 18 '24 06:06 imeoer