
ImageVolume broken

laurivosandi opened this issue on Sep 12, 2024

Environmental Info:
K3s Version:
k3s version v1.31.0+k3s1 (34be6d96)
go version go1.22.5

Node(s) CPU architecture, OS, and Version:
Linux kubetest 6.8.0-44-generic #44-Ubuntu SMP PREEMPT_DYNAMIC Tue Aug 13 13:35:26 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration: Single node

Describe the bug: The newly introduced ImageVolume feature (alpha in Kubernetes 1.31) appears to be broken.

Steps To Reproduce:

Deployed k3s 1.31 with:

INSTALL_K3S_EXEC="--kubelet-arg feature-gates=ImageVolume=true --kube-apiserver-arg feature-gates=ImageVolume=true"
INSTALL_K3S_CHANNEL=latest
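
For reference, a full install invocation with those variables would look roughly like this (a sketch using the standard get.k3s.io install script; adjust for your environment):

curl -sfL https://get.k3s.io | \
  INSTALL_K3S_CHANNEL=latest \
  INSTALL_K3S_EXEC="--kubelet-arg feature-gates=ImageVolume=true --kube-apiserver-arg feature-gates=ImageVolume=true" \
  sh -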

Applied manifest:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
          volumeMounts:
            - name: model
              mountPath: /model
      volumes:
        - name: model
          image:
            pullPolicy: IfNotPresent
            reference: registry.k8s.io/conformance:v1.31.0

Expected behavior:

Pod starts with registry.k8s.io/conformance:v1.31.0 image contents mounted at /model
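
Had the feature worked, the mount could have been verified with something like:

kubectl exec deploy/test -- ls /model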

Actual behavior:

In Kubernetes Lens, the following error message is shown:

Error: failed to generate container "558bb3f947d9e1a70c697143ba10b2795c58ee7bde9c4fa0e3b317acb02dbe7b" spec: failed to generate spec: failed to mkdir "": mkdir : no such file or directory

Additional context / logs:

Sep 12 07:33:18 kubetest k3s[2270]: E0912 07:33:18.926553    2270 log.go:32] "CreateContainer in sandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to generate container \"536ef130eb99ae8a9f69c175a410b6400a576c514f07ea5faf68b6dc638a8860\" spec: failed to generate spec: failed to mkdir \"\": mkdir : no such file or directory" podSandboxID="f785734aa1099aa8557e8d4e6cabe4de5d889901d31880caf2a3f972946d7f78"
Sep 12 07:33:18 kubetest k3s[2270]: E0912 07:33:18.926664    2270 kuberuntime_manager.go:1272] "Unhandled Error" err="container &Container{Name:nginx,Image:nginx:1.14.2,Command:[],Args:[],WorkingDir:,Ports:[]ContainerPort{ContainerPort{Name:,HostPort:0,ContainerPort:80,Protocol:TCP,HostIP:,},},Env:[]EnvVar{},Resources:ResourceRequirements{Limits:ResourceList{},Requests:ResourceList{},Claims:[]ResourceClaim{},},VolumeMounts:[]VolumeMount{VolumeMount{Name:model,ReadOnly:false,MountPath:/model,SubPath:,MountPropagation:nil,SubPathExpr:,RecursiveReadOnly:nil,},VolumeMount{Name:kube-api-access-rjk5r,ReadOnly:true,MountPath:/var/run/secrets/kubernetes.io/serviceaccount,SubPath:,MountPropagation:nil,SubPathExpr:,RecursiveReadOnly:nil,},},LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:nil,TerminationMessagePath:/dev/termination-log,ImagePullPolicy:IfNotPresent,SecurityContext:nil,Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[]EnvFromSource{},TerminationMessagePolicy:File,VolumeDevices:[]VolumeDevice{},StartupProbe:nil,ResizePolicy:[]ContainerResizePolicy{},RestartPolicy:nil,} start failed in pod ai-test-67984f78cd-n8xqd_default(e55f166a-1f0f-4728-96e2-fde67da551e3): CreateContainerError: failed to generate container \"536ef130eb99ae8a9f69c175a410b6400a576c514f07ea5faf68b6dc638a8860\" spec: failed to generate spec: failed to mkdir \"\": mkdir : no such file or directory"
Sep 12 07:33:18 kubetest k3s[2270]: E0912 07:33:18.927733    2270 pod_workers.go:1301] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"nginx\" with CreateContainerError: \"failed to generate container \\\"536ef130eb99ae8a9f69c175a410b6400a576c514f07ea5faf68b6dc638a8860\\\" spec: failed to generate spec: failed to mkdir \\\"\\\": mkdir : no such file or directory\"" pod="default/ai-test-67984f78cd-n8xqd" podUID="e55f166a-1f0f-4728-96e2-fde67da551e3"

laurivosandi avatar Sep 12 '24 07:09 laurivosandi

I don't know where this error is coming from, but it is not code that lives within this repo. I suspect that the problem is either in the kubelet or in containerd.

The error itself is coming back from containerd (the runtime service), but I don't know if the container spec generated by the kubelet is incorrect, or if there's a bug in containerd, or what.

brandond avatar Sep 12 '24 17:09 brandond

Have you tried this with any other Kubernetes distro with containerd 1.7, or k3s with a different container runtime (cri-dockerd perhaps)?

brandond avatar Sep 13 '24 19:09 brandond

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 45 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

github-actions[bot] avatar Oct 28 '24 20:10 github-actions[bot]

Could I request that this issue be reopened? Here is a very simple reproducer:

Create a simple cluster with K3d:

k3d cluster create playground \
  --image rancher/k3s:v1.31.2-k3s1 \
  --k3s-arg '--kube-apiserver-arg=feature-gates=ImageVolume=true@server:*' \
  --k3s-arg '--kubelet-arg=feature-gates=ImageVolume=true@server:*'
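
You can optionally confirm the gate is active on the API server via the kubernetes_feature_enabled metric (available since Kubernetes 1.26):

kubectl get --raw /metrics | grep kubernetes_feature_enabled | grep ImageVolume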

Watch events in a second terminal:

kubectl get events -w

Create a pod:

kubectl apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: readonly-oci-volume-pod
spec:
  containers:
    - name: test
      image: registry.k8s.io/e2e-test-images/echoserver:2.3
      volumeMounts:
        - name: volume
          mountPath: /volume
  volumes:
    - name: volume
      image:
        reference: busybox:latest
        pullPolicy: IfNotPresent
EOF

The event log will contain:

LAST SEEN   TYPE      REASON                           OBJECT                         MESSAGE
17s         Normal    Starting                         node/k3d-playground-server-0   Starting kubelet.
17s         Warning   InvalidDiskCapacity              node/k3d-playground-server-0   invalid capacity 0 on image filesystem
17s         Normal    NodeHasSufficientMemory          node/k3d-playground-server-0   Node k3d-playground-server-0 status is now: NodeHasSufficientMemory
17s         Normal    NodeHasNoDiskPressure            node/k3d-playground-server-0   Node k3d-playground-server-0 status is now: NodeHasNoDiskPressure
17s         Normal    NodeHasSufficientPID             node/k3d-playground-server-0   Node k3d-playground-server-0 status is now: NodeHasSufficientPID
17s         Normal    NodeAllocatableEnforced          node/k3d-playground-server-0   Updated Node Allocatable limit across pods
17s         Normal    Starting                         node/k3d-playground-server-0   
17s         Normal    NodeReady                        node/k3d-playground-server-0   Node k3d-playground-server-0 status is now: NodeReady
14s         Normal    Synced                           node/k3d-playground-server-0   Node synced successfully
14s         Normal    NodePasswordValidationComplete   node/k3d-playground-server-0   Deferred node password secret validation complete
13s         Normal    RegisteredNode                   node/k3d-playground-server-0   Node k3d-playground-server-0 event: Registered Node k3d-playground-server-0 in Controller
0s          Normal    Scheduled                        pod/readonly-oci-volume-pod    Successfully assigned default/readonly-oci-volume-pod to k3d-playground-server-0
0s          Normal    Pulling                          pod/readonly-oci-volume-pod    Pulling image "busybox:latest"
0s          Normal    Pulled                           pod/readonly-oci-volume-pod    Successfully pulled image "busybox:latest" in 2.899s (2.899s including waiting). Image size: 2166802 bytes.
0s          Normal    Pulling                          pod/readonly-oci-volume-pod    Pulling image "registry.k8s.io/e2e-test-images/echoserver:2.3"
0s          Normal    Pulled                           pod/readonly-oci-volume-pod    Successfully pulled image "registry.k8s.io/e2e-test-images/echoserver:2.3" in 4.995s (4.995s including waiting). Image size: 8478041 bytes.
0s          Warning   Failed                           pod/readonly-oci-volume-pod    Error: failed to generate container "767e400ec7f3f654630fe92d33d701a01f2426f9e24718a8a7840eed6e5786cd" spec: failed to generate spec: failed to mkdir "": mkdir : no such file or directory
...<last 3 lines repeat>
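
The same failure is visible directly on the pod:

kubectl describe pod readonly-oci-volume-pod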

rotty3000 avatar Nov 18 '24 18:11 rotty3000

As I asked above, are you sure containerd even supports this? It's still alpha, and the upstream announcement only mentions it working with the latest releases of cri-o. https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/

The Kubernetes alpha feature gate ImageVolume needs to be enabled on the API Server as well as the kubelet to make it functional. If that’s the case and the container runtime has support for the feature (like CRI-O ≥ v1.31), then an example pod.yaml like this can be created:

I suspect you'll need to wait for containerd to support this, or install and use cri-o instead of the containerd bundled with k3s.
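
If you do want to try cri-o, k3s can be pointed at an external CRI socket instead of its embedded containerd, roughly like this (a sketch assuming cri-o is installed and listening on its default socket):

k3s server --container-runtime-endpoint unix:///var/run/crio/crio.sock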

brandond avatar Nov 18 '24 18:11 brandond

OK, that's fair. I misunderstood. Thanks for checking :)

rotty3000 avatar Nov 18 '24 18:11 rotty3000

For posterity, the containerd line that throws the error is, I believe, this one: https://github.com/containerd/containerd/blob/0aa8b58092a849543dee7680005bf33503a71bec/internal/cri/opts/spec_linux_opts.go#L114

rotty3000 avatar Nov 18 '24 18:11 rotty3000

containerd issue for this:

https://github.com/containerd/containerd/issues/10496

rotty3000 avatar Nov 18 '24 19:11 rotty3000

Question. Is k3s using this repo as the source of its containerd?

rotty3000 avatar Nov 18 '24 19:11 rotty3000

sorry, --> https://github.com/k3s-io/containerd

rotty3000 avatar Nov 18 '24 19:11 rotty3000

I think this answers my question https://github.com/k3s-io/k3s/blob/master/go.mod#L12

rotty3000 avatar Nov 18 '24 19:11 rotty3000

We maintain a small number of patches on top of containerd to allow embedding it within k3s and to enable a couple of additional snapshotters. Our rewrite support has also not been accepted upstream.

https://github.com/k3s-io/containerd/compare/v1.7.23...v1.7.23-k3s1
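
The same patch set can be inspected locally with something like (assuming both tags are present in the fork):

git clone https://github.com/k3s-io/containerd
cd containerd
git log --oneline v1.7.23...v1.7.23-k3s1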

brandond avatar Nov 18 '24 20:11 brandond

@brandond could you point me to the rewrite support so I can see what exactly that is? I'm looking to get my hands dirty. Sometimes, when an upstream project won't accept a particular feature, a solution is to instead offer a change that enables the feature to be implemented through some form of extensibility.

rotty3000 avatar Nov 18 '24 21:11 rotty3000

Did you look at the commits in the link in the message you just replied to?

brandond avatar Nov 18 '24 21:11 brandond

.. right, sorry. long day! 😓

rotty3000 avatar Nov 18 '24 22:11 rotty3000

Hi, it is now implemented in containerd and should ship in containerd 2.1.0 (currently in beta): https://github.com/containerd/containerd/pull/11510

fabienvauchelles avatar Mar 09 '25 08:03 fabienvauchelles

This PR should address this:

  • https://github.com/k3s-io/k3s/pull/12788

LaurentGoderre avatar Aug 26 '25 15:08 LaurentGoderre

Seems to be working with k3s v1.33.5 after enabling the feature gate flags.
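
For anyone arriving here later: the gates can also be set persistently via /etc/rancher/k3s/config.yaml rather than install-time flags, e.g. (a sketch equivalent to the flags used above):

kube-apiserver-arg:
  - feature-gates=ImageVolume=true
kubelet-arg:
  - feature-gates=ImageVolume=true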

laurivosandi avatar Oct 27 '25 06:10 laurivosandi