Multi-stage builds fail when run in kind
Actual behavior
When trying to run a multi-stage build with Kaniko in a kind cluster, specifically of https://github.com/GoogleContainerTools/skaffold/blob/main/examples/microservices/leeroy-web/Dockerfile, it fails with:

```
error building image: deleting file system after stage 0: unlinkat //product_uuid: device or resource busy
```
EDIT: Interestingly, it seems that adding --ignore-path=/product_uuid to the Kaniko args gets rid of the error. I don't know if this is something specific to that file, since I have stumbled across a reference to the exact same error, including //product_uuid, at https://github.com/mattmoor/mink/blob/b9148a39b2d8bbc69ca9aaf5e89a7613c0b179d8/.github/workflows/minkind-cli.yaml#L150-L155.
Expected behavior
I expect multi-stage builds to succeed on kind.

To Reproduce
Steps to reproduce the behavior:
- Run a Kaniko build of https://github.com/GoogleContainerTools/skaffold/blob/main/examples/microservices/leeroy-web/Dockerfile in a kind cluster.
Additional Information
- Dockerfile: https://github.com/GoogleContainerTools/skaffold/blob/main/examples/microservices/leeroy-web/Dockerfile
- Build Context: https://github.com/GoogleContainerTools/skaffold/tree/main/examples/microservices/leeroy-web
- Kaniko Image (fully qualified with digest): v1.8.1, gcr.io/kaniko-project/executor@sha256:b44b0744b450e731b5a5213058792cd8d3a6a14c119cf6b1f143704f22a7c650
- Kind v0.14.0, with either k8s 1.22.7 or 1.24.0
Triage Notes for the Maintainers

| Description | Yes/No |
|---|---|
| Please check if this is a new feature you are proposing | |
| Please check if the build works in docker but not in kaniko | |
| Please check if this error is seen when you use --cache flag | |
| Please check if your dockerfile is a multistage dockerfile | |
This error still appears when creating a multi-stage Docker image in a kind k8s cluster.

```
kaniko INFO[0113] Deleting filesystem...
kaniko error building image: deleting file system after stage 0: unlinkat //product_uuid: device or resource busy
```
Also seeing this same error when building in kind with the latest version.
Same issue
Same issue with kind and Kaniko. My Dockerfile:

```dockerfile
FROM maven:3.8.7-openjdk-18 AS builder
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package

FROM openjdk:18-jdk-alpine3.15
COPY --from=builder /usr/src/app/target/app.jar /usr/app/application.jar
ENTRYPOINT ["java", "-jar", "/usr/app/application.jar"]
```
Error:

```
INFO[0239] Taking snapshot of full filesystem...
INFO[0240] Saving file usr/src/app/target/app-0.0.1-SNAPSHOT.jar for later use
INFO[0240] Deleting filesystem...
error building image: deleting file system after stage 0: unlinkat //product_uuid: device or resource busy
time="2023-01-27T10:41:03.882Z" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 1
```
But it works fine if I run Kaniko in Docker, not kind.

I fixed it by adding `ignorePaths` to the skaffold yaml:

```yaml
ignorePaths:
  - /product_uuid
```

For example, in an artifact definition:

```yaml
- image: job-runner
  context: ../../
  kaniko:
    cache: {}
    ignorePaths:
      - /product_uuid
    dockerfile: microservices/job-runner/docker/Dockerfile
```

For normal Kaniko CLI use, pass the arg `--ignore-path=/product_uuid`.
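To illustrate where that CLI flag goes when running Kaniko directly in the cluster, here is a sketch of a Kubernetes Pod spec passing it to the executor. The pod name and context URL are placeholders, not taken from the issue; the image tag matches the version reported above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-build                  # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: kaniko
      image: gcr.io/kaniko-project/executor:v1.8.1
      args:
        - --dockerfile=Dockerfile
        - --context=git://github.com/example/repo.git   # placeholder context
        - --ignore-path=/product_uuid                   # the workaround
        - --no-push                                     # build only, for testing
```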
So I've hit this, and for reliability reasons it was necessary for me to investigate the cause. I do not fully understand all the components involved, but I have a working theory.

To my understanding, Kaniko needs to be able to function without the ability to chroot. This means that in order to execute Dockerfile commands, it must obliterate the root filesystem of the container it's running in. Consequently, the only places Kaniko can hold state are either the memory of the build process, or a filesystem location that is both unlikely to conflict with the image being built and exempt from snapshotting.
Looking at kind, it seems that it wants to provide its own versions of the following files:
- `/sys/class/dmi/id/product_name`
- `/sys/class/dmi/id/product_uuid`
- `/sys/devices/virtual/dmi/id/product_uuid`
However, rather than bind-mounting them to individual instances, these are copied into the root-filesystem of the container and then bind-mounted from there. This is mostly inferred from https://github.com/kubernetes-sigs/kind/blob/main/images/base/files/kind/bin/mount-product-files.sh. When Kaniko clears the root filesystem it attempts to delete these files. The bind mounting may explain why they cannot be deleted given that Linux is usually incredibly forgiving of deleting files which are in use.
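If the bind-mount theory is right, the EBUSY should be reproducible entirely outside kind: unlinking any file that is an active mount point fails the same way. A minimal sketch (the paths are made up; the mount step needs root and is skipped otherwise):

```shell
#!/bin/sh
# Sketch: unlinking a file that is an active (bind) mount point fails with
# EBUSY, the same error Kaniko hits on /product_uuid.
tmp=$(mktemp -d)
echo "fake-uuid" > "$tmp/product_uuid"

# Bind-mount the file over itself, mirroring kind's mount-product-files.sh.
# mount(8) needs root, so the demonstration is skipped when unprivileged.
if [ "$(id -u)" -eq 0 ] && mount --bind "$tmp/product_uuid" "$tmp/product_uuid" 2>/dev/null; then
  # This is effectively what Kaniko attempts while deleting the filesystem:
  rm "$tmp/product_uuid" 2>&1 || echo "unlink failed while bind mount was active"
  umount "$tmp/product_uuid"
fi

# Once the mount is gone, the file deletes normally.
rm -f "$tmp/product_uuid"
rmdir "$tmp"
```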
From this I conclude that to ensure Kaniko works correctly, you probably want to ignore both /product_uuid and /product_name. It's unclear to me why the presence of /product_name hasn't been observed to be an issue.
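Following that reasoning, the skaffold workaround shown earlier could be extended to cover both files; a sketch of just the relevant fragment:

```yaml
kaniko:
  ignorePaths:
    - /product_uuid
    - /product_name
```

For the CLI, the equivalent would be passing the flag once per path, e.g. `--ignore-path=/product_uuid --ignore-path=/product_name` (as far as I can tell the flag can be repeated).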
Oddly, I've only seen this issue in CI and not on a local kind deploy. One factor appears to be the host system: although product_uuid and product_name are always copied in, they are only bind-mounted if the host's /sys defines these paths, which in turn may depend on things like the host's firmware, since they are derived from DMI information. On my local Linux machine, only /sys/class/dmi/id/product_name is present, not /sys/class/dmi/id/product_uuid or /sys/devices/virtual/dmi/id/product_uuid, so product_uuid is never bind-mounted. I do not yet have an explanation for why Kaniko does not fail for me locally on deleting /product_name; establishing a root shell into a kind pod on my local machine, I also had no issue deleting /product_name.
Got exactly the same issue, but it's intermittent. I'm using version v1.12.1.

But I can't ignore the path in my case; I want to export it in the second image.