Executor deletes the context directory in multi-stage builds resulting in loss of command pathing
**Actual behavior**
The context directory is deleted when targeting a second stage of a multi-stage build, when that second stage depends on a prior stage.

**Expected behavior**
The context directory should not be deleted.
**To Reproduce**
Steps to reproduce the behavior:

- Start an interactive session of the debug executor with:

  ```shell
  docker run -it --entrypoint sh gcr.io/kaniko-project/executor:debug-v1.3.0
  ```

- Run the following commands to create a builds directory and Dockerfile:

  ```shell
  mkdir builds
  cd builds
  vi Dockerfile
  ```

- Insert the following into the Dockerfile and save:

  ```dockerfile
  FROM node:14.15-alpine3.12 AS base
  LABEL type="build"

  FROM base as build-and-test
  LABEL type="build-and-test"
  ```

- Run:

  ```shell
  /kaniko/executor --context /builds --no-push --target build-and-test
  ```
- Observe the following:

  ```
  INFO[0000] Resolved base name node:14.15-alpine3.12 to base
  INFO[0000] Resolved base name base to build-and-test
  INFO[0000] Retrieving image manifest node:14.15-alpine3.12
  INFO[0000] Retrieving image node:14.15-alpine3.12
  INFO[0000] Retrieving image manifest node:14.15-alpine3.12
  INFO[0000] Retrieving image node:14.15-alpine3.12
  INFO[0001] Built cross stage deps: map[]
  INFO[0001] Retrieving image manifest node:14.15-alpine3.12
  INFO[0001] Retrieving image node:14.15-alpine3.12
  INFO[0002] Retrieving image manifest node:14.15-alpine3.12
  INFO[0002] Retrieving image node:14.15-alpine3.12
  INFO[0002] Executing 0 build triggers
  INFO[0002] Skipping unpacking as no commands require it.
  INFO[0002] LABEL type="build"
  INFO[0002] Applying label type=build
  INFO[0002] Storing source image from stage 0 at path /kaniko/stages/0
  INFO[0006] Deleting filesystem...
  INFO[0006] Base image from previous stage 0 found, using saved tar at path /kaniko/stages/0
  INFO[0006] Executing 0 build triggers
  INFO[0006] Skipping unpacking as no commands require it.
  INFO[0006] LABEL type="build-and-test"
  INFO[0006] Applying label type=build-and-test
  INFO[0006] Skipping push to container registry due to --no-push flag
  sh: getcwd: No such file or directory
  (unknown) #
  ```

  The current working directory has been deleted, and commands from the current working directory no longer function:

  ```
  (unknown) # ls
  sh: getcwd: No such file or directory
  (unknown) # /busybox/ls
  sh: getcwd: No such file or directory
  ```
When running with the GitLab Runner Operator for OpenShift, this behavior causes a loss of command pathing and of the ability to locate the busybox directory itself, even after first changing directories to root (outside the current context).
`.gitlab-ci.yaml`:

```yaml
stages:
  - build

# TEMPLATES
.runner_tags_template: &runners
  tags:
    - pdx
    - dind

.except_master_and_prodfix_template: &except_master_and_prodfix
  except:
    - /^prodfix\/.*$/
    - master
    - tags

# BUILD
build:
  stage: build
  <<: *runners
  <<: *except_master_and_prodfix
  image:
    name: gcr.io/kaniko-project/executor:debug-v1.3.0
    entrypoint: ["sh"]
  script:
    - cd /
    - pwd
    - /kaniko/executor
      --context $CI_PROJECT_DIR
      --no-push
      --target build-and-test
    - cd /
    - pwd
    - cd /busybox
    - ls
```
GitLab build log (note that the "directory not found" message appears in the log before the command that caused it):

```
Running with gitlab-runner 12.9.0 (4c96e5ad)
  on pdx-gitlab-3-runner-d7f85cf7f-stxzl _TrpUuzy
Preparing the "kubernetes" executor
Using Kubernetes namespace: gitlab-runners
Using Kubernetes executor with image gcr.io/kaniko-project/executor:debug-v1.3.0 ...
Preparing environment
Waiting for pod gitlab-runners/runner-trpuuzy-project-20222355-concurrent-0dhzst to be running, status is Pending
Waiting for pod gitlab-runners/runner-trpuuzy-project-20222355-concurrent-0dhzst to be running, status is Pending
Running on runner-trpuuzy-project-20222355-concurrent-0dhzst via pdx-gitlab-3-runner-d7f85cf7f-stxzl...
Getting source from Git repository
Fetching changes with git depth set to 50...
Initialized empty Git repository in /builds/preciselydata/cloud/pdx/product_definition_tool/.git/
Created fresh repository.
From https://gitlab.com/preciselydata/cloud/pdx/product_definition_tool
 * [new ref]    7d969d80cff9d699e51b126e0359937c2cb5a526 -> refs/pipelines/252507426
 * [new branch] testBusyBox -> origin/testBusyBox
 * [new tag]    1.0.1468.f3c0870e -> 1.0.1468.f3c0870e
 * [new tag]    1.0.1469.7169ceb0 -> 1.0.1469.7169ceb0
 * [new tag]    1.0.1470.111f6d88 -> 1.0.1470.111f6d88
 * [new tag]    1.0.1471.98614d1c -> 1.0.1471.98614d1c
 * [new tag]    1.0.1473.e1ce3ccd -> 1.0.1473.e1ce3ccd
 * [new tag]    1.0.1474.1087dd94 -> 1.0.1474.1087dd94
 * [new tag]    1.0.1475.3c04842a -> 1.0.1475.3c04842a
Checking out 7d969d80 as testBusyBox...
Skipping Git submodules setup
Restoring cache
Downloading artifacts
Running before_script and script
$ cd /
$ pwd
/
$ /kaniko/executor --context $CI_PROJECT_DIR --no-push --target build-and-test
INFO[0000] Resolved base name node:14.15-alpine3.12 to base
INFO[0000] Resolved base name base to build-and-test
INFO[0000] Using dockerignore file: /builds/preciselydata/cloud/pdx/product_definition_tool/.dockerignore
INFO[0000] Retrieving image manifest node:14.15-alpine3.12
INFO[0000] Retrieving image node:14.15-alpine3.12
E0206 22:12:25.176806 18 aws_credentials.go:77] while getting AWS credentials NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO[0006] Retrieving image manifest node:14.15-alpine3.12
INFO[0006] Retrieving image node:14.15-alpine3.12
INFO[0007] Built cross stage deps: map[]
INFO[0007] Retrieving image manifest node:14.15-alpine3.12
INFO[0007] Retrieving image node:14.15-alpine3.12
INFO[0007] Retrieving image manifest node:14.15-alpine3.12
INFO[0007] Retrieving image node:14.15-alpine3.12
INFO[0007] Executing 0 build triggers
INFO[0007] Skipping unpacking as no commands require it.
INFO[0007] LABEL type="build"
INFO[0007] Applying label type=build
INFO[0007] Storing source image from stage 0 at path /kaniko/stages/0
INFO[0008] Deleting filesystem...
INFO[0008] Base image from previous stage 0 found, using saved tar at path /kaniko/stages/0
INFO[0008] Executing 0 build triggers
INFO[0009] Skipping unpacking as no commands require it.
INFO[0009] LABEL type="build-and-test"
INFO[0009] Applying label type=build-and-test
INFO[0009] Skipping push to container registry due to --no-push flag
/busybox/sh: cd: line 145: can't cd to /busybox: No such file or directory
$ cd /
$ pwd
/
$ cd /busybox
Running after_script
time="2021-02-06T22:12:28Z" level=error msg="exec failed: container_linux.go:349: starting container process caused \"exec: \\\"sh\\\": executable file not found in $PATH\""
exec failed: container_linux.go:349: starting container process caused "exec: \"sh\": executable file not found in $PATH"
Uploading artifacts for failed job
ERROR: Job failed: command terminated with exit code 2
```
With commands like `ls` and `cp` no longer functioning due to the missing busybox directory, build artifacts cannot be copied back to the context directory path, which is where GitLab requires them to be in order to store them with its artifacts functionality.
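A side note on this failure mode: when `/busybox` disappears, every external command breaks, but shell builtins keep working because they live inside the already-running `sh` process. A small, generic illustration (scratch paths, not kaniko-specific; `mktemp` and `touch` are assumed available for setting up the demo):

```shell
# Builtins such as cd and echo survive the loss of external binaries.
# Globbing plus echo can stand in for a missing ls:
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt"
cd "$dir"
echo *    # lists a.txt b.txt using only shell builtins
```

This trick does not restore `cp`, though, so it cannot rescue the artifact copy step; it only helps with inspecting what is left of the filesystem.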
**Additional Information**

- Dockerfile:

  ```dockerfile
  FROM node:14.15-alpine3.12 AS base
  LABEL type="build"

  FROM base as build-and-test
  LABEL type="build-and-test"
  ```

- Build Context: nothing local required.
- Kaniko Image (fully qualified with digest): gcr.io/kaniko-project/executor@sha256:473d6dfb011c69f32192e668d86a47c0235791e7e857c870ad70c5e86ec07e8c
**Triage Notes for the Maintainers**

| Description | Yes/No |
|---|---|
| Please check if this is a new feature you are proposing | |
| Please check if the build works in docker but not in kaniko | |
| Please check if this error is seen when you use --cache flag | |
| Please check if your dockerfile is a multistage dockerfile | |
@rcollette Did you find any workaround for this problem?
@qalinn - I have not found a workaround.
@qalinn The workaround I have used looks like:
GitLab build job:

```yaml
# BUILD
build:
  stage: build
  interruptible: true
  extends: .kubernetes_runners
  <<: *except_master_and_prodfix
  variables:
    AWS_ACCESS_KEY_ID: $DEV_AWS_ACCESS_KEY_ID
    AWS_SECRET_ACCESS_KEY: $DEV_AWS_SECRET_ACCESS_KEY
  image:
    name: gcr.io/kaniko-project/executor:$KANIKO_EXECUTOR_VERSION
    entrypoint: [ "sh" ]
  script:
    # We cannot git merge from master here because the busybox used in kaniko
    # has neither git nor a package installer.
    # The docker command is not available in the kaniko image, so we have to
    # create the .docker/config.json file manually.
    - echo "$DOCKER_AUTH_CONFIG" > /kaniko/.docker/config.json
    # This builds an image but does not push it to the registry.
    - /kaniko/executor
      --context $CI_PROJECT_DIR
      --no-push
      --skip-unused-stages=true
      --cache=true
      --cache-repo=${CI_REGISTRY_IMAGE}/cache
      --log-timestamp=true
      --log-format=text
      --target build-and-test
    # Copy artifacts from the root, where the Dockerfile put them, back to the
    # project dir, because artifacts can only be captured from there.
    - cp -R /reports $CI_PROJECT_DIR
  artifacts:
    when: always
    expire_in: 30 days
    expose_as: Code coverage report
    paths:
      - reports/coverage/index.html
      - reports/coverage
    reports:
      coverage_report:
        coverage_format: cobertura
        path: reports/coverage/Cobertura.xml
      junit:
        - reports/unit-tests/*test-result.xml
```
Dockerfile (the build-and-test stage moves the reports folder to the root; this is the key to preservation):

```dockerfile
# With restore completed, now copy over the entire application.
FROM restore as build-and-test
LABEL type="build"

ARG SOLUTION_NAME="Precisely.Pdx.Api"
ARG VERSION="0.0.0"

WORKDIR /app_build/$SOLUTION_NAME
COPY $SOLUTION_NAME .

# We still want to capture coverage reports even if there was a coverage
# threshold or test error. Any artifacts we want to capture have to be moved
# to root, because the kaniko working directory is removed when kaniko
# finishes working.
RUN ./coverage.sh || flag=1 ; \
    mv reports / ; \
    exit $flag

RUN dotnet publish $SOLUTION_NAME.Web --output /dist --configuration Release --no-restore /p:Version=$VERSION
```
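The preservation trick above can be sketched without kaniko at all: the essential move is relocating artifacts out of the directory that is about to be deleted before the deletion happens. A minimal simulation (all paths are stand-ins, not the real kaniko paths):

```shell
# Stand-ins: $work plays the kaniko working/context directory,
# $safe plays / (a location that survives "Deleting filesystem...").
work=$(mktemp -d)
safe=$(mktemp -d)

mkdir -p "$work/reports/coverage"
echo "<html>coverage</html>" > "$work/reports/coverage/index.html"

mv "$work/reports" "$safe"   # the Dockerfile's:  mv reports /
rm -rf "$work"               # kaniko deleting the filesystem/context

# The artifacts survive the deletion and can be copied back for GitLab:
ls "$safe/reports/coverage"
```

The CI job's final `cp -R /reports $CI_PROJECT_DIR` then works because GitLab recreates the project directory when it prepares the artifact upload step.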
This solution works for us because, although copying files to the root of the GitLab runner instance might be unnerving, the runner is an ephemeral Kubernetes pod, so it will not conflict with other build jobs.
It would be more straightforward if the build context were preserved, perhaps in a known/controlled location, and perhaps behind a CLI option so as not to break existing behavior.
Is there any news on this? I ran into it with a three-stage build (common setup, build, final image) where I need to copy the context in the second stage, so the workaround above does not work for me.
If you're still looking for a solution to this, you could give my fork a try: https://github.com/mzihlmann/kaniko/releases/ It fixes this issue and a few more, mostly related to caching; if you have other issues you would like to see resolved, please let me know. I know this is not ideal, and I hope we can get the changes merged here eventually, but for now it's the best I can offer. If you like what you see, you can support me with a star. Thank you 🙇