
[Bug]: Intermittent CI failures related to docker mount points

Open diox opened this issue 1 year ago • 3 comments

What happened?

CI intermittently fails with:

 > [pip_development 1/1] RUN     --mount=type=bind,source=./requirements/prod.txt,target=/data/olympia/requirements/prod.txt     --mount=type=bind,source=./requirements/dev.txt,target=/data/olympia/requirements/dev.txt     --mount=type=bind,source=package.json,target=/data/olympia/package.json     --mount=type=bind,source=package-lock.json,target=/data/olympia/package-lock.json     --mount=type=cache,target=/deps/cache/,uid=9500,gid=9500     --mount=type=cache,target=/deps/cache/npm,uid=9500,gid=9500 <<EOF (python3 -m pip install --progress-bar=off --no-deps --exists-action=w -r requirements/prod.txt...):
0.118 runc run failed: unable to start container process: error during container init: error mounting "/var/lib/docker/tmp/buildkit-mount3294243076" to rootfs at "/deps/cache/npm": create mount destination for /deps/cache/npm mount: mkdirat /var/lib/docker/buildkit/executor/iks2wxgn6qo73mdbex7pr43w7/rootfs/deps/cache/npm: file exists
------
Dockerfile:115
--------------------
 114 |     
 115 | >>> RUN \
 116 | >>>     # Files required to install pip dependencies
 117 | >>>     --mount=type=bind,source=./requirements/prod.txt,target=${HOME}/requirements/prod.txt \
 118 | >>>     --mount=type=bind,source=./requirements/dev.txt,target=${HOME}/requirements/dev.txt \
 119 | >>>     # Files required to install npm dependencies
 120 | >>>     --mount=type=bind,source=package.json,target=${HOME}/package.json \
 121 | >>>     --mount=type=bind,source=package-lock.json,target=${HOME}/package-lock.json \
 122 | >>>     # Mounts for caching dependencies
 123 | >>>     --mount=type=cache,target=${PIP_CACHE_DIR},uid=${OLYMPIA_UID},gid=${OLYMPIA_UID} \
 124 | >>>     --mount=type=cache,target=${NPM_CACHE_DIR},uid=${OLYMPIA_UID},gid=${OLYMPIA_UID} \
 125 | >>> <<EOF
 126 | >>> ${PIP_COMMAND} install --progress-bar=off --no-deps --exists-action=w -r requirements/prod.txt
 127 | >>> ${PIP_COMMAND} install --progress-bar=off --no-deps --exists-action=w -r requirements/dev.txt
 128 | >>> npm install ${NPM_ARGS} --no-save
 129 | >>> EOF
 130 |     
--------------------
ERROR: failed to solve: process "/bin/bash -xue -c ${PIP_COMMAND} install --progress-bar=off --no-deps --exists-action=w -r requirements/prod.txt\n${PIP_COMMAND} install --progress-bar=off --no-deps --exists-action=w -r requirements/dev.txt\nnpm install ${NPM_ARGS} --no-save\n" did not complete successfully: exit code: 1
896  /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
github.com/moby/buildkit/executor/runcexecutor.exitError
	/root/build-deb/engine/vendor/github.com/moby/buildkit/executor/runcexecutor/executor.go:374
github.com/moby/buildkit/executor/runcexecutor.(*runcExecutor).Run
	/root/build-deb/engine/vendor/github.com/moby/buildkit/executor/runcexecutor/executor.go:335
github.com/moby/buildkit/solver/llbsolver/ops.(*ExecOp).Exec
	/root/build-deb/engine/vendor/github.com/moby/buildkit/solver/llbsolver/ops/exec.go:479
github.com/moby/buildkit/solver.(*sharedOp).Exec.func2
	/root/build-deb/engine/vendor/github.com/moby/buildkit/solver/jobs.go:975
github.com/moby/buildkit/util/flightcontrol.(*call[...]).run
	/root/build-deb/engine/vendor/github.com/moby/buildkit/util/flightcontrol/flightcontrol.go:121
sync.(*Once).doSlow
	/usr/local/go/src/sync/once.go:74
sync.(*Once).Do
	/usr/local/go/src/sync/once.go:65
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1650

1798 v0.17.1 /usr/libexec/docker/cli-plugins/docker-buildx buildx bake --file docker-bake.hcl --file .env --progress auto --metadata-file buildx-bake-metadata.json
google.golang.org/grpc.(*ClientConn).Invoke
	google.golang.org/[email protected]/call.go:35
github.com/moby/buildkit/api/services/control.(*controlClient).Solve
	github.com/moby/[email protected]/api/services/control/control.pb.go:2261
github.com/moby/buildkit/client.(*Client).solve.func2
	github.com/moby/[email protected]/client/solve.go:269
golang.org/x/sync/errgroup.(*Group).Go.func1
	golang.org/x/[email protected]/errgroup/errgroup.go:78
runtime.goexit
	runtime/asm_amd64.s:1695

896  /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
github.com/moby/buildkit/solver.(*edge).execOp
	/root/build-deb/engine/vendor/github.com/moby/buildkit/solver/edge.go:937
github.com/moby/buildkit/solver/internal/pipe.NewWithFunction.func2
	/root/build-deb/engine/vendor/github.com/moby/buildkit/solver/internal/pipe/pipe.go:82
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1650

1798 v0.17.1 /usr/libexec/docker/cli-plugins/docker-buildx buildx bake --file docker-bake.hcl --file .env --progress auto --metadata-file buildx-bake-metadata.json
github.com/moby/buildkit/client.(*Client).solve.func2
	github.com/moby/[email protected]/client/solve.go:285
golang.org/x/sync/errgroup.(*Group).Go.func1
	golang.org/x/[email protected]/errgroup/errgroup.go:78

896  /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
github.com/moby/buildkit/solver/llbsolver/ops.(*ExecOp).Exec
	/root/build-deb/engine/vendor/github.com/moby/buildkit/solver/llbsolver/ops/exec.go:500
github.com/moby/buildkit/solver.(*sharedOp).Exec.func2
	/root/build-deb/engine/vendor/github.com/moby/buildkit/solver/jobs.go:975

make[1]: *** [Makefile-os:104: docker_build_web] Error 1
make: *** [Makefile-os:119: docker_pull_or_build] Error 2
make[1]: Leaving directory '/home/runner/work/addons-server/addons-server'
Error: Process completed with exit code 2.

What did you expect to happen?

No failures :)

Is there an existing issue for this?

  • [X] I have searched the existing issues

┆Issue is synchronized with this Jira Task

diox, Sep 20 '24 13:09

On Matrix @KevinMind said:

I think I know what happened. pip_development and pip_production can now run in parallel, which makes the build faster.

But that creates a race condition, because each stage tries to create a cache mount at the same path in the image during the build.
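
To make that hypothesis concrete: in the failing step, the npm cache target (`/deps/cache/npm`) sits inside the pip cache target (`/deps/cache/`), so the npm mount point has to be created inside a cache that both stages share by default. The sketch below is untested against this repository and shows one way the collision might be avoided: sibling cache targets plus `sharing=locked`, which makes a second writer wait for the first. The base image, the `/deps/cache/pip` path, and the trimmed-down commands are assumptions for illustration; only the stage names and the requirements paths come from this issue.

```dockerfile
# syntax=docker/dockerfile:1
# Sketch only. The base image and the /deps/cache/pip path are placeholders.
FROM python:3.12-slim AS base
ENV PIP_CACHE_DIR=/deps/cache/pip
WORKDIR /data/olympia

# The two cache targets are siblings, so neither mount point has to be
# created inside the other (shared) cache directory.
# sharing=locked serializes concurrent writers on each cache.
FROM base AS pip_production
RUN \
    --mount=type=bind,source=./requirements/prod.txt,target=/data/olympia/requirements/prod.txt \
    --mount=type=cache,target=/deps/cache/pip,sharing=locked \
    --mount=type=cache,target=/deps/cache/npm,sharing=locked \
    pip install --progress-bar=off --no-deps -r requirements/prod.txt

# The npm install step is left out because this placeholder base image
# does not ship Node.js.
FROM base AS pip_development
RUN \
    --mount=type=bind,source=./requirements/dev.txt,target=/data/olympia/requirements/dev.txt \
    --mount=type=cache,target=/deps/cache/pip,sharing=locked \
    --mount=type=cache,target=/deps/cache/npm,sharing=locked \
    pip install --progress-bar=off --no-deps -r requirements/dev.txt
```

With `sharing=locked` the two stages no longer write to the caches at the same time, which gives up part of the parallel speed-up. Keeping the default `sharing=shared` and only un-nesting the targets may be enough, but that would need to be confirmed in CI.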

diox, Sep 20 '24 13:09

  • How slow would it be to re-introduce non-parallel pip builds?
  • Could we avoid installing dev stuff like that? We need it for assets (see also https://github.com/mozilla/addons/issues/2000)
  • Could we install dependencies in parallel overall (with pip or uv)?

diox, Sep 24 '24 14:09

We'll see how often this comes up to make a decision.

diox, Sep 24 '24 14:09