buildkit-cache-dance

Stored cache is empty

Open sylver opened this issue 1 year ago • 19 comments

I am probably missing something, but following the example results in an empty cache:

      - name: Cache apk
        uses: actions/cache@v4
        id: cache-apk
        with:
          path: |
            var-cache-apk
          key: ${{ runner.os }}-apk-cache-${{ hashFiles('Dockerfile') }}
          save-always: true
          restore-keys: |
            ${{ runner.os }}-apk-cache-

      - name: Inject apk cache into Docker
        uses: reproducible-containers/[email protected]
        with:
          cache-map: |
            {
              "var-cache-apk": {
                "target": "/var/cache/apk",
                "id": "apk-cache"
              }
            }
          save-always: true
          skip-extraction: ${{ steps.cache-apk.outputs.cache-hit }}

[Screenshot 2024-06-02 at 18 18 38]

Dockerfile snippet using this cache:

RUN --mount=type=cache,id=apk-cache,target=/var/cache/apk \
<<EOT
set -e

echo "@edge http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories

apk update
apk upgrade
apk add ${PACKAGES}
EOT

I have the same issue with other caches I tried (pnpm store, build dist folders, etc.), so it isn't specific to this path. It seems related to the way I use the action, even though I followed the documentation.

sylver avatar Jun 02 '24 16:06 sylver

Same here.

yunhaoww avatar Jun 13 '24 15:06 yunhaoww

I cannot reproduce this, but there are a few reasons why this can happen:

  • buildkit-cache-dance may use a different buildx builder than your actual Docker build.
  • Options to --mount=type=cache are different. Using a different id, uid, gid or mode will result in getting a new empty cache. This doesn't seem to be the case here, though.
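
To rule out the first point, one option is to give the buildx setup step an id and pass the same builder name to both the cache dance and the build. This is a minimal sketch, not a confirmed fix: it assumes the `builder` inputs of buildkit-cache-dance v3 and docker/build-push-action and the `name` output of docker/setup-buildx-action. The step id, action versions and cache-map values are placeholders.

```yaml
# Sketch only: step id, versions and cache-map values are placeholders.
- name: Set up Docker Buildx
  id: setup-buildx
  uses: docker/setup-buildx-action@v3

- name: Inject apk cache into Docker
  uses: reproducible-containers/buildkit-cache-dance@v3
  with:
    # Use the builder created above rather than whichever one is the default.
    builder: ${{ steps.setup-buildx.outputs.name }}
    cache-map: |
      {
        "var-cache-apk": {
          "target": "/var/cache/apk",
          "id": "apk-cache"
        }
      }

- name: Build
  uses: docker/build-push-action@v6
  with:
    context: .
    # Same builder, so the build and the cache dance see the same cache mounts.
    builder: ${{ steps.setup-buildx.outputs.name }}
    push: false
```

For the second point, the `id` and `target` in cache-map have to match the `--mount=type=cache` options in the Dockerfile exactly.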

rkarp avatar Aug 13 '24 14:08 rkarp

I'm also experiencing this issue. Some relevant code...

Relevant parts of GitHub Actions Workflow:

      - run: docker context create builders

      - uses: docker/setup-buildx-action@d70bba72b1f3fd22344832f00baa16ece964efeb # [email protected]
        with:
          endpoint: builders

      - name: Ensure cache directories exist
        run: |
          mkdir -p ~/.nuget/packages
          mkdir -p ~/dist

      - name: 'Cache intermediate build artifacts'
        id: cache
        uses: actions/cache@13aacd865c20de90d75de3b17ebe84f7a17d57d2 # pin@v4
        with:
          path: |
            ~/.nuget/packages
            ~/dist
          key: ${{ runner.os }}-intermediate-${{ matrix.project }}-${{ hashFiles('src/**/Directory.Packages.props') }}

      - name: Show cache contents
        run: |
          ls -al ~/.nuget/packages
          ls -al ~/dist

      - name: inject cache into docker build
        uses: reproducible-containers/buildkit-cache-dance@5b6db76d1da5c8b307d5d2e0706d266521b710de # v3.1.2
        with:
          cache-map: |
            {
              "~/.nuget/packages": {
                "target": "/root/.nuget/packages",
                "id": "nuget"
              },
              "~/dist": {
                "target": "/dist",
                "id": "dist"
              }
            }
          skip-extraction: ${{ steps.cache.outputs.cache-hit }}

      - name: Build and push
        uses: docker/[email protected]
        with:
          context: ./src
          cache-from: type=gha
          cache-to: type=gha,mode=max
          file: src/Location/ZapMiddleware.Location.Api/Dockerfile
          push: false
          tags: test

Relevant parts of Dockerfile:

###################################################
#                   Build image                   #
###################################################
FROM mcr.microsoft.com/dotnet/sdk:8.0.201-bookworm-slim AS build
WORKDIR /src

# Check cache mounts
RUN --mount=type=cache,id=nuget,target=/root/.nuget/packages ls -al /root/.nuget/packages
RUN --mount=type=cache,id=dist,target=/dist ls -al /dist

# Copy csproj and restore as distinct layers
COPY ["Location/ZapMiddleware.Location.Api/ZapMiddleware.Location.Api.csproj", "Location/ZapMiddleware.Location.Api/"]
COPY "Directory.Build.props" .
COPY "Directory.Packages.props" .
RUN --mount=type=cache,id=nuget,target=/root/.nuget/packages \
	--mount=type=cache,id=dist,target=/dist \
	dotnet restore "Location/ZapMiddleware.Location.Api"

# Check cache mounts
RUN --mount=type=cache,id=nuget,target=/root/.nuget/packages ls -al /root/.nuget/packages
RUN --mount=type=cache,id=dist,target=/dist ls -al /dist/intermediates/src

# Copy everything else and build app
COPY . .
WORKDIR "/src/Location/ZapMiddleware.Location.Api"
RUN --mount=type=cache,id=nuget,target=/root/.nuget/packages \
	--mount=type=cache,id=dist,target=/dist \
	dotnet publish -c Release -o /app
RUN rm /app/appsettings*.json

# Check cache mounts
RUN --mount=type=cache,id=nuget,target=/root/.nuget/packages ls -al /root/.nuget/packages
RUN --mount=type=cache,id=dist,target=/dist ls -al /dist/intermediates/src

Relevant parts of docker build logs: The cache directories are populated with the expected package and build stage artefacts

...
2024-08-31T08:17:01.3530775Z #14 [build  2/19] WORKDIR /src
2024-08-31T08:17:01.3532385Z #14 extracting sha256:c7c43fea98428ca37da3bb3c9e267aba534255e5e587f604811e51ff3adf99a6 0.0s done
2024-08-31T08:17:01.3533141Z #14 DONE 10.6s
2024-08-31T08:17:01.3534565Z #15 [build  3/19] RUN --mount=type=cache,id=nuget,target=/root/.nuget/packages ls -al /root/.nuget/packages
2024-08-31T08:17:01.3535742Z #15 0.063 total 12
2024-08-31T08:17:01.3536341Z #15 0.063 drwxr-xr-x 2 root root 4096 Aug 31 08:16 .
2024-08-31T08:17:01.3537024Z #15 0.063 drwxr-xr-x 1 root root 4096 Aug 31 08:17 ..
2024-08-31T08:17:01.3537760Z #15 0.063 -rw-r--r-- 1 root root   24 Aug 31 08:16 buildstamp
2024-08-31T08:17:08.1395103Z #15 DONE 7.0s
2024-08-31T08:17:08.2698944Z #16 [build  4/19] RUN --mount=type=cache,id=dist,target=/dist ls -al /dist
2024-08-31T08:17:08.2700519Z #16 0.070 total 12
2024-08-31T08:17:08.2701584Z #16 0.070 drwxr-xr-x 2 root root 4096 Aug 31 08:16 .
2024-08-31T08:17:08.2702859Z #16 0.070 drwxr-xr-x 1 root root 4096 Aug 31 08:17 ..
2024-08-31T08:17:08.2704233Z #16 0.070 -rw-r--r-- 1 root root   24 Aug 31 08:16 buildstamp
2024-08-31T08:17:08.2705144Z #16 DONE 0.1s
2024-08-31T08:17:08.2706732Z #17 [build  5/19] COPY [Location/ZapMiddleware.Location.Api/ZapMiddleware.Location.Api.csproj, Location/ZapMiddleware.Location.Api/]
2024-08-31T08:17:08.2708241Z #17 DONE 0.0s
2024-08-31T08:17:08.5006252Z #18 [build  6/19] COPY Directory.Build.props .
2024-08-31T08:17:08.5007137Z #18 DONE 0.0s
2024-08-31T08:17:08.5007706Z #19 [build  7/19] COPY Directory.Packages.props .
2024-08-31T08:17:08.5008310Z #19 DONE 0.0s
2024-08-31T08:17:08.5010318Z #20 [build  8/19] RUN --mount=type=cache,id=nuget,target=/root/.nuget/packages 	--mount=type=cache,id=dist,target=/dist 	dotnet restore "Location/ZapMiddleware.Location.Api"
2024-08-31T08:17:09.1028290Z #20 0.753   Determining projects to restore...
...
2024-08-31T08:17:14.7344732Z #20 6.384   Restored /src/Location/ZapMiddleware.Location.Api/ZapMiddleware.Location.Api.csproj (in 5.24 sec).
2024-08-31T08:17:15.4331872Z #20 DONE 7.1s
2024-08-31T08:17:15.5401299Z #21 [build  9/19] RUN --mount=type=cache,id=nuget,target=/root/.nuget/packages ls -al /root/.nuget/packages
2024-08-31T08:17:15.5402742Z #21 0.076 total 164
2024-08-31T08:17:15.5403791Z #21 0.076 drwxr-xr-x 40 root root 4096 Aug 31 08:17 .
2024-08-31T08:17:15.5405044Z #21 0.076 drwxr-xr-x  1 root root 4096 Aug 31 08:17 ..
2024-08-31T08:17:15.5406376Z #21 0.076 -rw-r--r--  1 root root   24 Aug 31 08:16 buildstamp
2024-08-31T08:17:15.5407828Z #21 0.076 drwxr-xr-x  3 root root 4096 Aug 31 08:17 humanizer.core
2024-08-31T08:17:15.5409709Z #21 0.076 drwxr-xr-x  3 root root 4096 Aug 31 08:17 microsoft.bcl.asyncinterfaces
...
2024-08-31T08:17:15.5459341Z #21 DONE 0.1s
2024-08-31T08:17:15.7726542Z #22 [build 10/19] RUN --mount=type=cache,id=dist,target=/dist ls -al /dist/intermediates/src
2024-08-31T08:17:15.7727581Z #22 0.067 total 12
2024-08-31T08:17:15.7728241Z #22 0.067 drwxr-xr-x 3 root root 4096 Aug 31 08:17 .
2024-08-31T08:17:15.7728995Z #22 0.067 drwxr-xr-x 3 root root 4096 Aug 31 08:17 ..
2024-08-31T08:17:15.7730578Z #22 0.067 drwxr-xr-x 3 root root 4096 Aug 31 08:17 Location
2024-08-31T08:17:15.7731171Z #22 DONE 0.1s
2024-08-31T08:17:15.7731589Z #23 [build 11/19] COPY . .
2024-08-31T08:17:19.8166214Z #23 DONE 4.2s
2024-08-31T08:17:19.9693387Z #24 [build 12/19] WORKDIR /src/Location/ZapMiddleware.Location.Api
2024-08-31T08:17:20.0403629Z #24 DONE 0.2s
2024-08-31T08:17:20.1930552Z #25 [build 13/19] RUN --mount=type=cache,id=nuget,target=/root/.nuget/packages 	--mount=type=cache,id=dist,target=/dist 	dotnet publish -c Release -o /app
2024-08-31T08:17:20.2861940Z #25 0.243 MSBuild version 17.9.4+90725d08d for .NET
2024-08-31T08:17:20.8896105Z #25 0.847   Determining projects to restore...
...
2024-08-31T08:18:07.1248271Z #25 DONE 47.1s
2024-08-31T08:18:07.3202472Z #26 [build 14/19] RUN rm /app/appsettings*.json
2024-08-31T08:18:07.3205622Z #26 DONE 0.1s
2024-08-31T08:18:07.3208466Z #27 [build 15/19] RUN --mount=type=cache,id=nuget,target=/root/.nuget/packages ls -al /root/.nuget/packages
2024-08-31T08:18:07.3209947Z #27 0.098 total 1128
2024-08-31T08:18:07.3210553Z #27 0.098 drwxr-xr-x 277 root root 20480 Aug 31 08:17 .
2024-08-31T08:18:07.3211282Z #27 0.098 drwxr-xr-x   1 root root  4096 Aug 31 08:18 ..
...
2024-08-31T08:18:07.3214067Z #27 0.098 drwxr-xr-x   3 root root  4096 Aug 31 08:17 asp.versioning.mvc
2024-08-31T08:18:07.3215010Z #27 0.098 drwxr-xr-x   3 root root  4096 Aug 31 08:17 asp.versioning.mvc.apiexplorer
2024-08-31T08:18:07.3215881Z #27 0.098 -rw-r--r--   1 root root    24 Aug 31 08:16 buildstamp
2024-08-31T08:18:07.3216753Z #27 0.098 drwxr-xr-x   3 root root  4096 Aug 31 08:17 communitytoolkit.diagnostics
...
2024-08-31T08:18:07.4769263Z #27 DONE 0.1s
2024-08-31T08:18:07.4770284Z #28 [build 16/19] RUN --mount=type=cache,id=dist,target=/dist ls -al /dist/intermediates/src
2024-08-31T08:18:07.4771487Z #28 0.088 total 28
2024-08-31T08:18:07.4772414Z #28 0.088 drwxr-xr-x  7 root root 4096 Aug 31 08:17 .
2024-08-31T08:18:07.4773753Z #28 0.088 drwxr-xr-x  3 root root 4096 Aug 31 08:17 ..
2024-08-31T08:18:07.4774639Z #28 0.088 drwxr-xr-x  6 root root 4096 Aug 31 08:17 Location
2024-08-31T08:18:07.4775681Z #28 0.088 drwxr-xr-x 23 root root 4096 Aug 31 08:17 Shared
2024-08-31T08:18:07.4776973Z #28 0.088 drwxr-xr-x  3 root root 4096 Aug 31 08:17 ZapMiddleware.AppCommon.Queues
2024-08-31T08:18:07.4778776Z #28 0.088 drwxr-xr-x  3 root root 4096 Aug 31 08:17 ZapMiddleware.Intl
2024-08-31T08:18:07.4780501Z #28 0.088 drwxr-xr-x  3 root root 4096 Aug 31 08:17 ZapMiddleware.Util
2024-08-31T08:18:07.5510474Z #28 DONE 0.1s

Extracting cache: Extraction seems to happen, but the resulting cache that gets uploaded to GHA is empty

mrfelton avatar Aug 31 '24 08:08 mrfelton

I resolved my issue by ensuring that the local directory doesn't exist before running the cache dance, and changing the path to something relative to the checkout.
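
One possible reading of that fix, as a sketch: drop the `mkdir -p` step and use directories relative to the workspace instead of `~`. The directory names `nuget-cache` and `dist-cache` are hypothetical, and this is not the exact configuration that ended up working.

```yaml
# Sketch only: "nuget-cache" and "dist-cache" are hypothetical names,
# resolved relative to the checkout (workspace) instead of "~".
- name: Cache intermediate build artifacts
  id: cache
  uses: actions/cache@v4
  with:
    path: |
      nuget-cache
      dist-cache
    key: ${{ runner.os }}-intermediate-${{ hashFiles('src/**/Directory.Packages.props') }}

# No "mkdir -p" step: on a cache miss the directories do not exist
# when the cache dance starts.
- name: Inject cache into docker build
  uses: reproducible-containers/buildkit-cache-dance@v3
  with:
    cache-map: |
      {
        "nuget-cache": { "target": "/root/.nuget/packages", "id": "nuget" },
        "dist-cache": { "target": "/dist", "id": "dist" }
      }
    skip-extraction: ${{ steps.cache.outputs.cache-hit }}
```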

mrfelton avatar Sep 02 '24 11:09 mrfelton

I'm also experiencing the same issue. Every day we create a cache of the packages installed by bundle install. About once every 3-4 days, buildkit-cache-dance/extract fails to extract the cache mount, so an empty cache is saved.

Relevant parts of GitHub Actions Workflow:

jobs:
  create-bundle-cache:
    timeout-minutes: 45
    runs-on:
      labels: Ubuntu-8core
    steps:
      - uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: docker-build-push-action
        uses: docker/[email protected]
        with:
          context: .
          file: Dockerfile
          platforms: linux/amd64
          target: production
          no-cache: true
          push: false
          provenance: false
          tags: |
            bundle-cache-main
      - name: mkdir for buildkit-cache-dance/extract
        run: |
          mkdir -p var-cache-bundle scratch
      - name: extract var-cache-bundle from build cache
        uses: reproducible-containers/buildkit-cache-dance/[email protected]
        with:
          cache-source: 'var-cache-bundle'
          cache-target: '/var/cache/bundle'
          scratch-dir: scratch

      # save local mount-cache to gha
      - name: Cache var-cache-bundle
        id: var-cache-bundle
        uses: actions/cache/save@v4
        with:
          path: var-cache-bundle
          key: var-cache-bundle-main-${{ github.run_id }}

kynefuk avatar Sep 25 '24 02:09 kynefuk

Try disabling the job summary generated by build-push-action.

kkopachev avatar Jan 29 '25 21:01 kkopachev

I tried disabling job summary, and it had no effect.

Then I tried invalidating the layer cache, and all of a sudden it started working.

I always delete the cache from /actions/caches manually.

Hypothesis: If you try to introduce this action after a layer cache has been produced, you might get unexpected results?

Aside: Using a generic cache key with this action seems risky since it'll waste time running even when there's already a layer cache. The cache key should probably match however layer caching works to avoid this.

staticaland avatar Feb 03 '25 09:02 staticaland

I have the same issue in an Azure DevOps pipeline. I've been able to have the cache extracted two or three times (out of more than 40 attempts). The injection works well.

@mrfelton

"I resolved my issue by ensuring that the local directory doesn't exist before running the cache dance, and changing the path to something relative to the checkout."

I'm sure that the local directory doesn't exist before running the cache dance. What do you mean by "changing the path to something relative to the checkout"? The path of what? Must the local folder be the same for injection and extraction? Or can I specify different folders in --cache-map?

escherstair avatar Mar 20 '25 15:03 escherstair

I'll add one thing that seems to be the reason the stored cache is empty (as far as I can tell now):

  • cache injection and cache extraction must not be used together:
    • if there is a cache miss (i.e., no cache exists), skip cache injection and do cache extraction at the end. The cache will be extracted properly populated
    • if there is a cache hit (i.e., cache already exists), do cache injection and skip cache extraction at the end. The cache will be properly injected

If you do cache injection, it seems that if you call cache extraction afterwards, the cache is extracted empty (or, maybe, equal to the cache injected?)

escherstair avatar Mar 24 '25 14:03 escherstair

Does anyone have a working fix for this?

anthonyalayo avatar May 08 '25 09:05 anthonyalayo

Hi @anthonyalayo, I cannot answer for other people, but in my case everything works when I use EITHER cache injection OR cache extraction, but not both together, as I wrote in my post above.

escherstair avatar May 08 '25 11:05 escherstair

@escherstair just to make sure, is that equivalent to the example in the README? Using only skip-extraction: ${{ steps.cache-apk.outputs.cache-hit }}?

anthonyalayo avatar May 08 '25 17:05 anthonyalayo

I've fiddled with this a lot, and it's definitely not working. I've added print statements to my docker build that show the next cache and pnpm cache are populated (in the hundreds of megabytes).

Then I dumped the details of the copy in the post inject step:

Post job cleanup.
Cache extraction configuration:
Source: next-cache
Target: /app/.next/cache
Mount args: type=cache,target=/app/.next/cache
Dockerfile content:
FROM ghcr.io/containerd/busybox:latest
COPY buildstamp buildstamp
RUN --mount=type=cache,target=/app/.next/cache     echo "Contents of /app/.next/cache:" &&     ls -la /app/.next/cache &&     mkdir -p /var/dance-cache/next-cache     && cp -p -R /app/.next/cache/. /var/dance-cache/next-cache &&     echo "Contents of /var/dance-cache/next-cache:" &&     ls -la /var/dance-cache/next-cache
Starting cache extraction for source: next-cache
Target path: /app/.next/cache
Checking extracted content in scratch dir:
Extracted files: [ 'buildstamp' ]

@aminya @AkihiroSuda mind stepping in here? Did this break? I see people doing downgrades to v2.1.4: https://github.com/youtalk/autoware/pull/58/files

anthonyalayo avatar May 08 '25 20:05 anthonyalayo

I've just come across this issue, and managed to resolve it by downgrading to the old v2.1.4 revision. I have no clue what is actually the cause of the issue but since I only have 1 folder to cache and don't need the features present in v3 and beyond, this is an acceptable solution for me.

For reference, my workflow file: https://github.com/MxBlu/choretracker/blob/main/.github/workflows/vcpkg-build.yaml

MxBlu avatar May 13 '25 07:05 MxBlu

I've banged my head against some walls with a similar issue recently. In my case, even downgrading to v2.1.4 didn't help. It seems that the layer cache conflicts with cache mounts in some way (even if invalidated). The only thing that made it work was removing the cache-from and cache-to args (mine were set to type=registry and not GHA as in this issue and the README examples).

I tried to go a bit deeper on that - it looks like when using both layer cache and cache mounts, according to docker buildx du --filter type=exec.cachemount --verbose, the cache mounts are deleted and new ones are created when the buildkit-cache-dance container goes up.
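
To reproduce that observation in CI, the same command can be run before and after the build step and the two listings compared. A sketch, assuming the builder set up earlier in the job is the currently selected one; the step names are placeholders.

```yaml
# Sketch only: diagnostic steps placed around the existing build step.
- name: Cache mounts before build
  run: docker buildx du --filter type=exec.cachemount --verbose

# ... existing docker/build-push-action step goes here ...

- name: Cache mounts after build
  if: always()
  run: docker buildx du --filter type=exec.cachemount --verbose
```

If the record IDs differ between the two listings, the cache mounts were recreated during the job.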

It's kinda troubling that both cache mechanisms can't work together properly. I hope my findings at least help others.

psypuff avatar Jun 26 '25 12:06 psypuff

I was struggling to get it working. It would be helpful if the action gave more insight at runtime into what it's doing, as this output is a bit opaque/incomplete:

Post job cleanup.
FROM ghcr.io/containerd/busybox:latest
COPY buildstamp buildstamp
RUN --mount=type=cache,target=/ccache     mkdir -p /var/dance-cache/     && cp -p -R /ccache/. /var/dance-cache/ || true

It's clearly mounting my cache directory and copying the contents to /var/dance-cache, but past that point you have to look at the source to follow what is going on:

https://github.com/reproducible-containers/buildkit-cache-dance/blob/5b81f4d29dc8397a7d341dba3aeecc7ec54d6361/src/extract-cache.ts#L28-L46

At least for me, the crux of it was that the results are copied to the cacheSource location, which turns out to be the keys of the cache-map dict. So this ended up working for me:

      - uses: actions/cache@v4
        id: cache
        with:
          path: ccache-dir
          key: ccache-${{ matrix.build }}-${{ matrix.arch }}

      - name: Restore Docker cache mounts
        uses: reproducible-containers/buildkit-cache-dance@v3
        with:
          builder: ${{ steps.setup-buildx.outputs.name }}
          cache-map: '{ "ccache-dir": { "target": "/ccache", "id": "ccache" } }'
          save-always: true

Because ccache is a progressive cache that is self-managing in terms of size (and my cache is relatively small anyway), it's worth it to me to be updating on each run rather than on a schedule.

mikepurvis avatar Jul 27 '25 01:07 mikepurvis

@mrfelton, can you explain what you meant by:

  1. "ensuring that the local directory doesn't exist before running the cache dance" and
  2. "changing the path to something relative to the checkout"?

and share your setup that worked?

UPD: Nothing worked no matter what I tried until I reread @rkarp's comment and set a builder for buildkit-cache-dance:

      - name: Set up Docker Buildx
        id: setup-buildx
        uses: docker/setup-buildx-action@v3

.....

      - name: Inject rust-build-cache
        uses: reproducible-containers/buildkit-cache-dance@v3
        with:
          builder: ${{ steps.setup-buildx.outputs.name }}

StashOfCode avatar Aug 11 '25 10:08 StashOfCode

If the cache mount path in your Dockerfile has a trailing slash, it will not work. Change:

RUN --mount=type=cache,target=/usr/src/app/target/

to

RUN --mount=type=cache,target=/usr/src/app/target

timakro avatar Aug 20 '25 11:08 timakro

"The only thing that made it work was removing the cache-from and cache-to args (mine were set to type=registry and not GHA as in this issue and the README examples)."

Mine is set to type=registry too, but my issue is that the runner runs out of disk space. We are running self-hosted runners. I'm wondering how much that affects the ability to extract the buildx cache and save it to the GHA cache separately.

What I'm failing to find anywhere is whether my intended workflow is even possible:

  • restore cache (if exists)
  • start buildx runner (docker in docker)
  • load cache to buildx runner
  • start docker build
  • buildx mounts cache during dockerfile build by id with the --mount directive
  • build finishes
  • whole docker image saved to type=registry as cache for all docker layers
  • export buildx cache for the step that is using --mount to a folder
  • save buildx cache that was exported to gha cache

I just want the docker image cache to at least restore some part of the build because, in our case, R packages take 10 minutes to build. Having that invalidated, with no fallback, whenever anything in the Dockerfile changes is a slog.
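
For concreteness, the steps above would map onto a workflow roughly like the sketch below. It only shows the intended shape; an earlier comment in this thread reports that a type=registry layer cache interfered with cache mounts, so it is not a confirmed working setup. The registry ref, cache key, directory name `r-pkg-cache`, mount id and target path are all hypothetical.

```yaml
# Sketch only: registry ref, cache key, directory name, mount id and
# target path are hypothetical.
steps:
  - uses: actions/checkout@v4

  # Restore cache (if it exists); the post step saves it at job end.
  - name: Restore R package cache
    id: cache
    uses: actions/cache@v4
    with:
      path: r-pkg-cache
      key: r-pkg-cache-${{ hashFiles('Dockerfile') }}
      restore-keys: |
        r-pkg-cache-

  # Start the buildx builder.
  - name: Set up Docker Buildx
    id: setup-buildx
    uses: docker/setup-buildx-action@v3

  # Load the cache into the builder; the post step exports the cache
  # mount back into r-pkg-cache before actions/cache saves it.
  - name: Inject cache mount
    uses: reproducible-containers/buildkit-cache-dance@v3
    with:
      builder: ${{ steps.setup-buildx.outputs.name }}
      cache-map: |
        {
          "r-pkg-cache": { "target": "/root/.cache/R", "id": "r-pkgs" }
        }
      skip-extraction: ${{ steps.cache.outputs.cache-hit }}

  # Build; the Dockerfile is assumed to use
  #   RUN --mount=type=cache,id=r-pkgs,target=/root/.cache/R ...
  # Layer cache goes to the registry.
  - name: Build
    uses: docker/build-push-action@v6
    with:
      context: .
      builder: ${{ steps.setup-buildx.outputs.name }}
      cache-from: type=registry,ref=registry.example.com/app:buildcache
      cache-to: type=registry,ref=registry.example.com/app:buildcache,mode=max
      push: false
```

Post steps run in reverse order of the main steps, so the cache dance extraction runs before actions/cache saves the directory.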

SB-MFJ avatar Aug 28 '25 21:08 SB-MFJ