[ECR] [request]: support cache manifest

Open lifeofguenter opened this issue 4 years ago • 37 comments

Would be great if ECR could support cache-manifest (see: https://medium.com/titansoft-engineering/docker-build-cache-sharing-on-multi-hosts-with-buildkit-and-buildx-eb8f7005918e)

lifeofguenter avatar May 05 '20 12:05 lifeofguenter

BuildKit 0.8 will default to using an OCI media type for its caches (see https://github.com/moby/buildkit/pull/1746) which I assume should make this work, but I haven't tested it myself.

TBBle avatar Oct 27 '20 02:10 TBBle

It still doesn't work with the recently released buildkit 0.8.0. It can write the layers and config, but it is unable to upload the manifest to ECR:

=> ERROR exporting cache                                                                                                     5.4s
 => => preparing build cache for export                                                                                       0.2s
 => => writing layer sha256:0d48cc65d93fe2ee9877959ff98ebc98b95fe4b2fc467ff50f27103c1c5d6973                                  0.3s
 => => writing layer sha256:2ade286d53f2e045413601ca0e3790de3792ea34abd3d025cd2cd9c3cb5231de                                  0.3s
 => => writing layer sha256:64befcf53942ba04c144cde468548885d497e238001e965e983e39eb947860c2                                  0.3s
 => => writing layer sha256:7415f0cbea8739c1bf353568b16ac74a9cfbc0b36327602e3a025abf919a38a6                                  0.3s
 => => writing layer sha256:76a1f73c618c30eb1b1d90cf043fe3f855a1cce922d1fb47458defd3dbe1c783                                  0.3s
 => => writing layer sha256:8674739c0ada3e834b816667d26dd185aa5ea089f33701f11a05b7be03f43026                                  0.3s
 => => writing layer sha256:9dc80bcd2805b2a441bd69bc9468df2e81994239e34879567bed7bdef6cb605d                                  0.3s
 => => writing layer sha256:cbdbe7a5bc2a134ca8ec91be58565ec07d037386d1f1d8385412d224deafca08                                  0.3s
 => => writing layer sha256:ce4e6de84945ab498f65d16920c9b801dfea3792871e44f89e6438e232a690b3                                  0.3s
 => => writing layer sha256:d46583c5d4c69b34cb46866838d68f53a38686dc7f2d1347ae0f252e8eb0ed4c                                  0.2s
 => => writing config sha256:33c76a0f8a74a06e461926d8a8d1845371c0cf9e86753db2483a4873aede8889                                 2.0s
 => => writing manifest sha256:0f69a7e6626f6a24a0a95ed915613ebdf9459280d4986879480d87e34849aea8                               0.6s
------
 > importing cache manifest from XXXXXXXXXXXX.dkr.ecr.us-west-2.amazonaws.com/test-repo:buildcache:
------
------
 > exporting cache:
------
error: failed to solve: rpc error: code = Unknown desc = error writing manifest blob: failed commit on ref "sha256:0f69a7e6626f6a24a0a95ed915613ebdf9459280d4986879480d87e34849aea8": unexpected status: 400 Bad Request

aleks-fofanov avatar Dec 03 '20 09:12 aleks-fofanov

I am seeing the same error on buildkit 0.8.0, even when setting oci-mediatypes explicitly to true:

--export-cache type=registry,ref=${REPO}:buildcache,oci-mediatypes=true

 => ERROR exporting cache                                                                                                     1.4s
 => => preparing build cache for export                                                                                       0.0s
 => => writing layer sha256:757d39990544d20fbebf7a88e29a5dd2bb6a4fdb116d67df9fe8056843da794d                                  0.1s
 => => writing layer sha256:7597eaba0060104f2bd4f3c46f0050fcf6df83066870767af41c2d7696bb33b2                                  0.1s
 => => writing config sha256:0e308fd4eee4cae672eee133cbd77ef7c197fa5d587110b59350a99b289f7000                                 0.8s
 => => writing manifest sha256:8eb142b16e0ec25db4517f2aecff795cca2b1adbe07c32f5c571efc5c808cbcd                               0.3s
------
 > importing cache manifest from xxx.dkr.ecr.us-east-1.amazonaws.com/errm/test:buildcache:
------
------
 > exporting cache:
------
error: failed to solve: rpc error: code = Unknown desc = error writing manifest blob: failed commit on ref "sha256:8eb142b16e0ec25db4517f2aecff795cca2b1adbe07c32f5c571efc5c808cbcd": unexpected status: 400 Bad Request

Daemon logs:

time="2020-12-09T13:42:48Z" level=info msg="running server on /run/buildkit/buildkitd.sock"
time="2020-12-09T13:44:09Z" level=warning msg="reference for unknown type: application/vnd.buildkit.cacheconfig.v0"
time="2020-12-09T13:44:10Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = error writing manifest blob: failed commit on ref \"sha256:8eb142b16e0ec25db4517f2aecff795cca2b1adbe07c32f5c571efc5c808cbcd\": unexpected status: 400 Bad Request\n"
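
For anyone trying to reproduce this, here is a minimal sketch of how the cache flags above could be assembled for a buildctl invocation. The helper name `cache_flags` and the default `buildcache` tag are illustrative assumptions, not part of buildkit. The function only prints the flags; the build itself needs a running buildkitd and registry credentials.

```shell
# Hypothetical helper (not part of buildkit): prints the import/export cache
# flags for a registry-backed cache, using the parameters quoted above.
cache_flags() {
  repo="$1"                 # e.g. <account>.dkr.ecr.<region>.amazonaws.com/<name>
  tag="${2:-buildcache}"    # cache tag; "buildcache" is an assumed default
  printf '%s %s %s %s\n' \
    "--import-cache" "type=registry,ref=${repo}:${tag}" \
    "--export-cache" "type=registry,ref=${repo}:${tag},oci-mediatypes=true"
}

# Possible usage (left unquoted on purpose so the flags are word-split):
#   buildctl build --frontend dockerfile.v0 \
#     --local context=. --local dockerfile=. \
#     $(cache_flags "$REPO")
```

Against a private ECR repository this is the invocation that ends in the 400 Bad Request shown in the logs above.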

errm avatar Dec 09 '20 13:12 errm

Also seeing this for private repos, although it doesn't seem to be an issue with public ECR repos.

AlexLast avatar Jan 04 '21 23:01 AlexLast

Is there a timeframe for this feature request available? This could help to tremendously speed up CI builds.

n1ru4l avatar Feb 08 '21 07:02 n1ru4l

We have been experimenting with this buildkit feature for some time now and it works wonders. Currently, we are still dependent on Docker Hub, so having this functionality in private ECR would greatly benefit our CI/CD workflow.

jellevanhees avatar Mar 12 '21 09:03 jellevanhees

Any indication as to if/when this will ever be available? Using buildkit would really improve our CI build times.

davidfm avatar Mar 16 '21 09:03 davidfm

One year passed and still nothing 😔

devopsmash avatar May 16 '21 16:05 devopsmash

We'd really like to see support of this with ECR private repos 🙏

As of today, it still does not work:

error: failed to solve: rpc error: code = Unknown desc = error writing manifest blob: failed commit on ref "sha256:75f32e1bb4df7c6333dc352ea3ea9d04d1e04e4a14ba79b59daa019074166519": unexpected status: 400 Bad Request

pieterza avatar Jun 01 '21 06:06 pieterza

Yes please!

hf avatar Jun 13 '21 15:06 hf

Can we get any kind of communication on this?

abatilo avatar Aug 21 '21 19:08 abatilo

Is there any workaround available?

renannprado avatar Nov 02 '21 11:11 renannprado

For teams using GitHub that want to keep images in ECR, it is possible to use the cache manifest support in GitHub Container Registry (GHCR) for the build cache while pushing the image itself to ECR in the same build. When pushing to ECR, only new layers get pushed.

GitHub Actions workflow example:

jobs:

  docker_build:
    runs-on: ubuntu-latest  # assumed runner; the original omitted the required runs-on key
    strategy:
      matrix:
        name:
          - my-image
        include:
          - name: my-image
            registry_ecr: my-aws-account-id.dkr.ecr.us-east-1.amazonaws.com
            registry_ghcr: ghcr.io/my-github-org-name
            dockerfile: ./path/to/Dockerfile
            context: .
            extra_args: ''

    steps:
      - uses: actions/checkout@v2

      - name: Install Buildkit
        uses: docker/setup-buildx-action@v1
        id: buildx
        with:
          install: true

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
          role-skip-session-tagging: true
          role-duration-seconds: 1800
          role-session-name: GithubActionsBuildDockerImages

      - name: Login to Amazon ECR
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build & Push (ECR)
        # - https://docs.docker.com/engine/reference/commandline/buildx_build/
        # - https://github.com/moby/buildkit#export-cache
        run: |
          docker buildx build \
            --cache-from=type=registry,ref=${{ matrix.registry_ghcr }}/${{ matrix.name }}:cache \
            --cache-to=type=registry,ref=${{ matrix.registry_ghcr }}/${{ matrix.name }}:cache,mode=max \
            --push \
            ${{ matrix.extra_args }} \
            -f ${{ matrix.dockerfile }} \
            -t ${{ matrix.registry_ecr }}/${{ matrix.name }}:${{ github.sha }} \
            ${{ matrix.context }}

ynouri avatar Nov 02 '21 13:11 ynouri

@ynouri Just be careful of your storage costs in GHCR. It's oddly expensive. I found https://github.com/snok/container-retention-policy to help solve that use case for me.

abatilo avatar Nov 03 '21 15:11 abatilo

ECR still not supporting this is unbelievably amateurish, it doesn't suit AWS...

kgns avatar Nov 03 '21 21:11 kgns

Is there any workaround available?

Use another Docker registry. Dockerhub, or perhaps your own tiny EC2 with some fat storage. Sucks, but AWS doesn't seem interested.

pieterza avatar Nov 05 '21 06:11 pieterza

This seems to have started working unannounced, at least when using docker 20.10.11 to build

poldridge avatar Dec 16 '21 16:12 poldridge

This seems to have started working unannounced, at least when using docker 20.10.11 to build

I'm still seeing error writing manifest blob with 400 Bad Request on Docker 5:20.10.12~3-0~ubuntu-focal, at least in us-west-2.

ramosbugs avatar Dec 16 '21 23:12 ramosbugs

This seems to have started working unannounced, at least when using docker 20.10.11 to build

is this confirmed?

kgns avatar Dec 19 '21 01:12 kgns

This seems to have started working unannounced, at least when using docker 20.10.11 to build

is this confirmed?

I'm wondering the same thing.

Could you share some more info @poldridge ?

BeyondEvil avatar Dec 23 '21 13:12 BeyondEvil

I've just faced the same issue with Docker version 20.10.12, build e91ed57. Would appreciate any hints or workarounds.

eduard-malakhov avatar Dec 27 '21 20:12 eduard-malakhov

This seems to have started working unannounced, at least when using docker 20.10.11 to build

Did not work for me using docker:20.10.11-dind and ECR us-west-2.

sherifabdlnaby avatar Jan 09 '22 00:01 sherifabdlnaby

Can we get any kind of communication on this? Being able to use remote cache is a major benefit to all our build pipelines.

sherifabdlnaby avatar Jan 09 '22 00:01 sherifabdlnaby

I am also very interested in a field report on what progress has been made and which aspects of OCI layer caching are supported in ECR right now.

diclophis avatar Feb 09 '22 21:02 diclophis

Do we have any update on this? Can we get any kind of response from AWS?

ayk33 avatar Feb 17 '22 16:02 ayk33

Would also like to know when this will be available

pieterza avatar Feb 17 '22 16:02 pieterza

Just stumbled on this during our migration to AWS. This kind of sucks, as it breaks our pipeline logic... We would also be interested in an ETA for this feature.

erebe avatar Mar 16 '22 14:03 erebe

We gave up a while back and threw up our own (ALB + EC2 + S3) registry - setup was pretty quick. We finally got around to trying it and it appears to work great. We're still storing/pulling images in ECR for ECS; we only use our registry for the cache.

https://docs.docker.com/registry/deploying/

Still need to look into automatically cleaning up old images...
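
A rough sketch of what the registry container in such a setup might look like, for anyone wanting to try the same approach. This is not the exact configuration described above: the helper name `registry_cmd`, the container name, the port, the bucket, and the region are all placeholders, and it assumes the open-source `registry:2` image with its `REGISTRY_`-prefixed environment overrides for S3 storage. The function only prints the command, so it can be reviewed before running it on an EC2 instance that has IAM access to the bucket.

```shell
# Hypothetical helper: prints a `docker run` command for a registry:2 container
# that stores blobs in S3. Bucket and region are caller-supplied placeholders.
registry_cmd() {
  bucket="$1"
  region="$2"
  printf '%s %s %s %s %s\n' \
    "docker run -d -p 5000:5000 --name cache-registry" \
    "-e REGISTRY_STORAGE=s3" \
    "-e REGISTRY_STORAGE_S3_BUCKET=${bucket}" \
    "-e REGISTRY_STORAGE_S3_REGION=${region}" \
    "registry:2"
}

# Print the command for review; credentials are expected to come from the
# instance role or the environment rather than from this command line.
registry_cmd my-cache-bucket us-east-1
```

The cache ref in the build would then point at this registry, while image tags keep pointing at ECR.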

hlarsen avatar Mar 16 '22 14:03 hlarsen

Waiting on this very important feature request. It is blocking our migration to Graviton instances: multi-arch build caching does not work without it, so our builds take too long to complete.

chavan-suraj avatar May 06 '22 12:05 chavan-suraj

We have started looking into the technical approaches & feasibility to support this.

arunsollet avatar May 06 '22 20:05 arunsollet