Docker build fails to push when switching Docker Hub accounts
Summary
If you change Docker Hub accounts in deploy.yml after a successful deploy, kamal deploy fails.
Running kamal build remove and then re-attempting kamal deploy solves the issue.
Details
Repro
- Have these settings in deploy.yml, with KAMAL_REGISTRY_PASSWORD set using .kamal/secrets:

    image: account1/myrepo
    ...
    registry:
      username: account1
      password:
        - KAMAL_REGISTRY_PASSWORD

- Deploy successfully
- Change deploy.yml to use account2, and update .kamal/secrets accordingly:

    image: account2/myrepo
    ...
    registry:
      username: account2
      password:
        - KAMAL_REGISTRY_PASSWORD

- Deploy
Expected: Successful deploy using the new Docker Hub account
Actual: Docker push fails with insufficient_scope: authorization failed
You have to run kamal build remove (which runs docker buildx rm kamal-local-docker-container) to fix the problem.
More info
See discussion here with other people running into this.
Example logs: (with account & repo names swapped out)
...
INFO [9df05177] Running docker buildx build --push --platform linux/amd64 --builder kamal-local-docker-container -t account2/myrepo:7f676021e2ef67889b478b8f615f4cb73960eab6 -t account2/myrepo:latest --label service="myrepo" --file Dockerfile . as jordan@localhost
DEBUG [9df05177] Command: docker buildx build --push --platform linux/amd64 --builder kamal-local-docker-container -t account2/myrepo:7f676021e2ef67889b478b8f615f4cb73960eab6 -t account2/myrepo:latest --label service="myrepo" --file Dockerfile .
...
#20 [stage-2 3/3] RUN groupadd --system --gid 1000 rails && useradd rails --uid 1000 --gid 1000 --create-home --shell /bin/bash && chown -R rails:rails db log storage tmp
#20 CACHED
#21 [auth] account2/myrepo:pull,push token for registry-1.docker.io
#21 DONE 0.0s
#22 exporting to image
#22 exporting layers done
#22 exporting manifest sha256:5e9e4ac8999501811b49b40f0f8ca995be8178fb7fc18e977d804b15e45c53ce done
#22 exporting config sha256:62a8bd977a1b96f9eec7cf8725d4be3b303b6634f97cde837a654bc6eff19b06 done
#22 exporting attestation manifest sha256:ca779fd9c370e391eaffc53df25c4de605409db649014ce7ddf8d3e21c273a60 done
#22 exporting manifest list sha256:2c4079e386869f8360eaf703586b8bcb79d365790d328230d1188a237a521c33 done
#22 pushing layers
#22 ...
#23 [auth] account2/myrepo:pull,push account1/myrepo:pull token for registry-1.docker.io
#23 DONE 0.0s
#22 exporting to image
#22 ...
#24 [auth] account2/myrepo:pull,push account1/myrepo:pull token for registry-1.docker.io
#24 DONE 0.0s
#22 exporting to image
#22 pushing layers 0.9s done
#22 ERROR: failed to push account2/myrepo:7f676021e2ef67889b478b8f615f4cb73960eab6: server message: insufficient_scope: authorization failed
👆 note the presence of account1 even though deploy.yml has been fully updated to account2.
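To spot this symptom without reading the whole log, the `[auth]` lines can be scanned for repositories other than the one being pushed. The following is a minimal sketch, assuming the log format shown above; `foreign_auth_repos` and `AUTH_LINE` are hypothetical names for illustration, not part of Kamal or buildx.

```ruby
# Hypothetical helper, not part of Kamal: list repositories that appear in
# BuildKit "[auth]" lines but do not match the image we expect to push.
AUTH_LINE = /\[auth\]\s+(?<scopes>.+?)\s+token for\s+(?<registry>\S+)/

def foreign_auth_repos(log, expected_repo)
  repos = log.each_line.flat_map do |line|
    match = AUTH_LINE.match(line)
    next [] unless match

    # Each scope looks like "account2/myrepo:pull,push"; keep the repo part.
    match[:scopes].split.map { |scope| scope.rpartition(":").first }
  end
  repos.uniq.reject { |repo| repo == expected_repo }
end

log = <<~LOG
  #21 [auth] account2/myrepo:pull,push token for registry-1.docker.io
  #23 [auth] account2/myrepo:pull,push account1/myrepo:pull token for registry-1.docker.io
LOG

puts foreign_auth_repos(log, "account2/myrepo").inspect
```

A non-empty result means the builder is still requesting a token scope for a repository from the old account, which matches the failure described here.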
Running kamal build remove and then retrying kamal deploy fixes the problem.
This is also happening to me with the DigitalOcean registry, but it's worse, since it involves two completely different projects. It tries to log into the other project's registry account, even on its first build.
Hmm, maybe we need to create per-account builders here.
or simply pass an option to add a namespace to the builder?
    builder:
      namespace: projectname
There is no such attribute called namespace. Where did you find it?
This is a real problem. I cannot easily switch between projects and deploy them. I must either use a remote build (I don't; my servers don't have spare resources for builds) or delete the whole buildx container along with the app cache, so my next build starts from scratch.
@alhafoudh I agree, it's a real issue. I don't know how I can help here. @djmb how do you think we can solve this issue?
Naming/prefixing the buildx container would solve the issue.
Quick fix for people having this issue:
Put this at the top of deploy.yml:
<%
  # Reopen Kamal's local builder so this app gets its own buildx builder.
  class Kamal::Commands::Builder::Local < Kamal::Commands::Builder::Base
    private

    def builder_name
      "kamal-local-your-app-#{driver}"
    end
  end
%>
service: your-app
...
I've not been able to reproduce this - maybe this is caused by something in the Dockerfile that causes the credentials to end up on the builder cache? The report here looks very similar to https://github.com/docker/buildx/issues/2364#issuecomment-2615504063, though I guess there are multiple ways this could go wrong if stuff is cached in the build container.
For anyone having this issue, does running docker buildx prune --builder kamal-local-docker-container also fix the issue?
We could include the registry and user in the builder name. That would avoid duplicate builders if you had multiple projects using the same credentials.
It would mean orphaned builders getting left behind if you change the registry/user without removing the old one first, but I think that's unavoidable. Could maybe add an -all parameter to kamal build remove that removes all kamal builders?
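As a rough illustration of that idea, here is a minimal sketch of deriving a builder name from the registry, user, and driver, plus picking out Kamal-created builders for a bulk removal. Everything in it is an assumption for illustration: `scoped_builder_name`, `kamal_builders`, the `docker.io` default, and the character sanitizing are hypothetical and not Kamal's implementation.

```ruby
# Hypothetical sketch, not Kamal's actual implementation.
KAMAL_BUILDER_PREFIX = "kamal-local"

# Derive a builder name from the registry server, username, and buildx driver.
# The "docker.io" default is an assumption for configs that name no registry.
def scoped_builder_name(driver:, username:, registry: "docker.io")
  # Assumption: conservatively replace anything outside [a-zA-Z0-9_.-] with a
  # dash so the result is safe to use as a builder name.
  parts = [registry, username, driver].map do |part|
    part.to_s.gsub(/[^a-zA-Z0-9_.-]/, "-")
  end
  ([KAMAL_BUILDER_PREFIX] + parts).join("-")
end

# Given a list of builder names, keep only those that look Kamal-created,
# e.g. as candidates for a "remove all" operation.
def kamal_builders(names)
  names.select { |name| name.start_with?("#{KAMAL_BUILDER_PREFIX}-") }
end

puts scoped_builder_name(driver: "docker-container", username: "account1")
puts scoped_builder_name(driver: "docker-container", username: "account2")
puts kamal_builders(["default", "kamal-local-docker-container"]).inspect
```

With this scheme, two projects that share a registry and user map to the same builder, while switching accounts yields a new name. The old builder stays behind until it is removed, which is the orphaning trade-off mentioned above.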
Ran into the same issue; pruning the builder did help me. So I use the workaround from @alhafoudh, just slipping in the app name to be safe. I think having one builder per app is generally a good idea to separate all concerns. It would be great to make this the default behavior.