Deploying a Flow with Docker Desktop in Europe raises BuildError: failed to export image: NotFound: content digest
Bug summary
Creating a deployment with a Docker image on macOS with Docker Desktop fails with the following error, but only if you are located in Europe:
File "/Users/zidar/programming/app/.venv/lib/python3.12/site-packages/prefect/utilities/asyncutils.py", line 399, in coroutine_wrapper
return run_coro_as_sync(ctx_call())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/zidar/programming/app/.venv/lib/python3.12/site-packages/prefect/utilities/asyncutils.py", line 243, in run_coro_as_sync
return call.result()
^^^^^^^^^^^^^
File "/Users/zidar/programming/app/.venv/lib/python3.12/site-packages/prefect/_internal/concurrency/calls.py", line 312, in result
return self.future.result(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/zidar/programming/app/.venv/lib/python3.12/site-packages/prefect/_internal/concurrency/calls.py", line 182, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/Users/zidar/.asdf/installs/python/3.12.1/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/Users/zidar/programming/app/.venv/lib/python3.12/site-packages/prefect/_internal/concurrency/calls.py", line 383, in _run_async
result = await coro
^^^^^^^^^^
File "/Users/zidar/programming/app/.venv/lib/python3.12/site-packages/prefect/utilities/asyncutils.py", line 225, in coroutine_wrapper
return await task
^^^^^^^^^^
File "/Users/zidar/programming/app/.venv/lib/python3.12/site-packages/prefect/utilities/asyncutils.py", line 389, in ctx_call
result = await async_fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/zidar/programming/app/.venv/lib/python3.12/site-packages/prefect/deployments/runner.py", line 925, in deploy
image.build()
File "/Users/zidar/programming/app/.venv/lib/python3.12/site-packages/prefect/docker/docker_image.py", line 73, in build
build_image(**build_kwargs)
File "/Users/zidar/.asdf/installs/python/3.12.1/lib/python3.12/contextlib.py", line 81, in inner
return func(*args, **kwds)
^^^^^^^^^^^^^^^^^^^
File "/Users/zidar/programming/app/.venv/lib/python3.12/site-packages/prefect/utilities/dockerutils.py", line 194, in build_image
raise BuildError(event["error"])
prefect.utilities.dockerutils.BuildError: failed to export image: NotFound: content digest sha256:7fb66093b170bccb413f3e1c8f4b92fa440ea68fc4cddccf4c3b47e2673cfb9c: not found
If you change your location (with a VPN) to the US, the issue does not reproduce. If you use OrbStack instead of Docker Desktop, the issue also does not reproduce.
We figured this out because coworkers in the US had no trouble creating the deployment, while those in the EU consistently get the error.
Example deployment code:
job.deploy(
    work_pool_name=work_pool_name,
    image=DockerImage(
        name=docker_image_name,
        platform="linux/amd64",
        dockerfile="Dockerfile",
        target=target,
    ),
)
Building the Dockerfile manually doesn't raise this error.
Version info
Version: 3.1.0
API version: 0.8.4
Python version: 3.12.6
Git commit: a83ba39b
Built: Thu, Oct 31, 2024 12:43 PM
OS/Arch: darwin/arm64
Profile: local
Server type: server
Pydantic version: 2.9.2
Integrations:
prefect-docker: 0.6.1
Additional context
This issue has also popped up in the Prefect Community Slack: https://prefect-community.slack.com/archives/CL09KU1K7/p1730205889746789
Just out of curiosity, does this also occur when explicitly passing the registry name in the image?
docker.io/your_username/image:tag
I haven't tried docker.io, but the issue reproduces with AWS ECR, no matter how I specify the registry name in the image.
The issue even reproduces if neither ECR nor docker.io is configured and I'm building and using the image locally without pushing it to a remote repository.
We found a workaround for the issue, but haven't pinpointed the exact culprit yet.
If one removes platform="linux/amd64", like so:
job.deploy(
    work_pool_name=work_pool_name,
    image=DockerImage(
        name=docker_image_name,
        # platform="linux/amd64",
        dockerfile="Dockerfile",
        target=target,
    ),
)
and runs the deployment, then adds platform="linux/amd64" back in, the deployment goes through successfully on the second attempt. This works even if the first deployment fails (in our case it failed only because of an unrelated issue with a Go package).
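The two-step workaround above could be wrapped in a small helper so nobody has to edit the deployment code by hand. This is only a sketch: `deploy_with_platform_workaround` and its `deploy` callback are hypothetical names, not part of Prefect, and it assumes the first (platform-less) run is allowed to fail.

```python
from typing import Callable, Optional


def deploy_with_platform_workaround(
    deploy: Callable[[Optional[str]], None],
    platform: str = "linux/amd64",
) -> None:
    """Run one deployment without `platform`, then the real one with it.

    `deploy` is a hypothetical callback that receives the platform string
    (or None) and performs the actual deployment.
    """
    try:
        # First pass: no explicit platform. Its outcome does not matter,
        # it only has to run before the second pass.
        deploy(None)
    except Exception as exc:
        print(f"warm-up deploy failed (ignored): {exc}")

    # Second pass: the deployment that is actually wanted.
    deploy(platform)
```

With the code from this thread, `deploy` would be a function that calls `job.deploy(...)` and only passes `platform=` to `DockerImage` when the argument is not None.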
ATM I can't reproduce the issue so I can't gather more data, but the solution worked for @anze3db today when he again stumbled upon it. When I was debugging the problem, the only thing that stood out and might be related was:
$ tail -f ~/Library/Containers/com.docker.docker/Data/log/vm/dockerd.log
...
time="2025-02-07T23:41:30.272689258Z" level=warning msg="failed to determine platform specific size" digest="sha256:6365712bd66a08e836f2308a17f0fef28f3358bc0249fd6e87fdc4ee7cb000f7" error="NotFound: content digest sha256:6365712bd66a08e836f2308a17f0fef28f3358bc0249fd6e87fdc4ee7cb000f7: not found" image="docker.io/prefecthq/prefect:3.0.11-python3.12" isPseudo=false manifest="{application/vnd.docker.distribution.manifest.v2+json sha256:6365712bd66a08e836f2308a17f0fef28f3358bc0249fd6e87fdc4ee7cb000f7 3256 [] map[] [] 0x4001d2f0e0 }"
...
So maybe something with platform isn't propagated and built correctly? But that's just speculation 🤷
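For anyone else trying to gather data, a rough sketch for pulling these warnings out of dockerd.log without reading it by eye. It assumes the log lines are key=value pairs shaped like the one quoted above; `parse_logfmt` and `missing_digests` are made-up helper names.

```python
import re
from typing import Dict, Iterable, Iterator, Tuple

# key=value pairs, where the value is either double-quoted or a bare token.
_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_logfmt(line: str) -> Dict[str, str]:
    """Split one key=value log line into a dict, stripping outer quotes."""
    fields: Dict[str, str] = {}
    for key, raw in _PAIR.findall(line):
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            raw = raw[1:-1]
        fields[key] = raw
    return fields


def missing_digests(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (image, digest) for every line whose error is a NotFound digest."""
    for line in lines:
        fields = parse_logfmt(line)
        if "NotFound: content digest" in fields.get("error", ""):
            yield fields.get("image", ""), fields.get("digest", "")
```

Feeding it the open log file, e.g. `for image, digest in missing_digests(open(path)): print(image, digest)`, lists which base images the daemon could not resolve.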
Just an update that we've spent some more time debugging this with Docker employees and we've opened an issue about this on their Docker Desktop for Mac repo: https://github.com/docker/for-mac/issues/7607
One interesting thing we found is that this issue doesn't reproduce if you remove the labels parameter from the API call, i.e. by removing this line:
https://github.com/PrefectHQ/prefect/blob/99d94359bf5ea0f2f8a61e22c49fb25f8d7c7e33/src/prefect/utilities/dockerutils.py#L173
@teocns I'm not sure if removing labels from images will break anything, but it would resolve this particular issue. Ideally though, Prefect should be using BuildKit to build images, but that's probably more work on your side because BuildKit isn't supported by docker-py.
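To compare the two cases outside of Prefect, something like the following docker-py sketch might help. It is not Prefect's actual build code: `build_kwargs` and `try_build` are hypothetical helpers, and the label you pass in is whatever you want to test with (I'm not asserting which labels Prefect sets).

```python
from typing import Any, Dict, Optional


def build_kwargs(
    context: str,
    tag: str,
    platform: Optional[str] = "linux/amd64",
    labels: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Assemble arguments for docker-py's low-level APIClient.build()."""
    kwargs: Dict[str, Any] = {"path": context, "tag": tag, "rm": True, "decode": True}
    if platform:
        kwargs["platform"] = platform
    if labels:
        # Only sent when given, so the with/without cases can be compared.
        kwargs["labels"] = labels
    return kwargs


def try_build(**kwargs: Any) -> Optional[str]:
    """Run the build; return the first error message, or None on success."""
    import docker  # docker-py, imported lazily so build_kwargs works without it

    client = docker.from_env()
    for event in client.api.build(**kwargs):
        if "error" in event:
            return event["error"]
    return None
```

Calling `try_build(**build_kwargs(".", "repro:test", labels={"example.label": "1"}))` and then the same without `labels` should show whether the labels parameter alone makes the difference on an affected machine.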
This appears to have been closed upstream.