ACR streaming: failed to open remote file as tar file error
Describe the bug I'm evaluating the ACR streaming preview and hit a problem where my container cannot start when streaming is enabled.
kubelet is repeatedly logging errors like this:
Normal Pulling 8s (x2 over 41s) kubelet Pulling image "xxx.azurecr.io/test-alexp-jupyter-datascience-notebook:latest"
Warning Failed 8s kubelet Error: failed to create containerd container: failed to attach and mount for snapshot 218: failed to enable target for /sys/kernel/config/target/core/user_999999999/dev_218, failed:failed to open remote file as tar file /https://xxx.azurecr.io/v2/test-alexp-jupyter-datascience-notebook/blobs/sha256:f16ce562223807a933f8040b1c3ce2a617377e7f160826980d7f8c6fcc84bb2f: No such file or directory: unknown
It's interesting that there is a slash in front of "http" for the docker image url.
To Reproduce Steps to reproduce the behavior:
- I followed the instructions from https://medium.com/@rammadasu5/how-to-enable-artifact-streaming-on-your-aks-node-pools-to-stream-artifacts-from-acr-and-reduce-64bc22ba9788 using an existing ACR registry and AKS node pool.
- I created a new deployment from an ACR copy of the public jupyter-datascience-notebook image, with streaming enabled on it.
- The container fails to start with CreateContainerError and the error message above.
Expected behavior Container should start
Any relevant environment information
kubectl version
Client Version: v1.30.1
Server Version: v1.28.9
AKS cluster was deployed a few days ago and is on the latest version for the control plane and node pool.
AKS node info:
System Info:
Machine ID: 229240f927f1457daabe410ed4f53257
System UUID: 3f61de23-34b4-4744-83a1-182c5ce28e9d
Boot ID: bdc6a038-823b-4186-80d0-b44b37a0ec47
Kernel Version: 5.15.0-1064-azure
OS Image: Ubuntu 22.04.4 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.7.15-1
Kubelet Version: v1.28.9
If any information is a concern to post here, you can create a support ticket or send an email to [email protected].
Thanks for reporting this. I'm taking a look on our side.
Regarding the observation that there is a slash in front of "http" in the docker image url: I did some digging, and that appears to be normal.
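When following up on an error like the one above, it can help to state exactly which repository and layer blob the snapshotter failed on. The following is a minimal sketch, not an official tool: the helper name extract_blob_url is hypothetical, and the message text is copied from the kubelet event in this issue.

```shell
# Hypothetical helper: print the first blob URL found in an error message.
# It matches https://.../blobs/sha256:<64 hex digits>, so the leading slash
# that precedes "https" in the error text is not included.
extract_blob_url() {
  printf '%s\n' "$1" | grep -Eo 'https://[^ ]+/blobs/sha256:[0-9a-f]{64}' | head -n 1
}

# Error text as reported by kubelet in this issue.
msg='failed to open remote file as tar file /https://xxx.azurecr.io/v2/test-alexp-jupyter-datascience-notebook/blobs/sha256:f16ce562223807a933f8040b1c3ce2a617377e7f160826980d7f8c6fcc84bb2f: No such file or directory: unknown'

url="$(extract_blob_url "$msg")"

# Everything after ".../blobs/" is the layer digest.
digest="${url##*/blobs/}"

# Strip the "https://<registry>/v2/" prefix and the "/blobs/..." suffix
# to leave the repository name.
repo="${url#https://*/v2/}"
repo="${repo%/blobs/*}"

printf 'repository: %s\ndigest: %s\n' "$repo" "$digest"
```

Note that the digest in this error identifies a single layer blob, not the image manifest, so it narrows down the failing layer rather than pinning the whole image.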
Hello @alexp-openai, I tried to repro this earlier today but was unable to get the same error. I tried both the latest jupyter/datascience-notebook from Docker Hub and the latest from quay.io, and neither reproduced it. Looking at the ACR logs, I do notice that the image I converted is not identical to the one you converted, so there may be something further there. Would you be able to provide a more specific image (a fixed tag or digest) that you know fails, so we can verify with it? We are committed to making sure the service is reliable for all workloads.
As a side note, we are in the process of rolling out a new version of the underlying service responsible for conversion, so I would suggest trying again in the next couple of days as that rolls out, to see whether any of the fixes there affect your scenario. Beyond that, I will continue trying to reproduce this and understand what may have gone wrong.
So, this was the first public image that I tried. I pulled it from the public Docker Hub last week. I can try another one a bit later. There may also be something wrong with my cluster setup. This AKS cluster was set up last week as well, so the versions should be recent.
If you have some suggestions on how to troubleshoot it further, feel free to share.
@alexp-openai If you debug the node with kubectl debug nodes/<node-name> -it --image bash (you'll need to run chroot /host once it connects), there are some logs you can collect:
- overlaybd logs:
  - /var/log/overlaybd.log
  - /var/log/overlaybd-audit.log
- overlaybd snapshotter logs: journalctl -u overlaybd-snapshotter
- containerd logs: journalctl -u containerd
- dmesg
- acr-mirror logs: journalctl -u acr-mirror
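The collection steps above could be bundled into one small script, to be run on the node after chroot /host. This is only a sketch under some assumptions: the function name collect_logs, the OUT_DIR variable, the default output directory, and the output file names are all made up for illustration. The log paths and unit names are the ones listed above.

```shell
# Sketch: gather the logs listed above into a single directory.
# Intended to run as root on the node (inside the debug pod, after chroot /host).
collect_logs() {
  out_dir="$1"
  mkdir -p "$out_dir" || return 1

  # overlaybd log files; skip any that are not present on this node.
  for f in /var/log/overlaybd.log /var/log/overlaybd-audit.log; do
    if [ -r "$f" ]; then
      cp "$f" "$out_dir/"
    fi
  done

  # Journal output for each systemd unit. Failures are written into the
  # output file instead of aborting the collection.
  for unit in overlaybd-snapshotter containerd acr-mirror; do
    journalctl -u "$unit" --no-pager > "$out_dir/$unit.journal.log" 2>&1 || true
  done

  # Kernel ring buffer.
  dmesg > "$out_dir/dmesg.log" 2>&1 || true
}

# OUT_DIR is an assumed override; the default path is arbitrary.
collect_logs "${OUT_DIR:-/tmp/acr-streaming-logs}" || echo "log collection failed" >&2
```

The resulting directory can then be archived and attached to a support ticket. Review the files first, since node logs may contain registry names or other details you do not want to post publicly.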
Hi @alexp-openai, just wanted to check in. Are you still encountering the issue? Is there any more info you would like us to take a look at? It might be best to follow up with a support ticket.
Closing since the issue has been open for three weeks with no further input. Please let us know if we can provide further assistance in a support ticket https://azure.microsoft.com/en-us/support/create-ticket/