"Agent is not responding" with Windows + Docker

Open smx-smx opened this issue 3 years ago • 3 comments

Hello, I'm writing on behalf of the Reko open-source project (as a contributor).

We're having problems with the Windows build agent, which runs on top of a Docker image on the community cluster. It worked for a while, but then started failing and is currently unusable.

Expected Behavior

The container should start and produce some output

Real Behavior

The container is terminated after a 15-minute wait with the error message "Agent is not responding!": https://cirrus-ci.com/task/4597324635701248

A workaround is to force a rebuild of the Docker image, but the problem recurs after a few days, as if the Docker image were being deleted. Changing the VM configuration (cores/memory) didn't help.

Related Info

This is a (tick one of the following):

  • [ ] Website issue
    • Link to page:
  • [x] Task issue

Thanks in advance

smx-smx avatar Dec 04 '21 19:12 smx-smx

In this particular case, the cause of the error is a timeout while pulling the Docker image. Windows images tend to be very large, especially the cmake one that you use as the base image.

Right now the tagging of the images is not consistent at all, which makes it hard to optimize for cacheability in cases like yours. I'm attempting to change that in cirruslabs/docker-images-windows#27 and cirruslabs/vm-images#13. Once we tag cirruslabs/docker-images-windows with 2021.12, you'll be able to reference it in your Dockerfile. This will improve and, most importantly, persist cacheability for your tasks, so they will start faster and more consistently.

fkorotkov avatar Dec 06 '21 19:12 fkorotkov

Thanks for the analysis and progress on this, as well as the great service you're providing with CirrusCI.

smx-smx avatar Dec 07 '21 16:12 smx-smx

I tried the new tagged image, but the push still took 30 minutes and resulted in a 19 GB image: https://console.cloud.google.com/gcr/images/cirrus-ci-community/GLOBAL/uxmal/reko/ci/windows/dockerfile

I wonder if I'll still run into the timeout in the future (the last time I rebuilt the image, it only worked for a short period of time).
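
As a rough sanity check on that concern, the sketch below estimates the average download rate needed to pull an image before a timeout expires. It assumes the whole image has to be transferred inside the window with no cached layers, uses decimal units, and ignores layer extraction time. The helper name `required_throughput_mb_s` is made up for illustration, and whether the 15-minute limit covers only the pull is also an assumption.

```python
# Rough estimate of the sustained throughput needed to pull an image in time.
# Assumptions (not confirmed in this thread): the full image is transferred
# inside the timeout window, no layers are cached, and 1 GB = 1000 MB.


def required_throughput_mb_s(image_size_gb: float, timeout_minutes: float) -> float:
    """Return the minimum average download rate in MB/s (hypothetical helper)."""
    if image_size_gb < 0 or timeout_minutes <= 0:
        raise ValueError("size must be non-negative and timeout must be positive")
    size_mb = image_size_gb * 1000   # decimal gigabytes to megabytes
    timeout_s = timeout_minutes * 60  # minutes to seconds
    return size_mb / timeout_s


if __name__ == "__main__":
    # 19 GB image and 15-minute wait, the figures reported in this issue.
    rate = required_throughput_mb_s(19, 15)
    print(f"{rate:.1f} MB/s")  # prints "21.1 MB/s"
```

Under these assumptions an uncached pull needs a little over 21 MB/s on average, so it could plausibly hit the limit whenever the image is no longer cached on the host.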

smx-smx avatar Dec 08 '21 22:12 smx-smx

Closing because it's been good so far and the issue hasn't occurred anymore. Thanks

smx-smx avatar Oct 08 '22 17:10 smx-smx