terraform-provider-docker
Flaky `Error response from daemon: Conflict, cannot remove the default link name of the container` on `terraform destroy`
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Terraform (and docker Provider) Version
Tested with Terraform versions v1.8.0, v1.7.0, v1.7.5, v1.6.6, v1.5.7
Docker provider at version v3.0.2
Docker server at versions v26.0.0, v24.0.5, v25.0.3
containerd at versions v1.7.13, v1.7.14, v1.7.15
These are just the versions I tried. I was unable to find a version combination that reliably makes this error disappear.
Affected Resource(s)
- docker_container
Terraform Configuration Files
terraform {
  required_providers {
    docker = {
      source  = "kreuzwerker/docker"
      version = "3.0.2"
    }
  }
}

provider "docker" {
  host = "unix:///var/run/docker.sock"
}

resource "docker_image" "test_image" {
  name         = "alpine:latest"
  keep_locally = true
}

resource "docker_container" "test_container" {
  name  = "test-container"
  image = docker_image.test_image.image_id
  rm    = true
  command = [
    "/bin/sh",
    "-c",
    "while sleep 3600; do :; done",
  ]
}
Debug Output
You can find the debug output from a failing run using the above Terraform config here: https://gist.github.com/daniel-weisse/ba8e9e1757c5de14808a7e3a550ed556 The output was generated using the following:
TF_LOG=DEBUG terraform apply -auto-approve
sleep 10
TF_LOG=DEBUG terraform destroy -auto-approve
Expected Behaviour
The Docker Terraform provider correctly terminates the container without errors every time it is called.
Actual Behaviour
Rarely, the Docker Terraform provider throws an error when running terraform destroy:
Error deleting container <container-id>: Error response from daemon: Conflict, cannot remove the default link name of the container
Steps to Reproduce
Assuming the Terraform config above is saved locally to main.tf, run the following script:
#!/bin/bash
# Stop at the first failing command so the error is easy to spot.
set -e
terraform init
# Run up to 300 apply/destroy cycles.
for i in {1..300}
do
  terraform apply -auto-approve
  sleep 10
  terraform destroy -auto-approve
done
Since this bug seems very flaky, you may not see any failures, or it might fail at the very first iteration.
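Because the script above exits on the first error, it cannot tell you how often the failure occurs. A rough way to measure that is to keep looping and count the destroy runs whose output contains the conflict message. The following is only a sketch: TF_CMD, SLEEP_SECS and count_link_conflicts are names made up for this example, not Terraform features, and it assumes that a later terraform apply recovers on its own after a failed destroy, which I have not verified.

```shell
#!/bin/bash
# Sketch: count how often `terraform destroy` fails with the link-name conflict.
# TF_CMD, SLEEP_SECS and count_link_conflicts are illustrative names only.
TF_CMD="${TF_CMD:-terraform}"
SLEEP_SECS="${SLEEP_SECS:-10}"

# Usage: count_link_conflicts <number-of-cycles>
# The body runs in a subshell, so its variables do not leak into the caller.
count_link_conflicts() (
  runs="$1"
  conflicts=0
  other=0
  i=1
  while [ "$i" -le "$runs" ]; do
    i=$((i + 1))
    # A failing apply is counted as an unrelated error and the cycle is skipped.
    if ! "$TF_CMD" apply -auto-approve >/dev/null 2>&1; then
      other=$((other + 1))
      continue
    fi
    sleep "$SLEEP_SECS"
    # Capture destroy output so the error text can be classified.
    if ! output="$("$TF_CMD" destroy -auto-approve 2>&1)"; then
      case "$output" in
        *"cannot remove the default link name"*) conflicts=$((conflicts + 1)) ;;
        *) other=$((other + 1)) ;;
      esac
    fi
  done
  echo "runs=$runs link_conflicts=$conflicts other_errors=$other"
)
```

After running terraform init and sourcing this file, calling count_link_conflicts 300 prints one summary line, which makes it easier to compare failure rates between hosts.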
You can view a minimal Terraform configuration to create a VM in Azure with Ubuntu 22.04 here: https://gist.github.com/daniel-weisse/b44388adbb7f22e79e2964804d12b333 I used this to reproduce the error:
- Run terraform apply
- Get the ssh key:
  terraform output -raw ssh_private_key > ssh.pem
  chmod 0600 ssh.pem
- Copy the reproduction Terraform config and test script to the VM:
  scp -i ssh.pem main.tf test.sh adminuser@$(terraform output -raw public_ip):/home/adminuser
- Connect to the VM and try to reproduce the error:
  ssh -i ssh.pem adminuser@$(terraform output -raw public_ip) ./test.sh
Important Factoids
I most commonly experienced this issue when running on Ubuntu 22.04 in an Azure VM. I also reproduced it locally running Arch Linux. I was unable to reproduce it locally on Fedora 39 in over 300 runs.
I would like to stress again that this issue is very flaky. Sometimes I could reproduce it in 1 of 5 runs; other times 300 runs didn't produce a single error.