stopped docker_container with links fails to restart
SUMMARY
The docker_container module fails to start an exited container when one of its linked containers has itself been recreated; it errors out instead of forcing a recreation of the exited container to repair the stale link.
ISSUE TYPE
- Bug Report
COMPONENT NAME
docker_container
ANSIBLE VERSION
ansible [core 2.12.10]
config file = /home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg
configured module search path = ['/home/shk3bq4d/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /home/shk3bq4d/.virtualenvs/ansible/lib/python3.10/site-packages/ansible
ansible collection location = /home/shk3bq4d/git/shk3bq4d/myproject/ans
executable location = /home/shk3bq4d/.virtualenvs/ansible/bin/ansible
python version = 3.10.6 (main, Nov 2 2022, 18:53:38) [GCC 11.3.0]
jinja version = 3.0.3
libyaml = True
COLLECTION VERSION
# /home/shk3bq4d/.virtualenvs/ansible/lib/python3.10/site-packages/ansible_collections
Collection Version
---------------- -------
community.docker 2.6.0
# /home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible_collections
Collection Version
---------------- -------
community.docker 3.2.1
CONFIGURATION
COLLECTIONS_PATHS(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = ['/home/shk3bq4d/git/shk3bq4d/myproject/ans']
DEFAULT_ACTION_PLUGIN_PATH(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = ['/home/shk3bq4d/git/shk3bq4d/myproject/ans/actions-external']
DEFAULT_FORKS(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = 25
DEFAULT_HOST_LIST(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = ['/home/shk3bq4d/git/shk3bq4d/myproject/ans/inventory.yml']
DEFAULT_LOAD_CALLBACK_PLUGINS(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = True
DEFAULT_ROLES_PATH(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = ['/home/shk3bq4d/git/shk3bq4d/myproject/ans/roles', '/home/shk3bq4d/git/shk3bq4d/myproject/ans/roles-external', '/home/shk3bq4d/git/myotherproject/myproject-master/ans/roles']
DEFAULT_STDOUT_CALLBACK(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = yaml
DEPRECATION_WARNINGS(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = False
INVENTORY_UNPARSED_IS_FAILED(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = True
USE_PERSISTENT_CONNECTIONS(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = True
OS / ENVIRONMENT
Ubuntu 22.04.1 LTS
STEPS TO REPRODUCE
This example uses a series of scenario-setup pre_tasks to reach the desired initial state, after which the single task under tasks fails. It uses YAML anchors extensively to rule out typos and to highlight the differences among very similar tasks; the anchors could easily be removed without changing the behavior.
- hosts: localhost
  gather_facts: no
  become: yes
  vars:
    container1: &container1
      name: container1
      image: busybox
      command: tail -f /dev/null
    container2: &container2
      name: container2
      image: busybox
      command: tail -f /dev/null
      links:
        - container1:container1
  pre_tasks: # setup initial conditions
    - name: CLEANUP - ensure container1 does not exist
      tags: cleanup
      docker_container:
        <<: *container1
        state: absent
    - name: CLEANUP - ensure container2 does not exist
      tags: cleanup
      docker_container:
        <<: *container2
        state: absent
    - name: SCENARIO SETUP - start container1
      tags: setup
      docker_container: *container1
    - name: SCENARIO SETUP - start container2
      tags: setup
      docker_container: *container2
    - name: SCENARIO SETUP - stop container2
      tags: setup
      docker_container:
        <<: *container2
        state: stopped
    - name: SCENARIO SETUP - recreate container1
      tags: setup
      docker_container:
        <<: *container1
        recreate: yes
  tasks: # the part that fails, given the initial conditions
    - name: POSSIBLE BUG - start container2
      tags: scenario
      docker_container: *container2
EXPECTED RESULTS
The task succeeds and container2 ends up running, with the module recreating it if that is what it takes to repair the stale link to the new container1.
ACTUAL RESULTS
TASK [POSSIBLE BUG - start container2] *****************************************
task path: /home/shk3bq4d/git/shk3bq4d/myproject/ans/example.yml:55
redirecting (type: modules) ansible.builtin.docker_container to community.docker.docker_container
The full traceback is:
File "/tmp/ansible_docker_container_payload_0o_iocku/ansible_docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/module_container/module.py", line 718, in container_start
self.engine_driver.start_container(self.client, container_id)
File "/tmp/ansible_docker_container_payload_0o_iocku/ansible_docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/module_container/docker_api.py", line 270, in start_container
client.post_json('/containers/{0}/start', container_id)
File "/tmp/ansible_docker_container_payload_0o_iocku/ansible_docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/api/client.py", line 554, in post_json
self._raise_for_status(self._post_json(self._url(pathfmt, *args, versioned_api=True), data, **kwargs))
File "/tmp/ansible_docker_container_payload_0o_iocku/ansible_docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/api/client.py", line 261, in _raise_for_status
raise_from(create_api_error_from_http_exception(e), e)
File "/tmp/ansible_docker_container_payload_0o_iocku/ansible_docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/errors.py", line 45, in create_api_error_from_http_exception
raise_from(cls(e, response=response, explanation=explanation), e)
File "<string>", line 3, in raise_from
fatal: [mymachine]: FAILED! => changed=false
invocation:
module_args:
api_version: auto
auto_remove: null
blkio_weight: null
ca_cert: null
cap_drop: null
capabilities: null
cgroup_parent: null
cgroupns_mode: null
cleanup: false
client_cert: null
client_key: null
command: tail -f /dev/null
command_handling: correct
comparisons: null
container_default_behavior: no_defaults
cpu_period: null
cpu_quota: null
cpu_shares: null
cpus: null
cpuset_cpus: null
cpuset_mems: null
debug: false
default_host_ip: null
detach: null
device_read_bps: null
device_read_iops: null
device_requests: null
device_write_bps: null
device_write_iops: null
devices: null
dns_opts: null
dns_search_domains: null
dns_servers: null
docker_host: unix://var/run/docker.sock
domainname: null
entrypoint: null
env: null
env_file: null
etc_hosts: null
exposed_ports: null
force_kill: false
groups: null
healthcheck: null
hostname: null
ignore_image: false
image: busybox
image_comparison: desired-image
image_label_mismatch: ignore
image_name_mismatch: ignore
init: null
interactive: null
ipc_mode: null
keep_volumes: true
kernel_memory: null
kill_signal: null
labels: null
links:
- container1:container1
log_driver: null
log_options: null
mac_address: null
memory: null
memory_reservation: null
memory_swap: null
memory_swappiness: null
mounts: null
name: container2
network_mode: null
networks: null
networks_cli_compatible: true
oom_killer: null
oom_score_adj: null
output_logs: false
paused: null
pid_mode: null
pids_limit: null
platform: null
privileged: null
publish_all_ports: null
published_ports: null
pull: false
purge_networks: false
read_only: null
recreate: false
removal_wait_timeout: null
restart: false
restart_policy: null
restart_retries: null
runtime: null
security_opts: null
shm_size: null
ssl_version: null
state: started
stop_signal: null
stop_timeout: null
storage_opts: null
sysctls: null
timeout: 60
tls: false
tls_hostname: null
tmpfs: null
tty: null
ulimits: null
use_ssh_client: false
user: null
userns_mode: null
uts: null
validate_certs: false
volume_driver: null
volumes: null
volumes_from: null
working_dir: null
msg: 'Error starting container 2acd583d6d103c07453cdbaf796f117df347669b9f23c81c5f2e27f74a5c875b: 500 Server Error for http+docker://localhost/v1.41/containers/2acd583d6d103c07453cdbaf796f117df347669b9f23c81c5f2e27f74a5c875b/start: Internal Server Error ("Cannot link to a non running container: /container1 AS /container2/container1")'
PLAY RECAP *********************************************************************
mymachine : ok=6 changed=6 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
DESIRED RESULT
In my opinion, and in keeping with idempotency, the module should be robust and clever enough to recreate the container.
Here is my personal workaround for possible readers, in case this bug report isn't accepted or fixed. Thank you everyone for your amazing work!
- name: try / catch
  block:
    - name: creates container
      docker_container: &myargs
        name: container2
        image: busybox
        command: tail -f /dev/null
        links:
          - container1:container1
  rescue:
    - name: recreates container with restart=yes
      docker_container:
        <<: *myargs
        restart: yes
I'm not sure there is anything we can really do (without doing some random black magic that will likely break other users), since this restriction comes from the Docker daemon.
One possibility is to query the state of the linked containers and detect that they no longer exist? It sure seems a bit heavy, but wouldn't that do the trick?
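For illustration, here is roughly what that check could look like at the playbook level (untested sketch using community.docker.docker_container_info; the staleness heuristic, comparing creation timestamps, is my own assumption, not anything the module does today):

- name: inspect the link target and the dependent container
  community.docker.docker_container_info:
    name: "{{ item }}"
  loop:
    - container1
    - container2
  register: info
- name: start container2, recreating it when its recorded link looks stale
  docker_container:
    name: container2
    image: busybox
    command: tail -f /dev/null
    links:
      - container1:container1
    # Heuristic, assuming both containers currently exist: if container1 was
    # (re)created after container2, then the link stored in container2 points
    # at a container that no longer exists, and only a recreate can repair it.
    # (docker inspect's Created is an RFC 3339 UTC timestamp, so a plain
    # string comparison is enough here.)
    recreate: "{{ info.results[0].container.Created > info.results[1].container.Created }}"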
But... why should the module do that? It will generally slow the module down for all users if it needs to query every single container that's mentioned in some option. And what should it do if that's the case, except failing with another error message?
> But... why should the module do that?
I will definitely trust your judgement on this one, but I am still trying my luck to share my opinion, which is: because it is within the implied contract of the module.
The declarative intended state is "run this Docker container with this name and with those parameters". The fact that there is a pre-existing container with the same name and similar, yet "corrupted", linked-container parameters should, I believe, not be a concern to the module user, as (in my scenario) an updated valid reference for the linked container does exist on the system.
If performance is the concern, then the specific error returned by the Docker daemon could be caught, and the module could call itself again with the recreate: yes option, in the same way that the workaround from my first comment does (although that workaround does not check what the error was). This would also limit the complexity of the fix by following the "it's easier to ask forgiveness than permission" philosophy rather than having the module do extra work up front.
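At the playbook level, that would look something like the following (untested sketch; it refines the workaround from my first comment by only retrying on the daemon's stale-link error text, which I assume stays stable across Docker versions):

- name: try / catch, retrying only on the stale-link error
  block:
    - name: creates container
      docker_container: &myargs2
        name: container2
        image: busybox
        command: tail -f /dev/null
        links:
          - container1:container1
  rescue:
    - name: re-raise anything that is not the stale-link error
      fail:
        msg: "{{ ansible_failed_result.msg }}"
      when: "'Cannot link to a non running container' not in ansible_failed_result.msg | default('')"
    - name: recreates container to repair the stale link
      docker_container:
        <<: *myargs2
        recreate: yes

The rescue re-raises unrelated failures, so only the daemon's stale-link error triggers a recreation.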
> And what should it do if that's the case, except failing with another error message?
In the scenario that I presented in the bug report, there is a possible non-failing outcome. I created it based on a "real life" deployment that failed for me and that I had hoped the module would handle natively.
I do agree that not every use case can be successfully solved, but my intention is to help by showing a possible course of improvement for the module, one that would benefit users trapped in this, I agree, very specific scenario.