
stopped docker_container with links fails to restart

Open · shk3bq4d opened this issue 1 year ago • 5 comments

SUMMARY

The docker_container module fails to force the recreation of an exited container when it has a linked container that was itself recreated.

ISSUE TYPE
  • Bug Report
COMPONENT NAME

docker_container

ANSIBLE VERSION
ansible [core 2.12.10]
  config file = /home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg
  configured module search path = ['/home/shk3bq4d/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/shk3bq4d/.virtualenvs/ansible/lib/python3.10/site-packages/ansible
  ansible collection location = /home/shk3bq4d/git/shk3bq4d/myproject/ans
  executable location = /home/shk3bq4d/.virtualenvs/ansible/bin/ansible
  python version = 3.10.6 (main, Nov  2 2022, 18:53:38) [GCC 11.3.0]
  jinja version = 3.0.3
  libyaml = True
COLLECTION VERSION
# /home/shk3bq4d/.virtualenvs/ansible/lib/python3.10/site-packages/ansible_collections
Collection       Version
---------------- -------
community.docker 2.6.0

# /home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible_collections
Collection       Version
---------------- -------
community.docker 3.2.1
CONFIGURATION
COLLECTIONS_PATHS(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = ['/home/shk3bq4d/git/shk3bq4d/myproject/ans']
DEFAULT_ACTION_PLUGIN_PATH(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = ['/home/shk3bq4d/git/shk3bq4d/myproject/ans/actions-external']
DEFAULT_FORKS(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = 25
DEFAULT_HOST_LIST(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = ['/home/shk3bq4d/git/shk3bq4d/myproject/ans/inventory.yml']
DEFAULT_LOAD_CALLBACK_PLUGINS(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = True
DEFAULT_ROLES_PATH(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = ['/home/shk3bq4d/git/shk3bq4d/myproject/ans/roles', '/home/shk3bq4d/git/shk3bq4d/myproject/ans/roles-external', '/home/shk3bq4d/git/myotherproject/myproject-master/ans/roles']
DEFAULT_STDOUT_CALLBACK(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = yaml
DEPRECATION_WARNINGS(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = False
INVENTORY_UNPARSED_IS_FAILED(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = True
USE_PERSISTENT_CONNECTIONS(/home/shk3bq4d/git/shk3bq4d/myproject/ans/ansible.cfg) = True
OS / ENVIRONMENT

Ubuntu 22.04.1 LTS

STEPS TO REPRODUCE

This example has a set of scenario-setup pre_tasks that lead to the desired initial state, just before the single task under tasks fails. It uses YAML anchors extensively to avoid typos and to highlight the differences between very similar tasks, but those anchors could easily be removed without changing the behavior.

- hosts: localhost
  gather_facts: no
  become: yes
  vars:

    container1: &container1
      name: container1
      image: busybox
      command: tail -f /dev/null

    container2: &container2
      name: container2
      image: busybox
      command: tail -f /dev/null
      links:
        - container1:container1

  pre_tasks: # setup initial conditions

    - name: CLEANUP - ensure container1 does not exist
      tags: cleanup
      docker_container:
        <<: *container1
        state: absent

    - name: CLEANUP - ensure container2 does not exist
      tags: cleanup
      docker_container:
        <<: *container2
        state: absent

    - name: SCENARIO SETUP - start container1
      tags: setup
      docker_container: *container1

    - name: SCENARIO SETUP - start container2
      tags: setup
      docker_container: *container2

    - name: SCENARIO SETUP - stop container2
      tags: setup
      docker_container:
        <<: *container2
        state: stopped

    - name: SCENARIO SETUP - recreate container1
      tags: setup
      docker_container:
        <<: *container1
        recreate: yes

  tasks: # the part that fails, given the initial conditions

    - name: POSSIBLE BUG - start container2
      tags: scenario
      docker_container: *container2
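
Saved as example.yml, the play can be run with something like the following (assuming privilege escalation for become is configured on the control host):

ansible-playbook example.yml
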
EXPECTED RESULTS

container2 is started (recreated if necessary) without error.

ACTUAL RESULTS
TASK [POSSIBLE BUG - start container2] *****************************************
task path: /home/shk3bq4d/git/shk3bq4d/myproject/ans/example.yml:55
redirecting (type: modules) ansible.builtin.docker_container to community.docker.docker_container
The full traceback is:
  File "/tmp/ansible_docker_container_payload_0o_iocku/ansible_docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/module_container/module.py", line 718, in container_start
    self.engine_driver.start_container(self.client, container_id)
  File "/tmp/ansible_docker_container_payload_0o_iocku/ansible_docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/module_container/docker_api.py", line 270, in start_container
    client.post_json('/containers/{0}/start', container_id)
  File "/tmp/ansible_docker_container_payload_0o_iocku/ansible_docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/api/client.py", line 554, in post_json
    self._raise_for_status(self._post_json(self._url(pathfmt, *args, versioned_api=True), data, **kwargs))
  File "/tmp/ansible_docker_container_payload_0o_iocku/ansible_docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/api/client.py", line 261, in _raise_for_status
    raise_from(create_api_error_from_http_exception(e), e)
  File "/tmp/ansible_docker_container_payload_0o_iocku/ansible_docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/errors.py", line 45, in create_api_error_from_http_exception
    raise_from(cls(e, response=response, explanation=explanation), e)
  File "<string>", line 3, in raise_from
fatal: [mymachine]: FAILED! => changed=false
  invocation:
    module_args:
      api_version: auto
      auto_remove: null
      blkio_weight: null
      ca_cert: null
      cap_drop: null
      capabilities: null
      cgroup_parent: null
      cgroupns_mode: null
      cleanup: false
      client_cert: null
      client_key: null
      command: tail -f /dev/null
      command_handling: correct
      comparisons: null
      container_default_behavior: no_defaults
      cpu_period: null
      cpu_quota: null
      cpu_shares: null
      cpus: null
      cpuset_cpus: null
      cpuset_mems: null
      debug: false
      default_host_ip: null
      detach: null
      device_read_bps: null
      device_read_iops: null
      device_requests: null
      device_write_bps: null
      device_write_iops: null
      devices: null
      dns_opts: null
      dns_search_domains: null
      dns_servers: null
      docker_host: unix://var/run/docker.sock
      domainname: null
      entrypoint: null
      env: null
      env_file: null
      etc_hosts: null
      exposed_ports: null
      force_kill: false
      groups: null
      healthcheck: null
      hostname: null
      ignore_image: false
      image: busybox
      image_comparison: desired-image
      image_label_mismatch: ignore
      image_name_mismatch: ignore
      init: null
      interactive: null
      ipc_mode: null
      keep_volumes: true
      kernel_memory: null
      kill_signal: null
      labels: null
      links:
      - container1:container1
      log_driver: null
      log_options: null
      mac_address: null
      memory: null
      memory_reservation: null
      memory_swap: null
      memory_swappiness: null
      mounts: null
      name: container2
      network_mode: null
      networks: null
      networks_cli_compatible: true
      oom_killer: null
      oom_score_adj: null
      output_logs: false
      paused: null
      pid_mode: null
      pids_limit: null
      platform: null
      privileged: null
      publish_all_ports: null
      published_ports: null
      pull: false
      purge_networks: false
      read_only: null
      recreate: false
      removal_wait_timeout: null
      restart: false
      restart_policy: null
      restart_retries: null
      runtime: null
      security_opts: null
      shm_size: null
      ssl_version: null
      state: started
      stop_signal: null
      stop_timeout: null
      storage_opts: null
      sysctls: null
      timeout: 60
      tls: false
      tls_hostname: null
      tmpfs: null
      tty: null
      ulimits: null
      use_ssh_client: false
      user: null
      userns_mode: null
      uts: null
      validate_certs: false
      volume_driver: null
      volumes: null
      volumes_from: null
      working_dir: null
  msg: 'Error starting container 2acd583d6d103c07453cdbaf796f117df347669b9f23c81c5f2e27f74a5c875b: 500 Server Error for http+docker://localhost/v1.41/containers/2acd583d6d103c07453cdbaf796f117df347669b9f23c81c5f2e27f74a5c875b/start: Internal Server Error ("Cannot link to a non running container: /container1 AS /container2/container1")'

PLAY RECAP *********************************************************************
  mymachine               : ok=6    changed=6    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0
DESIRED RESULT

In my opinion, and in keeping with idempotency, the module should be robust and clever enough to recreate the container.

shk3bq4d · Nov 12 '22 15:11

Here is my personal workaround for readers who run into this, in case this bug report isn't accepted or fixed. Thank you, everyone, for your amazing work!

- name: try / catch
  block:
  - name: create container
    docker_container: &myargs
      name: container2
      image: busybox
      command: tail -f /dev/null
      links:
        - container1:container1
  rescue:
  - name: recreate container with restart=yes
    docker_container:
      <<: *myargs
      restart: yes

shk3bq4d · Nov 12 '22 15:11

I'm not sure there is anything we can really do (without doing some random black magic that will likely break other users), since this restriction comes from the Docker daemon.

felixfontein · Nov 12 '22 16:11

One possibility would be to query the state of the linked containers and detect that they no longer exist?

It sure seems a bit heavy, but wouldn't that do the trick? Something along the lines of the sketch below.
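
Just to illustrate at the playbook level what I mean (a rough sketch, not the module internals; it assumes both containers already exist and treats the link as stale when container1 was created more recently than container2):

- name: inspect both containers
  docker_container_info:
    name: "{{ item }}"
  loop:
    - container1
    - container2
  register: info

- name: start container2, recreating it when the link target is newer
  docker_container:
    name: container2
    image: busybox
    command: tail -f /dev/null
    links:
      - container1:container1
    # Created is an ISO 8601 timestamp, so string comparison is enough
    recreate: "{{ info.results[0].container.Created > info.results[1].container.Created }}"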

shk3bq4d · Nov 12 '22 16:11

But... why should the module do that? It will generally slow the module down for all users if it needs to query every single container that's mentioned in some option. And what should it do if that's the case, except failing with another error message?

felixfontein · Nov 12 '22 16:11

But... why should the module do that?

I will definitely trust your judgement on this one, but I am still trying my luck at sharing my opinion, which is: because it is within the implied contract of the module.

The declarative intended state is "run this docker container with this name and with those parameters". The fact that a past container exists with the same name and similar, yet "corrupted", linked-container parameters should, I believe, not be a concern for the module user, as (in my scenario) an updated, valid reference for the linked container does exist on the system.

If performance is the concern, then the specific error returned by the Docker daemon could be caught, and the module could call itself again with the recreate: yes option, in the same way the workaround from my first comment does (albeit without checking what the error was). This would also limit the complexity of the fix by following the "it's better to ask forgiveness than permission" philosophy rather than having the module do extra work up front. See the sketch below.
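
At the playbook level it would look something like this (a sketch; compared to my earlier workaround, only the check on the error message is new, with the wording of the match taken from the failure above):

- name: try / catch, but only for the stale-link error
  block:
    - name: create container
      docker_container: &myargs
        name: container2
        image: busybox
        command: tail -f /dev/null
        links:
          - container1:container1
      register: result
  rescue:
    - name: re-raise anything that is not the stale-link error
      fail:
        msg: "{{ result.msg }}"
      when: "'Cannot link to a non running container' not in result.msg | default('')"

    - name: recreate container
      docker_container:
        <<: *myargs
        recreate: yes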

And what should it do if that's the case, except failing with another error message?

In the scenario that I presented in the bug report, there is a possible non-failing outcome. I created it based on a "real life" deployment that failed for me and that I had hoped the module would handle natively.

I do agree that not every use case can be solved successfully, but my intention is to help by showing a possible course of improvement for the module that would benefit users trapped in this, admittedly, very specific scenario.

shk3bq4d · Nov 12 '22 16:11