ansible-minio

Returns Success even if Minio Service Didn't start Successfully

Open katia-e opened this issue 5 years ago • 4 comments

For example, if an odd number of shards is provided in the MinIO configuration, the playbook installs successfully but the MinIO service fails to start. In this case I'd expect the playbook to report the error from systemctl. Currently it reports success even if MinIO failed to start.

katia-e avatar May 22 '19 09:05 katia-e

Hi @katia-e, do you have any example of variables you used to configure the role to try to reproduce the issue?

atosatto avatar May 23 '19 08:05 atosatto

For example I set up the following config.yaml:

- name: "Install Minio"
  hosts: servers
  roles:
    - atosatto.minio
  vars:
    minio_server_datadirs:
      - "/mnt/vdb"
      - "/mnt/vdc"
      - "/mnt/vdd"

This config cannot succeed because minio_server_datadirs has only 3 entries (it should have 4+), so I expect MinIO to install correctly but fail to start. As a result I see no failed tasks:

PLAY RECAP *************************************************************************************************************
195.134.212.39             : ok=13   changed=1    unreachable=0    failed=0    skipped=3    rescued=0    ignored=0

But the MinIO service failed to start on the remote machine. With this issue I was wondering whether it's possible to return an error if MinIO fails to start. Otherwise Ansible reports success even though the deployment actually failed.

katia-e avatar May 24 '19 15:05 katia-e

I discovered this by accident when trying to change users while the permissions were wrong.

One problem I found is that in some cases the playbook may return OK because this check

https://github.com/atosatto/ansible-minio/blob/15ad3a3604f658dbbe30e593d4c148983af60320/tasks/install-server.yml#L86-L90

does not actually fail. This may happen because systemd takes a few seconds to report minio as failed, since minio itself may take some time before it really fails.

I'm not sure how to implement a check for this, but one very simple way would be to wait at least a few seconds (certainly no less than 5) and then check at the end of the playbook whether the service is still running.
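A minimal sketch of that idea, assuming the systemd unit is named minio (as in the systemctl commands elsewhere in this thread) and that a 5-second pause is long enough. Both are assumptions, not something the role currently does:

```yaml
# Hypothetical tasks to append at the end of the play.
- name: Give MinIO a few seconds to either settle or fail
  pause:
    seconds: 5

- name: Verify that the minio service is still active
  # "systemctl is-active" exits non-zero when the unit is not active,
  # so the command module marks this task as failed in that case.
  command: systemctl is-active minio
  changed_when: false
```

Because the check relies only on the exit code, it needs no parsing of the systemctl output.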

Note about molecule tests & check if services are running

These types of checks are complicated to test with Docker and may require much larger Docker images that include systemd or an equivalent. If the current Docker images already do not work with these checks, not enabling them when running inside containers could still be valid, since on a real VPS they are likely to work fine.
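One way this could look, as a sketch only: guard the check with the ansible_virtualization_type fact, which Ansible normally reports as "docker" inside a Docker container. This assumes facts are gathered and that the unit is named minio. Other container runtimes report different values and would need their own condition.

```yaml
# Hypothetical task: run the service check only outside Docker containers.
- name: Verify that the minio service is active (skipped inside Docker)
  command: systemctl is-active minio
  changed_when: false
  # default("") keeps the condition valid if the fact is missing.
  when: ansible_virtualization_type | default("") != "docker"
```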

fititnt avatar Jan 23 '20 11:01 fititnt

I think this is a general problem with Ansible and/or maybe systemd. In a wrapping role you can do a wait_for for the service to start up, and check based on that.
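A rough sketch of the wait_for approach, assuming MinIO listens on its default port 9000 on the managed host. The port, delay and timeout values here are assumptions and should be adjusted to the actual configuration:

```yaml
# Hypothetical task for a wrapping role or a post_tasks section.
- name: Wait for MinIO to accept connections
  wait_for:
    port: 9000     # assumed MinIO listen port
    delay: 5       # wait a little before the first probe
    timeout: 60    # the task fails if the port is not open by then
```

If the service never comes up, the task fails once the timeout expires, which makes the play report the error.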

Otherwise, I've created a play to do this (I run 4 nodes, so adjust the expected output for your setup):

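- hosts: minio
  gather_facts: no
  tasks:
    - name: Check systemd minio status
      command: systemctl status minio
      register: minio_status
      changed_when: false
      # select() is lazy, so build a list before testing for emptiness;
      # "search" matches anywhere in a line, "match" only at its start.
      failed_when: minio_status.stdout_lines | select("search", "4 Online, 0 Offline") | list | length == 0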

till avatar Jan 30 '20 15:01 till