grafana-ansible-collection
                                
                                
                                
                                    grafana-ansible-collection copied to clipboard
                            
                            
                            
                        Add a config check before restarting mimir
Hi
This modifies the mimir restart logic to ensure the configuration is valid before issuing the restart of the service. Flushing handlers to ensure it is restarted before validating the service is running/responsive.
Did you consider using validate:?
https://docs.ansible.com/ansible/latest/collections/ansible/builtin/template_module.html#parameter-validate
The validation command to run before copying the updated file into the final destination. A temporary file path is used to validate, passed in through ‘%s’ which must be present as in the examples below. Also, the command is passed securely so shell features such as expansion and pipes will not work. For an example on how to handle more complex validation than what this option provides, see handling complex validation.
Something like (have not tested this):
- name: Template Mimir config - /etc/mimir/config.yml
  ansible.builtin.template:
    src: "config.yml.j2"
    dest: "/etc/mimir/config.yml"
    owner: "mimir"
    group: "mimir"
    mode: "0644"
    validate: "mimir --config.file=%s"
  notify:
    - Restart mimir
                                    
                                    
                                    
                                
@hakong thats a new option to me! :D I'll try it out, but even so this will stop execution if the config file is wrong - I could have gone more complex into backing up the file before changing it but I just wanted to start simple.
You can back up with the backup: parameter :)
https://docs.ansible.com/ansible/latest/collections/ansible/builtin/template_module.html#parameter-backup
Create a backup file including the timestamp information so you can get the original file back if you somehow clobbered it incorrectly.
Choices: false ← (default) true
So all in all something like:
- name: Template Mimir config - /etc/mimir/config.yml
  ansible.builtin.template:
    src: "config.yml.j2"
    dest: "/etc/mimir/config.yml"
    owner: "mimir"
    group: "mimir"
    mode: "0644"
    validate: "mimir --config.file='%s'"
    backup: true
  notify:
    - Restart mimir
                                    
                                    
                                    
                                
cc @GVengelen @voidquark
Hello @panfantastic,
It would be great if you utilize validate in a manner similar to Loki validation.
- Loki example:
 
- name: Template Loki config - /etc/loki/config.yml
  ansible.builtin.template:
.....
    validate: "/usr/bin/loki --verify-config -config.file %s"
  notify: restart loki
Also, could you please use the Fully Qualified Collection Name (FQCN) for the meta module, like this:
- name: Flush handlers after deployment
ansible.builtin.meta: flush_handlers
Additionally, please move the flush handler above the task - name: Ensure that Mimir is started. This adjustment is crucial because the task ensuring that Mimir is started needs to be executed after the flush.
Thanks for the feed back! I really appreciate it!
There isn't a verify-config option as far as I can see for alloy so I'll assume you mean the same structure.
I"ll send an update to the PR shortly.
I've been trying to submit the change to my fork but Github Actions is failing for some odd reason. I'm pretty tired now so will call it a night and try and pick this back up tomorrow.
Regarding validation, I don't have the place to test it at the moment, but I believe you can simply follow the steps outlined in this documentation. Just make sure you're using %s parameter-validate.
I'm unsure why the restart feature has been removed, but each configuration template requires a trigger handler to restart Mimir once the configuration is changed.
In my honest opinion, having a backup for the template module is unnecessary. The role should template what is in the inventory and always reflect the state of the file. It's up to you to use Git to maintain proper inventory changes, and due to this, I believe the backup should be removed.
The alternative view is that you shouldn't leave the system in a broken state such that you can't restart the service - so a change that alters a config should be reversible - i.e. you make a broken change to a config, then the system should replace the good config back rather than break the config on the system. This has never been easy to accomplish in ansible - unless it is possible now.
The alternative view is that you shouldn't leave the system in a broken state such that you can't restart the service - so a change that alters a config should be reversible - i.e. you make a broken change to a config, then the system should replace the good config back rather than break the config on the system. This has never been easy to accomplish in ansible - unless it is possible now.
I believe everything will remain clean. If you make any configuration changes and validation fails, the real config file won't be replaced with the faulty configuration. A restart is only triggered by handler when a valid configuration is templated, ensuring that only when a valid configuration is in place will a restart occur.
I neglected to read the detail on %s - I think you are correct.
It looks like the github actions are now passing on my fork. I've not tested this against my test cluster yet, but if you are ok to send it for another stage of accepting the submission then lets try that.
In my opinion, the last thing that needs to be addressed is the modification of the validation line.
validate: "mimir -modules --config.file=%s"
I removed the -print.config part, considering it unnecessary. It's preferable to adhere to the documentation at https://grafana.com/docs/mimir/latest/configure/about-configurations/#validate-a-configuration.
Setting up mimir was very k8s heavy in suggestion - I don't use k8s so missed that section of the documentation.
I think it can be merged now.