grafana-ansible-collection
BUG: Download grafana agent archive to local folder in case of different arch
I have two hosts in my inventory. One machine is amd64 and the other is arm64.
When I run ansible-playbook on my PC, it works fine:
TASK [grafana.grafana.grafana_agent : Create Grafana Agent temp directory] ****************************************************************************************************************************************
ok: [mon-vm -> localhost]
TASK [grafana.grafana.grafana_agent : Download Grafana Agent archive to local folder] *****************************************************************************************************************************
changed: [mon-vm -> localhost]
changed: [dev-be1 -> localhost]
TASK [grafana.grafana.grafana_agent : Extract grafana-agent.zip] **************************************************************************************************************************************************
.fcst....?? grafana-agent-linux-arm64
changed: [mon-vm -> localhost]
.fcst....?? grafana-agent-linux-amd64
changed: [dev-be1 -> localhost]
TASK [grafana.grafana.grafana_agent : Set local path] *************************************************************************************************************************************************************
ok: [mon-vm]
ok: [dev-be1]
TASK [grafana.grafana.grafana_agent : Propagate downloaded binary] ************************************************************************************************************************************************
ok: [mon-vm]
diff skipped: destination file appears to be binary
diff skipped: source file size is greater than 104448
changed: [dev-be1]
The same playbook in a GitLab CI/CD pipeline, however, does not repeat the archive download and fetches only one binary (arm64 in the log below), so the amd64 host fails:
TASK [grafana.grafana.grafana_agent : Create Grafana Agent temp directory] *****
--- before
+++ after
@@ -1,5 +1,5 @@
{
- "mode": "0755",
+ "mode": "0751",
"path": "/tmp/grafana-agent",
- "state": "absent"
+ "state": "directory"
}
changed: [ssxmon-vm -> localhost]
TASK [grafana.grafana.grafana_agent : Download Grafana Agent archive to local folder] ***
changed: [ssxmon-vm -> localhost]
TASK [grafana.grafana.grafana_agent : Extract grafana-agent.zip] ***************
>f++++++.?? grafana-agent-linux-arm64
changed: [ssxmon-vm -> localhost]
TASK [grafana.grafana.grafana_agent : Set local path] **************************
ok: [ssxmon-vm]
ok: [ssxdev-be1]
TASK [grafana.grafana.grafana_agent : Propagate downloaded binary] *************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
fatal: [ssxdev-be1]: FAILED! => {"changed": false, "msg": "Could not find or access '/tmp/grafana-agent/grafana-agent-linux-amd64' on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src option"}
ok: [ssxmon-vm]
Looking at the role task:
- name: Download Grafana Agent archive to local folder
  become: false
  ansible.builtin.get_url:
    url: "{{ _grafana_agent_download_url }}"
    dest: "{{ grafana_agent_local_tmp_dir }}/grafana-agent_{{ _grafana_agent_cpu_arch }}_{{ grafana_agent_version }}.zip"
    mode: 0664
  register: _download_archive
  until: _download_archive is succeeded
  retries: 5
  delay: 2
  delegate_to: localhost
  check_mode: false
  run_once: true
It has the option "run_once: true". Now I'm confused about why the download was repeated in my local environment, while the pipeline honored the run_once parameter.
Anyway, I think run_once should not be here, or the problem should be solved in some other way. On the other hand, run_once is handy when I run the playbook against a large number of VMs.
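One possible direction, sketched here only as an illustration and not as the role's actual fix: drop run_once from the download task and serialize it with the task keyword throttle instead. Each host would then request the archive for its own architecture, one host at a time on the controller. As far as I understand get_url, with its default force: false and a file path as dest it only downloads when the destination does not exist yet, so hosts sharing an architecture should not download twice. All variable names are the ones from the task quoted above; only throttle is new.

```yaml
# Sketch only, under the assumptions stated above.
- name: Download Grafana Agent archive to local folder
  become: false
  ansible.builtin.get_url:
    url: "{{ _grafana_agent_download_url }}"
    dest: "{{ grafana_agent_local_tmp_dir }}/grafana-agent_{{ _grafana_agent_cpu_arch }}_{{ grafana_agent_version }}.zip"
    mode: "0664"
  register: _download_archive
  until: _download_archive is succeeded
  retries: 5
  delay: 2
  delegate_to: localhost
  check_mode: false
  # No run_once: every host asks for its own architecture's archive.
  # throttle: 1 runs the delegated task for one host at a time, so two hosts
  # never write the same file on the controller concurrently.
  throttle: 1
```

The extract task that follows in the role also carries run_once, so it would presumably need the same treatment. The trade-off is that the task is evaluated once per host rather than once per play, which is slower on large inventories even when most runs find the file already present.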
Did you find any workaround for this? I'm getting the same issue when running the role on a group of hosts that includes both arm64 and amd64 architectures.
Hey @devmittal02, I haven't checked it out, as we are building a new role for Grafana Agent in flow mode (the recommended way now), so I can probably test this on that role.
If you want to double check, we have a PR open, so I can include any changes you want in it right now.
My "workaround" is to put the arm and amd VMs in different groups and run two pipelines with an inventory limit (-l).
This seems a very weird issue. @davordbetter, any thoughts on why this is failing specifically on GitLab?
@devmittal02 What platform are you running the playbook on?
Hey, I think the issue is caused by this run_once. I am running on AWX against an entire fleet of EC2 machines; it spins up an on-demand container and triggers the playbook across the machines using SSM.
What happens is this: say the first machine is AMD, so only the binary for that architecture is downloaded and stored locally. When an ARM machine comes next, the download step is skipped because of "run once", and only the previously downloaded AMD variant of the binary is available to copy. Hence the "file doesn't exist" error, as it is the wrong binary.
- name: Download Grafana Agent binary to controller (localhost)
  block:
    - name: Create Grafana Agent temp directory
      become: false
      ansible.builtin.file:
        path: "{{ grafana_agent_local_tmp_dir }}"
        state: directory
        mode: 0751
      delegate_to: localhost
      check_mode: false
      run_once: true
    - name: Download Grafana Agent archive to local folder
      become: false
      ansible.builtin.get_url:
        url: "{{ _grafana_agent_download_url }}"
        dest: "{{ grafana_agent_local_tmp_dir }}/grafana-agent_{{ _grafana_agent_cpu_arch }}_{{ grafana_agent_version }}.zip"
        mode: 0664
      register: _download_archive
      until: _download_archive is succeeded
      retries: 5
      delay: 2
      delegate_to: localhost
      check_mode: false
      run_once: true
    - name: Extract grafana-agent.zip
      become: false
      ansible.builtin.unarchive:
        src: "{{ grafana_agent_local_tmp_dir }}/grafana-agent_{{ _grafana_agent_cpu_arch }}_{{ grafana_agent_version }}.zip"
        dest: "{{ grafana_agent_local_tmp_dir }}"
        remote_src: false
      delegate_to: localhost
      run_once: true
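Since the run_once tasks above only ever see the first host's architecture, a mixed play could at least be made to fail early with a clear message instead of failing later at the copy step. The following is a minimal sketch of such a guard, assuming facts have been gathered for every host in the play; it is not part of the role, and the task name and message are made up for illustration.

```yaml
# Hypothetical pre-flight check, e.g. in pre_tasks of the playbook that calls the role.
- name: Fail early when the play mixes CPU architectures
  ansible.builtin.assert:
    that:
      # Collect ansible_facts.architecture of every host in the play
      # and require exactly one distinct value.
      - >-
        ansible_play_hosts
        | map('extract', hostvars, ['ansible_facts', 'architecture'])
        | unique | list | length == 1
    fail_msg: >-
      Hosts with different CPU architectures are in the same play;
      the run_once download would only fetch the archive for the first host.
  run_once: true
```

This does not fix anything by itself; it only turns the confusing "Could not find or access" error into an explicit one, so the hosts can be split before the role runs.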
@ishanjainn I can't figure out why the same Docker image with the roles downloads both binaries when run on my PC, while the GitLab pipeline downloads only one (which is correct according to the role's run_once).
The only difference is that my PC is an M2 MacBook (emulated amd64 Docker image), while the GitLab runner is an amd64 Ubuntu Linux VM.
The issue is indeed that the task has "run_once". It downloads the zip according to the facts of the first host; if that host has a different CPU architecture than the others, that causes the issue described.
Until this gets fixed, the simplest workaround would be to separate the hosts by CPU architecture in the playbook that executes the role.
Something like this:
inventory/hosts
[amd64_hosts]
example.host.tld
[arm64_hosts]
arm.host.tld
playbook.grafana_agent.yml
---
- name: Grafana agent on amd64 hosts
  hosts: amd64_hosts
  roles:
    - role: grafana.grafana.grafana_agent
- name: Grafana agent on arm64 hosts
  hosts: arm64_hosts
  roles:
    - role: grafana.grafana.grafana_agent
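If maintaining separate inventory groups by hand is inconvenient, the same split can probably be derived from gathered facts with ansible.builtin.group_by. This is a sketch, not something tested against this role: the arch_ group prefix is an arbitrary name chosen here, and the group names assume that ansible_facts['architecture'] reports x86_64 and aarch64 on the hosts, which is typical for Linux but worth checking.

```yaml
---
# Sketch: build per-architecture groups at runtime, then run the role per group.
- name: Group hosts by CPU architecture
  hosts: all
  gather_facts: true
  tasks:
    - name: Create arch_<architecture> groups from gathered facts
      ansible.builtin.group_by:
        key: "arch_{{ ansible_facts['architecture'] }}"

- name: Grafana agent on amd64 hosts
  hosts: arch_x86_64
  roles:
    - role: grafana.grafana.grafana_agent

- name: Grafana agent on arm64 hosts
  hosts: arch_aarch64
  roles:
    - role: grafana.grafana.grafana_agent
```

Because each play then contains a single architecture, the role's run_once download fetches the right archive for every host in that play.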
Based on the message in the Grafana Agent documentation:
Grafana Alloy is the new name for our distribution of the OTel collector. Grafana Agent has been deprecated and is in Long-Term Support (LTS) through October 31, 2025. Grafana Agent will reach an End-of-Life (EOL) on November 1, 2025. Read more about why we recommend migrating to Grafana Alloy.
I believe this can be closed, and migration to Alloy is required. @ishanjainn, what are your thoughts?
This needs to be reopened. It would be really nice to have this solved, and I don't think it should be a big issue to fix. Migration to Alloy will take some time; meanwhile we need to support the existing environment.