
grafana.grafana.alloy role installs the Alloy binary under /etc/alloy/, which does not work on RHEL-based systems due to SELinux

Open · hakong opened this issue 2 months ago · 9 comments

Binaries should live in standard binary directories on RHEL-based systems, so that SELinux allows them to do binary-like things, like connecting to the internet 😄

The alloy binary gets an SELinux label like unconfined_u:object_r:etc_t:s0 when placed under /etc/alloy, and a service running from that context is not allowed to open TCP sockets:

type=AVC msg=audit(1714732141.534:5214): avc:  denied  { name_connect } for  pid=60115 comm="alloy-linux-amd" dest=443 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:http_port_t:s0 tclass=tcp_socket permissive=0
type=AVC msg=audit(1714732146.536:5215): avc:  denied  { name_connect } for  pid=60115 comm="alloy-linux-amd" dest=443 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:http_port_t:s0 tclass=tcp_socket permissive=0
type=AVC msg=audit(1714732146.537:5216): avc:  denied  { name_connect } for  pid=60115 comm="alloy-linux-amd" dest=443 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:http_port_t:s0 tclass=tcp_socket permissive=0
type=AVC msg=audit(1714732151.538:5217): avc:  denied  { name_connect } for  pid=60115 comm="alloy-linux-amd" dest=443 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:http_port_t:s0 tclass=tcp_socket permissive=0
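
(For anyone wanting to confirm the label on their own hosts, a minimal check via Ansible; the binary path is an assumption based on the comm= field in the AVC records above, so adjust it to wherever the role actually puts the binary:)

- name: Check the SELinux context of the Alloy binary
  ansible.builtin.stat:
    path: /etc/alloy/alloy-linux-amd64  # assumed path, inferred from comm="alloy-linux-amd" above
  register: alloy_binary

- name: Show the context
  ansible.builtin.debug:
    msg: "{{ alloy_binary.stat.secontext | default('SELinux context not available') }}"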

The RPM package installs alloy under /usr/bin/, which is correct.

The grafana.grafana.alloy role should use the package manager to install alloy. That would solve this issue:

[root@container-1 ~]# rpm -qlp alloy-1.0.0-1.amd64.rpm
/etc/alloy/config.alloy
/etc/sysconfig/alloy
/usr/bin/alloy
/usr/lib/systemd/system/alloy.service

hakong · May 03 '24 10:05

Can you PR it up, @hakong?

gardar · May 03 '24 13:05

I can help with this, @hakong @gardar, as I'm looking to deploy Alloy into production RHEL environments in the coming weeks using this Ansible role, so this will be an issue for me as well.

Alternatively, or in addition, an SELinux policy could be created that allows the existing binary deployment method as an interim measure.
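
For example, a rough sketch of that interim relabel (the target path is an assumption, per the comm= field in the AVC records above; bin_t is the type the binary would get under /usr/bin anyway):

- name: Persistently label the Alloy binary as bin_t (interim workaround)
  community.general.sefcontext:
    target: /etc/alloy/alloy-linux-amd64  # assumed install path; adjust to match the role
    setype: bin_t
    state: present

- name: Apply the file context
  ansible.builtin.command: restorecon -v /etc/alloy/alloy-linux-amd64
  register: restorecon_out
  changed_when: "'Relabeled' in restorecon_out.stdout"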

panfantastic · May 10 '24 18:05

@panfantastic If you want to quick-and-dirty this as an interim solution, rather than messing with SELinux you can just use something like the tasks below. It's what I used and it worked fine: the RPM package installs Alloy in a standard bin directory, and that works with SELinux out of the box.

Note: at least on Debian, I had an issue where Alloy would not start because the /var/lib/alloy directory and /etc/default/alloy did not exist. As far as I remember, the DEB package did not create them; the RPM package worked fine. (One way to pre-create them is sketched after the tasks below.)

- name: Configure Grafana YUM repository
  ansible.builtin.copy:
    dest: /etc/yum.repos.d/grafana.repo
    owner: root
    group: root
    mode: '0644'
    content: |
      [grafana]
      name=grafana
      baseurl=https://rpm.grafana.com
      repo_gpgcheck=1
      enabled=1
      gpgcheck=1
      gpgkey=https://rpm.grafana.com/gpg.key
      sslverify=1
      sslcacert=/etc/pki/tls/certs/ca-bundle.crt

- name: Install Alloy
  ansible.builtin.package:
    name: alloy
    state: present

- name: Configure Alloy
  vars:
    prometheus_push_endpoint: "https://prometheus-prod-24-prod-eu-west-2.grafana.net/api/prom/push" # Update with your Prometheus endpoint
    loki_endpoint: "https://logs-prod-012.grafana.net/loki/api/v1/push" # Update with your Loki endpoint
    prometheus_username: "x"  # Update with your Prometheus username
    prometheus_password: "x"  # Update with your Prometheus password
    loki_username: "x"  # Update with your Loki username (your Grafana Cloud username if you are using Grafana Cloud)
    loki_password: "x"  # Update with your Loki password (e.g. your Grafana Cloud API token if you are using Grafana Cloud)
  ansible.builtin.copy:
    dest: /etc/alloy/config.alloy
    owner: root
    group: root
    mode: '0644'
    content: |
      prometheus.exporter.self "integrations_alloy" { }

      discovery.relabel "integrations_alloy" {
        targets = prometheus.exporter.self.integrations_alloy.targets

        rule {
          target_label = "instance"
          replacement  = constants.hostname
        }

        rule {
          target_label = "alloy_hostname"
          replacement  = constants.hostname
        }

        rule {
          target_label = "job"
          replacement  = "integrations/alloy-check"
        }
      }

      prometheus.scrape "integrations_alloy" {
        targets    = discovery.relabel.integrations_alloy.output
        forward_to = [prometheus.relabel.integrations_alloy.receiver]

        scrape_interval = "60s"
      }

      prometheus.relabel "integrations_alloy" {
        forward_to = [prometheus.remote_write.metrics_service.receiver]

        rule {
          source_labels = ["__name__"]
          regex         = "(prometheus_target_sync_length_seconds_sum|prometheus_target_scrapes_.*|prometheus_target_interval.*|prometheus_sd_discovered_targets|alloy_build.*|prometheus_remote_write_wal_samples_appended_total|process_start_time_seconds)"
          action        = "keep"
        }
      }

      prometheus.remote_write "metrics_service" {
        endpoint {
          url = "https://prometheus-prod-24-prod-eu-west-2.grafana.net/api/prom/push"

          basic_auth {
            username = "{{ prometheus_username }}"
            password = "{{ prometheus_password }}"
          }
        }
      }

      loki.write "grafana_cloud_loki" {
        endpoint {
          url = "https://logs-prod-012.grafana.net/loki/api/v1/push"

          basic_auth {
            username = "{{ loki_username }}"
            password = "{{ loki_password }}"
          }
        }
      }
      discovery.relabel "integrations_node_exporter" {
        targets = prometheus.exporter.unix.integrations_node_exporter.targets

        rule {
          target_label = "instance"
          replacement  = constants.hostname
        }

        rule {
          target_label = "job"
          replacement = "integrations/node_exporter"
        }
      }

      prometheus.exporter.unix "integrations_node_exporter" {
        disable_collectors = ["ipvs", "btrfs", "infiniband", "xfs", "zfs"]

        filesystem {
          fs_types_exclude     = "^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|tmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$"
          mount_points_exclude = "^/(dev|proc|run/credentials/.+|sys|var/lib/docker/.+)($|/)"
          mount_timeout        = "5s"
        }

        netclass {
          ignored_devices = "^(veth.*|cali.*|[a-f0-9]{15})$"
        }

        netdev {
          device_exclude = "^(veth.*|cali.*|[a-f0-9]{15})$"
        }
      }

      prometheus.scrape "integrations_node_exporter" {
        targets    = discovery.relabel.integrations_node_exporter.output
        forward_to = [prometheus.relabel.integrations_node_exporter.receiver]
      }

      prometheus.relabel "integrations_node_exporter" {
        forward_to = [prometheus.remote_write.metrics_service.receiver]

        rule {
          source_labels = ["__name__"]
          regex         = "up|node_arp_entries|node_boot_time_seconds|node_context_switches_total|node_cpu_seconds_total|node_disk_io_time_seconds_total|node_disk_io_time_weighted_seconds_total|node_disk_read_bytes_total|node_disk_read_time_seconds_total|node_disk_reads_completed_total|node_disk_write_time_seconds_total|node_disk_writes_completed_total|node_disk_written_bytes_total|node_filefd_allocated|node_filefd_maximum|node_filesystem_avail_bytes|node_filesystem_device_error|node_filesystem_files|node_filesystem_files_free|node_filesystem_readonly|node_filesystem_size_bytes|node_intr_total|node_load1|node_load15|node_load5|node_md_disks|node_md_disks_required|node_memory_Active_anon_bytes|node_memory_Active_bytes|node_memory_Active_file_bytes|node_memory_AnonHugePages_bytes|node_memory_AnonPages_bytes|node_memory_Bounce_bytes|node_memory_Buffers_bytes|node_memory_Cached_bytes|node_memory_CommitLimit_bytes|node_memory_Committed_AS_bytes|node_memory_DirectMap1G_bytes|node_memory_DirectMap2M_bytes|node_memory_DirectMap4k_bytes|node_memory_Dirty_bytes|node_memory_HugePages_Free|node_memory_HugePages_Rsvd|node_memory_HugePages_Surp|node_memory_HugePages_Total|node_memory_Hugepagesize_bytes|node_memory_Inactive_anon_bytes|node_memory_Inactive_bytes|node_memory_Inactive_file_bytes|node_memory_Mapped_bytes|node_memory_MemAvailable_bytes|node_memory_MemFree_bytes|node_memory_MemTotal_bytes|node_memory_SReclaimable_bytes|node_memory_SUnreclaim_bytes|node_memory_ShmemHugePages_bytes|node_memory_ShmemPmdMapped_bytes|node_memory_Shmem_bytes|node_memory_Slab_bytes|node_memory_SwapTotal_bytes|node_memory_VmallocChunk_bytes|node_memory_VmallocTotal_bytes|node_memory_VmallocUsed_bytes|node_memory_WritebackTmp_bytes|node_memory_Writeback_bytes|node_netstat_Icmp6_InErrors|node_netstat_Icmp6_InMsgs|node_netstat_Icmp6_OutMsgs|node_netstat_Icmp_InErrors|node_netstat_Icmp_InMsgs|node_netstat_Icmp_OutMsgs|node_netstat_IpExt_InOctets|node_netstat_IpExt_OutOctets|node_netstat_TcpExt_ListenDrops|node_netstat_TcpExt_ListenOverflows|node_netstat_TcpExt_TCPSynRetrans|node_netstat_Tcp_InErrs|node_netstat_Tcp_InSegs|node_netstat_Tcp_OutRsts|node_netstat_Tcp_OutSegs|node_netstat_Tcp_RetransSegs|node_netstat_Udp6_InDatagrams|node_netstat_Udp6_InErrors|node_netstat_Udp6_NoPorts|node_netstat_Udp6_OutDatagrams|node_netstat_Udp6_RcvbufErrors|node_netstat_Udp6_SndbufErrors|node_netstat_UdpLite_InErrors|node_netstat_Udp_InDatagrams|node_netstat_Udp_InErrors|node_netstat_Udp_NoPorts|node_netstat_Udp_OutDatagrams|node_netstat_Udp_RcvbufErrors|node_netstat_Udp_SndbufErrors|node_network_carrier|node_network_info|node_network_mtu_bytes|node_network_receive_bytes_total|node_network_receive_compressed_total|node_network_receive_drop_total|node_network_receive_errs_total|node_network_receive_fifo_total|node_network_receive_multicast_total|node_network_receive_packets_total|node_network_speed_bytes|node_network_transmit_bytes_total|node_network_transmit_compressed_total|node_network_transmit_drop_total|node_network_transmit_errs_total|node_network_transmit_fifo_total|node_network_transmit_multicast_total|node_network_transmit_packets_total|node_network_transmit_queue_length|node_network_up|node_nf_conntrack_entries|node_nf_conntrack_entries_limit|node_os_info|node_sockstat_FRAG6_inuse|node_sockstat_FRAG_inuse|node_sockstat_RAW6_inuse|node_sockstat_RAW_inuse|node_sockstat_TCP6_inuse|node_sockstat_TCP_alloc|node_sockstat_TCP_inuse|node_sockstat_TCP_mem|node_sockstat_TCP_mem_bytes|node_sockstat_TCP_orphan|node_sockstat_TCP_tw|node_sockst
at_UDP6_inuse|node_sockstat_UDPLITE6_inuse|node_sockstat_UDPLITE_inuse|node_sockstat_UDP_inuse|node_sockstat_UDP_mem|node_sockstat_UDP_mem_bytes|node_sockstat_sockets_used|node_softnet_dropped_total|node_softnet_processed_total|node_softnet_times_squeezed_total|node_systemd_unit_state|node_textfile_scrape_error|node_time_zone_offset_seconds|node_timex_estimated_error_seconds|node_timex_maxerror_seconds|node_timex_offset_seconds|node_timex_sync_status|node_uname_info|node_vmstat_oom_kill|node_vmstat_pgfault|node_vmstat_pgmajfault|node_vmstat_pgpgin|node_vmstat_pgpgout|node_vmstat_pswpin|node_vmstat_pswpout|process_max_fds|process_open_fds"
          action        = "keep"
        }
      }
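
Two caveats about the tasks above (these additions are assumptions about your environment, not something the packages are guaranteed to need): per the note earlier, on Debian you may have to pre-create /var/lib/alloy and /etc/default/alloy, and nothing above actually enables the service (the alloy unit name comes from the RPM listing earlier in the thread):

- name: Ensure the Alloy state directory exists (the DEB package may not create it)
  ansible.builtin.file:
    path: /var/lib/alloy
    state: directory
    owner: alloy   # assumption: the package creates an 'alloy' system user
    group: alloy
    mode: '0750'

- name: Ensure the environment file exists (the DEB package may not create it)
  ansible.builtin.copy:
    dest: /etc/default/alloy
    content: ""
    force: false   # never overwrite an existing file
    owner: root
    group: root
    mode: '0644'

- name: Enable and start Alloy
  ansible.builtin.systemd:
    name: alloy
    enabled: true
    state: started   # in a real role, restart via a handler when the config changes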

hakong · May 10 '24 19:05

Yes, we need to split the installs between Red Hat and Debian. Thanks for your config; I'll try to get a PR together this weekend for this unless you have something ready to go.
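
For reference, a minimal sketch of how the split could look; the repo URLs are Grafana's published package endpoints, but treat the details as assumptions rather than the shape of the final PR:

- name: Configure the Grafana YUM repository
  ansible.builtin.yum_repository:
    name: grafana
    description: grafana
    baseurl: https://rpm.grafana.com
    gpgcheck: true
    gpgkey: https://rpm.grafana.com/gpg.key
  when: ansible_os_family == 'RedHat'

- name: Add the Grafana APT signing key
  ansible.builtin.get_url:
    url: https://apt.grafana.com/gpg.key
    dest: /etc/apt/trusted.gpg.d/grafana.asc
    mode: '0644'
  when: ansible_os_family == 'Debian'

- name: Configure the Grafana APT repository
  ansible.builtin.apt_repository:
    repo: deb https://apt.grafana.com stable main
    state: present
  when: ansible_os_family == 'Debian'

- name: Install Alloy via the OS package manager
  ansible.builtin.package:
    name: alloy
    state: present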

panfantastic · May 10 '24 19:05

@hakong Am I correct in saying SELinux was in enforcing mode (getenforce reporting Enforcing) on your system?

I've been trying to get the build environment's tests to run with SELinux enabled, as you get on RHEL by default, and have failed so far.

RHEL has SELinux enforcing by default, but the Rocky etc. containers I'm trying to test with don't :( I'm not sure how to submit a patch with Molecule testing at this point!
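
For what it's worth, one speculative way to get an SELinux-aware platform in molecule.yml (an untested sketch; SELinux is enforced by the host kernel, so the container only sees Enforcing if the host itself is enforcing):

# molecule/default/molecule.yml (sketch; image choice and mounts are assumptions)
driver:
  name: podman
platforms:
  - name: rocky9
    image: docker.io/rockylinux/rockylinux:9  # swap in a systemd-enabled image if needed
    command: /usr/sbin/init
    privileged: true
    volumes:
      - /sys/fs/selinux:/sys/fs/selinux:ro  # expose the host's selinuxfs to the container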

panfantastic · May 12 '24 14:05

If you're seeking inspiration, perhaps align your PR with the Loki/Promtail approach; those roles already support Debian/RHEL systems with SELinux out of the box.

voidquark · May 13 '24 08:05

@voidquark can you link me please?

panfantastic · May 13 '24 19:05

> @voidquark can you link me please?

Promtail role and Loki role

voidquark · May 13 '24 20:05

I took the opportunity to install it on a working SELinux (enforcing) system, and it installs fine, so I think all that's needed is to split the install process between Red Hat clones and Debian clones (sorry, any SUSE clones; Gentoo users know how to do it on their own ;) ).

panfantastic · May 18 '24 00:05

Keeping an eye on this, as the documentation for Grafana Agent says it is being deprecated in favor of Alloy, and we run RHEL.

Aethylred · May 19 '24 23:05