Integrate jemalloc
... to speed up malloc(3) and thereby also the config loading:
fixes #8110
Also seems to fix a memory leak for some people:
fixes #8737
Reason: https://www.softwareverify.com/blog/memory-fragmentation-your-worst-nightmare/
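As a hypothetical illustration of that fragmentation argument (not code from this PR; the block sizes and the glibc-specific malloc_stats(3) call are assumptions for the demo): when long-lived small allocations are interleaved with large ones that later get freed, the surviving small blocks pin the holes in place, so the process footprint stays high even though little memory is live.

#include <cstdlib>
#include <malloc.h> // malloc_stats(3); glibc-specific
#include <vector>

int main()
{
    std::vector<void*> pinned, temp;

    // Interleave long-lived 1 KiB blocks with 64 KiB blocks
    // (64 KiB stays below glibc's default mmap threshold, so it
    // lands on the main heap).
    for (int i = 0; i < 10000; i++) {
        pinned.push_back(std::malloc(1024));
        temp.push_back(std::malloc(64 * 1024));
    }

    // Free only the large blocks. The surviving small blocks sit
    // between the freed ranges, so the allocator can't shrink the
    // heap or merge the holes into anything contiguous.
    for (void* p : temp) {
        std::free(p);
    }

    // Prints arena statistics to stderr: "in use bytes" is now tiny
    // compared to the "system bytes" the process still holds.
    malloc_stats();

    for (void* p : pinned) {
        std::free(p);
    }
}

jemalloc's size-class arenas are designed to keep exactly this kind of interleaving from pinning whole heap regions.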
After merge
- [ ] https://git.icinga.com/packaging/deb-icinga2/-/merge_requests/5
- [ ] https://git.icinga.com/packaging/rpm-icinga2/-/merge_requests/5
- [x] https://git.icinga.com/packaging/raspbian-icinga2/-/merge_requests/1
- [ ] https://github.com/Icinga/docker-icinga2/pull/22
Numbers: https://github.com/Icinga/docker-icinga2/pull/22#issuecomment-669802604
@lippserd FYI: independent of this PR, a regex index doesn't make it much faster.
diff --git a/lib/base/scriptutils.cpp b/lib/base/scriptutils.cpp
index 838f20edd..001dc86ed 100644
--- a/lib/base/scriptutils.cpp
+++ b/lib/base/scriptutils.cpp
@@ -16,8 +16,12 @@
 #include "base/namespace.hpp"
 #include "config/configitem.hpp"
 #include <boost/regex.hpp>
+#include <boost/thread/locks.hpp>
+#include <boost/thread/shared_mutex.hpp>
 #include <algorithm>
 #include <set>
+#include <string>
+#include <unordered_map>
 #ifdef _WIN32
 #include <msi.h>
 #endif /* _WIN32 */
@@ -93,6 +97,11 @@ bool ScriptUtils::CastBool(const Value& value)
     return value.ToBool();
 }
 
+static struct {
+    std::unordered_map<std::string, boost::regex> Index;
+    boost::shared_mutex Mutex;
+} l_Regexes;
+
 bool ScriptUtils::Regex(const std::vector<Value>& args)
 {
     if (args.size() < 2)
@@ -111,7 +120,25 @@ bool ScriptUtils::Regex(const std::vector<Value>& args)
     else
         mode = MatchAll;
 
-    boost::regex expr(pattern.GetData());
+    const boost::regex* expr = nullptr;
+
+    {
+        auto key (pattern.GetData());
+        boost::upgrade_lock<boost::shared_mutex> shared (l_Regexes.Mutex);
+        auto pos (l_Regexes.Index.find(key));
+
+        if (pos == l_Regexes.Index.end()) {
+            boost::upgrade_to_unique_lock<boost::shared_mutex> unique (shared);
+
+            pos = l_Regexes.Index.find(key);
+
+            if (pos == l_Regexes.Index.end()) {
+                pos = l_Regexes.Index.emplace(key, key).first;
+            }
+        }
+
+        expr = &pos->second;
+    }
 
     Array::Ptr texts;
@@ -128,7 +155,7 @@ bool ScriptUtils::Regex(const std::vector<Value>& args)
             bool res = false;
             try {
                 boost::smatch what;
-                res = boost::regex_search(text.GetData(), what, expr);
+                res = boost::regex_search(text.GetData(), what, *expr);
             } catch (boost::exception&) {
                 res = false; /* exception means something went terribly wrong */
             }
@@ -144,7 +171,7 @@ bool ScriptUtils::Regex(const std::vector<Value>& args)
     } else {
         String text = argTexts;
         boost::smatch what;
-        return boost::regex_search(text.GetData(), what, expr);
+        return boost::regex_search(text.GetData(), what, *expr);
     }
 }
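For readers skimming the diff: this is a double-checked lookup under an upgradable shared lock. Below is a minimal, self-contained sketch of the same idea (function and variable names are mine, not the PR's):

// Sketch only: cache compiled regexes so repeated evaluations of the same
// pattern skip recompilation. Mirrors the locking pattern from the diff.
#include <boost/regex.hpp>
#include <boost/thread/locks.hpp>
#include <boost/thread/shared_mutex.hpp>
#include <string>
#include <unordered_map>

static std::unordered_map<std::string, boost::regex> l_Index;
static boost::shared_mutex l_Mutex;

// References into std::unordered_map stay valid across inserts, so handing
// out a reference after dropping the lock is safe as long as nothing erases.
const boost::regex& GetCachedRegex(const std::string& pattern)
{
    // Common case: the pattern is already cached; shared ownership suffices.
    boost::upgrade_lock<boost::shared_mutex> shared(l_Mutex);
    auto pos = l_Index.find(pattern);

    if (pos == l_Index.end()) {
        // Cache miss: upgrade to exclusive ownership before mutating.
        boost::upgrade_to_unique_lock<boost::shared_mutex> unique(shared);

        pos = l_Index.find(pattern); // re-check, mirroring the diff
        if (pos == l_Index.end()) {
            pos = l_Index.emplace(pattern, boost::regex(pattern)).first;
        }
    }

    return pos->second;
}

Since Boost grants upgrade ownership to only one thread at a time, the second find() is defensive rather than strictly required; the payoff is that the hot path (pattern already cached) takes only the shared lock.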
> They'd have to install jemalloc themselves anyway.
No, see https://github.com/Icinga/icinga2/pull/8152#issuecomment-692632352.
~While you’re reviewing, I'll re-evaluate https://github.com/Icinga/icinga2/pull/8152#issuecomment-692632352 , so please don’t merge, yet.~
Why not? We've already added Boost Coroutine in the recent past.
Do we have enough data to conclude that jemalloc is an improvement in all or at least most situations? For me this is more like a tunable that you could try, maybe it's an improvement with your system and config, maybe not, we don't know for sure (yet).
In big envs: I have no doubt.
In small ones: even if we make it worse, the users won’t even notice.
Do you have any data to back this? Do we have any experience with running production setups with jemalloc? Take a look at this Stack Overflow answer. Do we know that Icinga won't use twice as much RAM when running with jemalloc in the long run?
Of course, a random Stack Overflow answer doesn't mean that this will happen to Icinga, but I think the conclusion in that answer is very valid: you just have to test it with your application. And as far as I can tell, the only data we have so far is that you've spun up Icinga 2 with jemalloc in Docker briefly.
I don't claim that using jemalloc will lead to problems, but we don't know for now. With the current version of the PR our builds are linked against jemalloc and you can't use them without it. Should there be a problem, we'd have to push out new builds. With the previous version (i.e. adding jemalloc.so to LD_PRELOAD), you'd at least have the option to uninstall jemalloc or patch the startup script to get back the old behavior.
I'd feel much more confident about this if, in the beginning, we viewed jemalloc as a potential optimization that you can try to speed up your installation, and only enforced it as the default once we've gathered enough experience to conclude that jemalloc is an improvement in general.
I'll spin up some test boxes w/ Icinga and Grafana and report the graphs in about one week.
Naming scheme: aklimov8152xyz, where
- h = helper
- a = w/o jemalloc
- b = w/ jemalloc
- m = master
- s = satellite
deploy.tf
resource "openstack_compute_instance_v2" "aklimov8152h" {
name = "aklimov8152h"
region = "HetznerNBG4"
flavor_name = "s1.medium"
block_device {
uuid = "${var.openstack_image}"
source_type = "image"
boot_index = 0
destination_type = "volume"
volume_size = 50
delete_on_termination = true
}
network {
name = "${var.tenant_network}"
}
security_groups = [ "default2" ]
key_pair = "${var.openstack_keypair}"
}
resource "openstack_compute_instance_v2" "aklimov8152am1" {
name = "aklimov8152am1"
region = "HetznerNBG4"
flavor_name = "s1.xxlarge"
block_device {
uuid = "${var.openstack_image}"
source_type = "image"
boot_index = 0
destination_type = "volume"
volume_size = 50
delete_on_termination = true
}
network {
name = "${var.tenant_network}"
}
security_groups = [ "default2" ]
key_pair = "${var.openstack_keypair}"
}
resource "openstack_compute_instance_v2" "aklimov8152am2" {
name = "aklimov8152am2"
region = "HetznerNBG4"
flavor_name = "s1.xxlarge"
block_device {
uuid = "${var.openstack_image}"
source_type = "image"
boot_index = 0
destination_type = "volume"
volume_size = 50
delete_on_termination = true
}
network {
name = "${var.tenant_network}"
}
security_groups = [ "default2" ]
key_pair = "${var.openstack_keypair}"
}
resource "openstack_compute_instance_v2" "aklimov8152as1" {
name = "aklimov8152as1"
region = "HetznerNBG4"
flavor_name = "s1.xxlarge"
block_device {
uuid = "${var.openstack_image}"
source_type = "image"
boot_index = 0
destination_type = "volume"
volume_size = 50
delete_on_termination = true
}
network {
name = "${var.tenant_network}"
}
security_groups = [ "default2" ]
key_pair = "${var.openstack_keypair}"
}
resource "openstack_compute_instance_v2" "aklimov8152as2" {
name = "aklimov8152as2"
region = "HetznerNBG4"
flavor_name = "s1.xxlarge"
block_device {
uuid = "${var.openstack_image}"
source_type = "image"
boot_index = 0
destination_type = "volume"
volume_size = 50
delete_on_termination = true
}
network {
name = "${var.tenant_network}"
}
security_groups = [ "default2" ]
key_pair = "${var.openstack_keypair}"
}
resource "openstack_compute_instance_v2" "aklimov8152bm1" {
name = "aklimov8152bm1"
region = "HetznerNBG4"
flavor_name = "s1.xxlarge"
block_device {
uuid = "${var.openstack_image}"
source_type = "image"
boot_index = 0
destination_type = "volume"
volume_size = 50
delete_on_termination = true
}
network {
name = "${var.tenant_network}"
}
security_groups = [ "default2" ]
key_pair = "${var.openstack_keypair}"
}
resource "openstack_compute_instance_v2" "aklimov8152bm2" {
name = "aklimov8152bm2"
region = "HetznerNBG4"
flavor_name = "s1.xxlarge"
block_device {
uuid = "${var.openstack_image}"
source_type = "image"
boot_index = 0
destination_type = "volume"
volume_size = 50
delete_on_termination = true
}
network {
name = "${var.tenant_network}"
}
security_groups = [ "default2" ]
key_pair = "${var.openstack_keypair}"
}
resource "openstack_compute_instance_v2" "aklimov8152bs1" {
name = "aklimov8152bs1"
region = "HetznerNBG4"
flavor_name = "s1.xxlarge"
block_device {
uuid = "${var.openstack_image}"
source_type = "image"
boot_index = 0
destination_type = "volume"
volume_size = 50
delete_on_termination = true
}
network {
name = "${var.tenant_network}"
}
security_groups = [ "default2" ]
key_pair = "${var.openstack_keypair}"
}
resource "openstack_compute_instance_v2" "aklimov8152bs2" {
name = "aklimov8152bs2"
region = "HetznerNBG4"
flavor_name = "s1.xxlarge"
block_device {
uuid = "${var.openstack_image}"
source_type = "image"
boot_index = 0
destination_type = "volume"
volume_size = 50
delete_on_termination = true
}
network {
name = "${var.tenant_network}"
}
security_groups = [ "default2" ]
key_pair = "${var.openstack_keypair}"
}
pl.yml
---
- import_playbook: prepare.yml
- import_playbook: dns.yml
- import_playbook: squid.yml
- import_playbook: pkg.yml
- import_playbook: influx.yml
- import_playbook: grafana.yml
- import_playbook: i2mon.yml
- import_playbook: i2.yml
prepare.yml
---
- hosts: aklimov8152h
  become: yes
  become_method: sudo
  tasks:
    - name: /etc/resolv.conf
      copy:
        dest: /etc/resolv.conf
        content: |
          nameserver 9.9.9.9
    - name: apt update
      apt:
        update_cache: yes
dns.yml
---
- hosts: all
  become: yes
  become_method: sudo
  tasks: []

- hosts: aklimov8152h
  become: yes
  become_method: sudo
  tasks:
    - name: dnsmasq
      apt:
        name: dnsmasq
    - name: /etc/dnsmasq.conf
      blockinfile:
        path: /etc/dnsmasq.conf
        marker: '# {mark} general'
        block: |
          no-resolv
          no-hosts
          interface=eth0
          server=9.9.9.9
      notify: Restart dnsmasq
    - name: /etc/dnsmasq.conf
      with_inventory_hostnames: all
      blockinfile:
        path: /etc/dnsmasq.conf
        marker: '# {mark} {{ item }}'
        block: |
          address=/{{ item }}/{{ hostvars[item].ansible_default_ipv4.address }}
      notify: Restart dnsmasq
  handlers:
    - name: Restart dnsmasq
      service:
        name: dnsmasq
        state: restarted

- hosts: 'all:!aklimov8152h'
  become: yes
  become_method: sudo
  tasks:
    - name: /etc/resolv.conf
      copy:
        dest: /etc/resolv.conf
        content: |
          nameserver {{ hostvars['aklimov8152h'].ansible_default_ipv4.address }}
squid.yml
---
- hosts: all
  become: yes
  become_method: sudo
  tasks: []

- hosts: aklimov8152h
  become: yes
  become_method: sudo
  tasks:
    - name: squid-deb-proxy
      apt:
        name: squid-deb-proxy
    - name: /etc/squid-deb-proxy/allowed-networks-src.acl.d/99-*
      with_inventory_hostnames: all
      copy:
        dest: '/etc/squid-deb-proxy/allowed-networks-src.acl.d/99-{{ item }}'
        owner: root
        group: root
        mode: '0644'
        content: |
          {{ hostvars[item].ansible_default_ipv4.address }}/32
      notify: Restart squid-deb-proxy
    - name: /etc/squid-deb-proxy/mirror-dstdomain.acl.d/99-*
      loop:
        - aklimov8152h
        - packages.grafana.com
      copy:
        dest: '/etc/squid-deb-proxy/mirror-dstdomain.acl.d/99-{{ item }}'
        owner: root
        group: root
        mode: '0644'
        content: |
          {{ item }}
      notify: Restart squid-deb-proxy
  handlers:
    - name: Restart squid-deb-proxy
      service:
        name: squid-deb-proxy
        state: restarted

- hosts: all
  become: yes
  become_method: sudo
  tasks:
    - name: /etc/apt/apt.conf.d/01proxy
      copy:
        dest: /etc/apt/apt.conf.d/01proxy
        owner: root
        group: root
        mode: '0644'
        content: |
          Acquire::http { Proxy "http://aklimov8152h:8000"; };
pkg.yml
---
- hosts: aklimov8152h
  become: yes
  become_method: sudo
  tasks:
    - name: nginx
      apt:
        name: nginx
    - name: /var/www/html/*/
      loop:
        - wojm
        - wjm
      copy:
        dest: '/var/www/html/{{ item }}/'
        owner: root
        group: root
        mode: '0755'
        src: '{{ item }}/'
    - name: /var/www/html/*/
      loop:
        - influx
      file:
        path: '/var/www/html/{{ item }}'
        owner: root
        group: root
        mode: '0755'
        state: directory
    - name: /var/www/html/influx/influx.deb
      get_url:
        dest: /var/www/html/influx/influx.deb
        owner: root
        group: root
        mode: '0644'
        url: 'https://dl.influxdata.com/influxdb/releases/influxdb_1.8.4_amd64.deb'
        checksum: sha256:ad4058db83f424dad21337f3d7135de921498b652d67e2fcd2e2e070d2997a2d
    - name: dpkg-dev
      apt:
        name: dpkg-dev
    - name: /var/www/html/*/Packages
      loop:
        - wojm
        - wjm
        - influx
      shell: dpkg-scanpackages . /dev/null >Packages
      args:
        chdir: '/var/www/html/{{ item }}'
        creates: Packages
    - name: Influx repo
      copy:
        dest: /etc/apt/sources.list.d/influx.list
        owner: root
        group: root
        mode: '0644'
        content: |
          deb [trusted=yes] file:///var/www/html/influx/ ./
    - name: apt update
      apt:
        update_cache: yes

- hosts: 'aklimov8152a*'
  become: yes
  become_method: sudo
  tasks:
    - name: aklimov8152h repo
      copy:
        dest: /etc/apt/sources.list.d/wojm.list
        owner: root
        group: root
        mode: '0644'
        content: |
          deb [trusted=yes] http://aklimov8152h/wojm/ ./
    - name: apt update
      apt:
        update_cache: yes

- hosts: 'aklimov8152b*'
  become: yes
  become_method: sudo
  tasks:
    - name: aklimov8152h repo
      copy:
        dest: /etc/apt/sources.list.d/wjm.list
        owner: root
        group: root
        mode: '0644'
        content: |
          deb [trusted=yes] http://aklimov8152h/wjm/ ./
    - name: apt update
      apt:
        update_cache: yes
influx.yml
---
- hosts: aklimov8152h
  become: yes
  become_method: sudo
  tasks:
    - name: influxdb
      apt:
        name: influxdb
    - name: InfluxDB service
      service:
        name: influxdb
        state: started
        enabled: yes
    - name: python-influxdb
      apt:
        name: python-influxdb
    - influxdb_database:
        database_name: icinga2
grafana.yml
- hosts: aklimov8152h
  become: yes
  become_method: sudo
  tasks:
    - name: apt-transport-https
      apt:
        name: apt-transport-https
    - name: gpg
      apt:
        name: gpg
    - name: Grafana repo key
      apt_key:
        url: https://packages.grafana.com/gpg.key
    - name: Grafana repo
      apt_repository:
        repo: deb https://packages.grafana.com/oss/deb stable main
    - name: Grafana
      apt:
        name: grafana
    - name: Grafana service
      service:
        name: grafana-server
        state: started
        enabled: yes
i2mon.yml
---
- hosts: aklimov8152h
  become: yes
  become_method: sudo
  tasks:
    - name: icinga2-bin
      apt:
        name: icinga2-bin
    - name: Icinga SSH key
      community.crypto.openssh_keypair:
        path: /var/lib/icinga2/id_rsa
        owner: nagios
    - name: Fetch Icinga SSH key
      fetch:
        dest: .tempfiles
        src: /var/lib/icinga2/id_rsa.pub

- hosts: 'all:!aklimov8152h'
  become: yes
  become_method: sudo
  tasks:
    - name: User chkbyssh
      user:
        name: chkbyssh
        system: yes
    - name: Icinga SSH key
      authorized_key:
        user: chkbyssh
        key: |-
          {{ lookup('file', '.tempfiles/aklimov8152h/var/lib/icinga2/id_rsa.pub') }}
    - name: monitoring-plugins
      apt:
        name: monitoring-plugins
    - name: git
      apt:
        name: git
    - name: check_mem repo
      git:
        dest: /opt/justintime-plugins
        repo: 'https://github.com/justintime/nagios-plugins.git'
        version: 91c4dc366ba9c132194eda62e5358e514d3aae36
    - name: check_mem
      file:
        path: /usr/lib/nagios/plugins/check_mem.pl
        state: link
        src: /opt/justintime-plugins/check_mem/check_mem.pl

- hosts: aklimov8152h
  become: yes
  become_method: sudo
  tasks:
    - name: monitoring-plugins
      apt:
        name: monitoring-plugins
    - name: icinga2 node setup
      shell: >-
        icinga2 node setup
        --zone master
        --listen 0.0.0.0,5665
        --cn {{ inventory_hostname }}
        --master
        --disable-confd
      args:
        creates: /var/lib/icinga2/certs/ca.crt
      notify: Restart Icinga 2
    - name: /etc/icinga2/zones.d/master
      file:
        path: /etc/icinga2/zones.d/master
        owner: root
        group: root
        mode: '0755'
        state: directory
    - name: /etc/icinga2/zones.d/master/*.conf
      with_inventory_hostnames: 'all:!aklimov8152h'
      copy:
        dest: '/etc/icinga2/zones.d/master/{{ item }}.conf'
        owner: root
        group: root
        mode: '0644'
        content: |
          object Host "{{ item }}" {
            address = "{{ hostvars[item].ansible_default_ipv4.address }}"
          }
      notify: Restart Icinga 2
    - name: /etc/icinga2/zones.d/master/misc.conf
      copy:
        dest: /etc/icinga2/zones.d/master/misc.conf
        owner: root
        group: root
        mode: '0644'
        src: i2mon.conf
      notify: Restart Icinga 2
    - name: /etc/icinga2/features-available/influxdb.conf
      copy:
        dest: /etc/icinga2/features-available/influxdb.conf
        owner: root
        group: root
        mode: '0644'
        src: influxdb.conf
      notify: Restart Icinga 2
    - name: /etc/icinga2/features-enabled/influxdb.conf
      file:
        path: /etc/icinga2/features-enabled/influxdb.conf
        state: link
        src: /etc/icinga2/features-available/influxdb.conf
      notify: Restart Icinga 2
  handlers:
    - name: Restart Icinga 2
      service:
        name: icinga2
        state: restarted
i2.yml
---
- hosts: 'all:!aklimov8152h'
  become: yes
  become_method: sudo
  tasks:
    - name: icinga2-bin
      apt:
        name: icinga2-bin

- hosts: 'aklimov8152*m1'
  become: yes
  become_method: sudo
  vars:
    i2sats:
      aklimov8152am1:
        - aklimov8152am2
        - aklimov8152as1
        - aklimov8152as2
      aklimov8152bm1:
        - aklimov8152bm2
        - aklimov8152bs1
        - aklimov8152bs2
  tasks:
    - name: icinga2 node setup
      shell: >-
        icinga2 node setup
        --zone master
        --listen 0.0.0.0,5665
        --cn {{ inventory_hostname }}
        --master
        --disable-confd;
        rm -f /var/cache/icinga2/icinga2.vars
      args:
        creates: /var/lib/icinga2/certs/ca.crt
      notify: Restart Icinga 2
    - name: /var/cache/icinga2/icinga2.vars
      shell: icinga2 daemon -C
      args:
        creates: /var/cache/icinga2/icinga2.vars
    - name: Icinga 2 ticket
      loop: '{{ i2sats[inventory_hostname] }}'
      shell: >-
        icinga2 pki ticket --cn {{ item }}
        >/var/cache/icinga2/{{ item }}.ticket
      args:
        creates: '/var/cache/icinga2/{{ item }}.ticket'
    - name: Fetch Icinga 2 ticket
      loop: '{{ i2sats[inventory_hostname] }}'
      fetch:
        dest: .tempfiles
        src: '/var/cache/icinga2/{{ item }}.ticket'
    - name: Fetch Icinga 2 CA cert
      fetch:
        dest: .tempfiles
        src: /var/lib/icinga2/certs/ca.crt
    - name: Fetch Icinga 2 CA key
      fetch:
        dest: .tempfiles
        src: /var/lib/icinga2/ca/ca.key
    - name: Zone dirs
      loop:
        - global
        - m
        - s1
        - s2
      file:
        path: '/etc/icinga2/zones.d/{{ item }}'
        owner: root
        group: root
        mode: '0755'
        state: directory
    - name: /etc/icinga2/zones.d/global/*.conf
      loop:
        - templates
        - applys
      copy:
        dest: '/etc/icinga2/zones.d/global/{{ item }}.conf'
        owner: root
        group: root
        mode: '0644'
        src: '{{ item }}.conf'
      notify: Restart Icinga 2
    - name: Hosts
      loop:
        - m
        - s1
        - s2
      copy:
        dest: '/etc/icinga2/zones.d/{{ item }}/hosts.conf'
        owner: root
        group: root
        mode: '0644'
        src: '{{ item }}.conf'
      notify: Restart Icinga 2
  handlers:
    - name: Restart Icinga 2
      service:
        name: icinga2
        state: restarted

- hosts: 'all:!aklimov8152h:!aklimov8152*m1'
  become: yes
  become_method: sudo
  vars:
    i2masters:
      aklimov8152am2: aklimov8152am1
      aklimov8152as1: aklimov8152am1
      aklimov8152as2: aklimov8152am1
      aklimov8152bm2: aklimov8152bm1
      aklimov8152bs1: aklimov8152bm1
      aklimov8152bs2: aklimov8152bm1
  tasks:
    - name: /var/cache/icinga2/my.ticket
      copy:
        dest: /var/cache/icinga2/my.ticket
        owner: nagios
        group: nagios
        mode: '0600'
        src: '.tempfiles/{{ i2masters[inventory_hostname] }}/var/cache/icinga2/{{ inventory_hostname }}.ticket'
    - name: icinga2 node setup
      shell: >
        icinga2 node setup
        --zone {{ inventory_hostname }}
        --endpoint {{ i2masters[inventory_hostname] }},{{ i2masters[inventory_hostname] }},5665
        --parent_zone master
        --listen 0.0.0.0,5665
        --ticket `cat /var/cache/icinga2/my.ticket`
        --cn {{ inventory_hostname }}
        --accept-config
        --accept-commands
        --disable-confd
      args:
        creates: /var/lib/icinga2/certs
      notify: Restart Icinga 2
    - name: /var/lib/icinga2/certs/ca.crt
      copy:
        dest: /var/lib/icinga2/certs/ca.crt
        owner: nagios
        group: nagios
        mode: '0644'
        src: .tempfiles/{{ i2masters[inventory_hostname] }}/var/lib/icinga2/certs/ca.crt
  handlers:
    - name: Restart Icinga 2
      service:
        name: icinga2
        state: restarted

- hosts: 'all:!aklimov8152h'
  become: yes
  become_method: sudo
  tasks:
    - name: /etc/icinga2/zones.conf
      copy:
        dest: /etc/icinga2/zones.conf
        owner: root
        group: root
        mode: '0644'
        src: zones.conf
      notify: Restart Icinga 2
  handlers:
    - name: Restart Icinga 2
      service:
        name: icinga2
        state: restarted
applys.conf
for (i in range(150000)) {
    apply Service i {
        check_command = "dummy"
        assign where true
    }
}

apply ScheduledDowntime "sd" to Service {
    author = "me"
    comment = "mine"
    ranges = { "monday - sunday" = "02:00-22:00" }
    assign where true
}
i2mon.conf
template Host default {
    check_command = "passive"
    enable_active_checks = false
}

template Service default {
    check_interval = 1s
    retry_interval = check_interval
}

template Service "by_ssh" {
    vars.original_check_command = check_command
    check_command = "by_ssh"
    vars.by_ssh_command = {{ get_check_command(service.vars.original_check_command).command }}
    vars.by_ssh_arguments = {{ get_check_command(service.vars.original_check_command).arguments }}
    vars.by_ssh_logname = "chkbyssh"
    vars.by_ssh_identity = "/var/lib/icinga2/id_rsa"
    vars.by_ssh_options = "StrictHostKeyChecking=no"
}

apply Service "load" {
    check_command = "load"
    import "by_ssh"
    vars.load_percpu = true
    assign where true
}

apply Service "mem" {
    check_command = "mem"
    import "by_ssh"
    vars.mem_used = true
    vars.mem_cache = true
    vars.mem_warning = 80
    vars.mem_critical = 90
    assign where true
}
influxdb.conf
object InfluxdbWriter "influxdb" {
    host = "aklimov8152h"
    port = 8086
    database = "icinga2"
    flush_threshold = 1024
    flush_interval = 10s

    host_template = {
        measurement = "$host.check_command$"
        tags = {
            hostname = "$host.name$"
        }
    }

    service_template = {
        measurement = "$service.check_command$"
        tags = {
            hostname = "$host.name$"
            service = "$service.name$"
        }
    }
}
m.conf
var prefix = NodeName.substr(0, "aklimov8152x".len())
object Host prefix + "m1" { }
object Host prefix + "m2" { }
s1.conf
object Host NodeName.substr(0, "aklimov8152x".len()) + "s1" { }
s2.conf
object Host NodeName.substr(0, "aklimov8152x".len()) + "s2" { }
templates.conf
template Host default {
    check_command = "passive"
    enable_active_checks = false
}
zones.conf
var prefix = NodeName.substr(0, "aklimov8152x".len())

for (i in ["1", "2"]) {
    for (lvl in ["m", "s"]) {
        object Endpoint prefix + lvl + i {
            host = name
        }
    }

    object Zone "s" + i use(prefix) {
        parent = "m"
        endpoints = [ prefix + name ]
    }
}

object Zone "m" use(prefix) {
    endpoints = [ prefix + "m1", prefix + "m2" ]
}

object Zone "global" {
    global = true
}
Results
glibc: memory graphs for master1, master2, sat1, sat2 (screenshots)
jemalloc: memory graphs for master1, master2, sat1, sat2 (screenshots)
See also https://github.com/Icinga/icinga2/issues/8737#issuecomment-828378514 .
Which screenshots belong to which malloc? Please add headings. Also, a summary of what you found out after the test would be great.
Conclusion: jemalloc even reduces memory usage.
@cla-bot check
@N-o-X Aren't you testing a real world config atm? Please could you test this PR before/after w/ the config?
Apropos large configs: do you all agree that we just don’t need this on Raspbian for obvious reasons?
> Apropos large configs: do you all agree that we just don’t need this on Raspbian for obvious reasons?
Yes.
> @N-o-X Aren't you testing a real world config atm? Please could you test this PR before/after w/ the config?
The results are kinda bad.
Setup
VM:
- 16GB RAM
- 8 Cores
Config:
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 4 NotificationCommands.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 72985 Notifications.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 24 Dependencies.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 1 IcingaApplication.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 24400 HostGroups.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 28007 Hosts.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 2 EventCommands.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 7036 Downtimes.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 116 Comments.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 1 FileLogger.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 1 IcingaDB.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 1 ApiListener.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 215 Zones.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 207 Endpoints.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 11 ApiUsers.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 507 CheckCommands.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 713 Users.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 26 TimePeriods.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 88 ServiceGroups.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 2599 ScheduledDowntimes.
[2021-11-26 12:43:28 +0000] information/ConfigItem: Instantiated 383318 Services.
Tests
Command: time icinga2 daemon -C
Results

| Version          | Run | real       | user       | sys        |
|------------------|-----|------------|------------|------------|
| 2.13.2           | 1   | 12m32.680s | 62m25.203s | 17m40.012s |
| 2.13.2           | 2   | 12m50.119s | 63m10.677s | 17m24.793s |
| 2.13.2 + this PR | 1   | 14m40.973s | 68m20.610s | 24m0.484s  |
| 2.13.2 + this PR | 2   | 14m35.091s | 68m40.074s | 22m32.007s |
Good to know: https://github.com/Icinga/icinga2/issues/8737#issuecomment-1000551057
One more reason: https://github.com/Icinga/icinga2/issues/8737#issuecomment-1046637022
I did some playing with jemalloc 5.2.1 (using LD_PRELOAD) and a somewhat larger config modeled after a real-world config (daemon -C takes around 20s on my laptop). With this, I got around a 12% performance benefit at 4% more memory usage (about 1 GB peak, so about 40 MB extra). Take the exact numbers with a grain of salt, but they looked consistent enough across 10 runs that I think it's fair to say that in this scenario you get a nice performance benefit for a small penalty in memory use.
Full Config
const numHosts = 10000;
const numHostTemplates = 100;
const numHostGroups = 100;
const maxGroupsPerHost = 4;
const maxTemplatesPerHost = 2;
const hostTemplateApplyRules = 100;
const hostVarEqApplyRules = 10;
const hostNameMatchApplyRules = 90;
const hostNotMatchingApplyRules = 150;
const numExtraNotificationRules = 10;

object CheckCommand "dummy" {
    command = ["true"]
}

object NotificationCommand "dummy" {
    command = ["true"]
}

for (i in range(numHostGroups)) {
    object HostGroup String(i) {}
}

for (i in range(numHostTemplates)) {
    var t = "template_" + String(i)
    template Host t use (t) {
        vars[t] = 42
    }
}

for (i in range(numHosts)) {
    object Host String(i) use (i) {
        check_command = "dummy"
        for (j in range(1 + i % maxGroupsPerHost)) {
            groups += [String((7*i + 11*j) % numHostGroups)]
        }
        for (j in range(1 + i % maxTemplatesPerHost)) {
            import "template_" + String((11*i + 13*j) % numHostTemplates)
        }
        vars.v0 = "foo"
        vars.v1 = "foo"
        vars.v2 = "foo"
        vars.v3 = "foo"
        vars.v4 = "foo"
        vars.v5 = "foo"
        vars.v6 = "foo"
        vars.v7 = "foo"
        vars.v8 = "foo"
        vars.v9 = "foo"
    }

    object Endpoint String(i) {}
    object Zone String(i) use (i) {
        endpoints = [String(i)]
    }
}

for (i in range(hostTemplateApplyRules)) {
    var t = "template_" + String(i%numHostTemplates)
    apply Service t use (t) {
        check_command = "dummy"
        assign where t in host.templates
    }
}

for (i in range(hostVarEqApplyRules)) {
    var t = "var_eq_" + String(i)
    apply Service t use (i, t) {
        check_command = "dummy"
        assign where host.vars["template_" + String(i % numHostTemplates)] == 42
    }
}

for (i in range(hostNameMatchApplyRules)) {
    var t = "name_match+" + String(i)
    var p = String(i%10) + "*"
    apply Service t use (t, p) {
        check_command = "dummy"
        assign where match(p, host.name)
    }
}

for (i in range(hostNotMatchingApplyRules)) {
    var t = "no_match_" + String(i)
    var p = String(i%10) + "*"
    apply Service t use (t, p) {
        check_command = "dummy"
        assign where match("*", host.name) && host.vars.var1 == "value-never-set"
    }
}

object User "user" {}

apply Notification "all" to Host {
    command = "dummy"
    users = ["user"]
    assign where true
}

apply Notification "all" to Service {
    command = "dummy"
    users = ["user"]
    assign where true
}

for (i in range(numExtraNotificationRules)) {
    apply Notification "extra-" + String(i) to Host {
        command = "dummy"
        users = ["user"]
        assign where host.name.len() == 1
    }
    apply Notification "extra-" + String(i) to Service {
        command = "dummy"
        users = ["user"]
        assign where host.name.len() == 1
    }
}
Something else to consider: looks like the glibc allocator is better at detecting bad stuff:
$ g++ -std=c++11 -o double-free double-free.cpp
$ g++ -std=c++11 -o heap-overflow heap-overflow.cpp
$ ./double-free
free(): double free detected in tcache 2
[1] 1194835 IOT instruction (core dumped) ./double-free
$ ./heap-overflow
munmap_chunk(): invalid pointer
[1] 1194847 IOT instruction (core dumped) ./heap-overflow
$ jemalloc.sh ./double-free
$ echo $?
0
$ jemalloc.sh ./heap-overflow
$ echo $?
0
double-free.cpp
int main() {
    auto x = new int[1];
    delete[] x;
    delete[] x; // deliberate double free
}
heap-overflow.cpp
#include <vector>

int main() {
    std::vector<int> x(1), y(1);

    // Deliberately writes far past the end of both one-element vectors;
    // operator[] does no bounds checking.
    for (int i = 0; i < 8192; i++) {
        x[i] = 42;
        y[i] = 42;
    }
}
I opt for just closing this one. We should invest the time it takes to test and verify if and how this affects performance into actually improving the code.
I'm on 2.13.3:
icinga2 - The Icinga 2 network monitoring daemon (version: r2.13.3-1)
Copyright (c) 2012-2022 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <https://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
System information:
Platform: CentOS Linux
Platform version: 7 (Core)
Kernel: Linux
Kernel version: 3.10.0-1127.8.2.el7.x86_64
Architecture: x86_64
Build information:
Compiler: GNU 4.8.5
Build host: runner-hh8q3bz2-project-322-concurrent-0
OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
Application information:
General paths:
Config directory: /etc/icinga2
Data directory: /var/lib/icinga2
Log directory: /var/log/icinga2
Cache directory: /var/cache/icinga2
Spool directory: /var/spool/icinga2
Run directory: /run/icinga2
Old paths (deprecated):
Installation root: /usr
Sysconf directory: /etc
Run directory (base): /run
Local state directory: /var
Internal paths:
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
Fairly sizeable deployment here, suffering from memory leaks, which led me to this PR.
Not sure if this is helpful, but I added LD_PRELOAD=/usr/lib64/libjemalloc.so.1 to /etc/sysconfig/icinga2.
Pre config change test for icinga2 daemon -C:
real 1m16.896s
user 2m34.925s
sys 0m16.074s

Post config change test for icinga2 daemon -C:
real 1m5.757s
user 2m35.747s
sys 0m18.760s

Should have some solid memory graphs in the morning.
Added "fixes #8737" to the OP.