Dom0 memory gets full with higher number of VMs due to netdata
Host details - XCP-ng 8.1
VM details - Alpine minimal
SR details - iSCSI SR
Activity - Start and stop a VM 1500 times
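The start/stop activity described above could be reproduced with a loop like the following. This is a minimal sketch, not the script actually used for this report: the `cycle_vm` function name, the `XE` override variable, and the use of `force=true` on shutdown are assumptions made for illustration. `xe vm-start` and `xe vm-shutdown` are the standard XAPI CLI calls.

```shell
#!/bin/sh
# Sketch: start and stop one VM repeatedly to reproduce the Dom0 memory growth.
# XE defaults to the real `xe` CLI; override it (e.g. XE=echo) for a dry run.
XE="${XE:-xe}"

# cycle_vm UUID COUNT: start the VM, then shut it down, COUNT times.
# Assumption: a forced shutdown is acceptable for a throwaway test VM.
cycle_vm() {
    vm_uuid="$1"
    count="$2"
    i=1
    while [ "$i" -le "$count" ]; do
        "$XE" vm-start uuid="$vm_uuid" || return 1
        "$XE" vm-shutdown uuid="$vm_uuid" force=true || return 1
        i=$((i + 1))
    done
}

# Example, with the UUID of the test VM filled in by the reader:
#   cycle_vm <vm-uuid> 1500
```

Running it first with `XE=echo` prints the commands instead of executing them, which is a cheap way to check the loop before pointing it at a real host.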
# free -h
              total        used        free      shared  buff/cache   available
Mem:           1.6G        1.4G         14M        440K        204M         97M
Swap:          1.0G        315M        708M
top - 06:29:05 up 17:53, 1 user, load average: 0.28, 1.02, 1.51
Tasks: 180 total, 1 running, 111 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.9 us, 5.4 sy, 0.0 ni, 92.0 id, 0.2 wa, 0.0 hi, 0.2 si, 0.2 st
KiB Mem : 1671900 total, 16180 free, 1447236 used, 208484 buff/cache
KiB Swap: 1048572 total, 724524 free, 324048 used. 100044 avail Mem
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
25927 netdata   20   0   61872   6520   3448 S   4.9  0.4   6:51.73 /usr/libexec/netdata/plugins.d/apps.plugin 1
10072 root      20   0  162016   4520   3764 R   3.9  0.3   0:00.39 top -c -d1
 1551 netdata   20   0 1505256   1.0g   5660 S   2.9 64.1  28:40.70 /usr/sbin/netdata -P /var/run/netdata/netdata.pid -D -W set global process scheduling po+
 1419 root      20   0  118020   9420   3584 S   1.9  0.6   2:23.24 /opt/xensource/libexec/xcp-rrdd-plugins/xcp-rrdd-iostat
  398 root      20   0   30884   2364   2232 S   1.0  0.1   9:42.04 /usr/lib/systemd/systemd-journald
# xl list
Name                                        ID   Mem VCPUs  State   Time(s)
Domain-0                                     0  1840     4  r-----  65247.9
SVE Alpine minimal for exports            1414   512     1  -b----     18.1
If the netdata service is stopped, Dom0 memory is not exhausted.
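To confirm that the growth comes from the netdata process, its resident memory could be sampled while the VMs are being cycled. The following is a minimal sketch under stated assumptions: the `rss_kib` and `sample_rss` helper names are hypothetical, and it reads `VmRSS` from `/proc/<pid>/status`, so it assumes a Linux Dom0.

```shell
#!/bin/sh
# Sketch: print the resident set size (RSS, in KiB) of a process at intervals,
# to see whether it keeps growing while VMs are started and stopped.

# rss_kib PID: print the VmRSS value (KiB) from /proc/PID/status.
rss_kib() {
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

# sample_rss PID SAMPLES INTERVAL: print "epoch_seconds rss_kib" SAMPLES times,
# sleeping INTERVAL seconds between samples.
sample_rss() {
    pid="$1"
    samples="$2"
    interval="$3"
    n=0
    while [ "$n" -lt "$samples" ]; do
        printf '%s %s\n' "$(date +%s)" "$(rss_kib "$pid")"
        n=$((n + 1))
        if [ "$n" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}

# Example, using the pid file path shown in the top output:
#   sample_rss "$(cat /var/run/netdata/netdata.pid)" 60 60
```

A steadily rising second column across the run would point at netdata itself rather than at the kernel.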
But Netdata isn't enabled by default on hosts, so I don't think this covers the initial issue (the kernel memory leak), right? If so, it's a completely different issue.
edit: in this case, we might try to use a more recent version of Netdata.
Yes @olivierlambert, it's something I found while looking for possible causes of the kernel memory leak. I'll check whether a recent Netdata version solves this issue.