Steadily increasing memory usage on version 1.2.5
We see a slow but steady increase in memory usage on UPGs running version 1.2.5, roughly 315 MB over 38 days (about 8 MB per day):
2022-04-11T22:00Z: 1.164 GB
2022-05-20T00:00Z: 1.479 GB
The number of sessions did not change significantly during that increase.
Output of show memory map of one of the affected instances:
StartAddr size FD PageSz Pages Numa0 Numa1 NotMap Name
00007fa7ad0d1000 1m 4K 256 2 0 254 main heap
00007fa6ed0d0000 3g 4K 786432 11430 405812 369177 main heap
00007fa6e9862000 2m 4K 512 2 1 509 thread stack: thread 0
00007fa6c9861000 512m 4 4K 131072 4 1600 0 stat segment
00007fa6c9858000 32k 4K 8 0 2 6 process stack: wg-timer-manager
00007fa6c984f000 32k 4K 8 0 2 6 process stack: vrrp-periodic-process
00007fa6c9846000 32k 4K 8 0 2 6 process stack: pfcp-server-process
00007fa6c9835000 64k 4K 16 0 3 13 process stack: pfcp-api
00007fa6c982c000 32k 4K 8 0 2 6 process stack: perfmon-periodic-process
00007fa6c9823000 32k 4K 8 0 2 6 process stack: nsh-md2-ioam-export-process
00007fa6c981a000 32k 4K 8 0 2 6 process stack: nat-ha-process
00007fa6c9811000 32k 4K 8 0 2 6 process stack: memif-process
00007fa6c9808000 32k 4K 8 0 2 6 process stack: lldp-process
00007fa6c97ff000 32k 4K 8 0 2 6 process stack: udp-ping-process
00007fa6c97f6000 32k 4K 8 0 2 6 process stack: vxlan-gpe-ioam-export-process
00007fa6c97ed000 32k 4K 8 0 2 6 process stack: ioam-export-process
00007fa6c97e4000 32k 4K 8 0 2 6 process stack: ikev2-manager-process
00007fa6c97db000 32k 4K 8 0 2 6 process stack: igmp-timer-process
00007fa6c97d2000 32k 4K 8 0 1 7 process stack: static-http-server-process
00007fa6c97c9000 32k 4K 8 0 1 7 process stack: http-server-process
00007fa6c97c0000 32k 4K 8 0 2 6 process stack: gbp-scanner
00007fa6c97b7000 32k 4K 8 0 2 6 process stack: flowprobe-timer-process
00007fa6c97ae000 32k 4K 8 0 2 6 process stack: send-dhcp6-pd-client-message-process
00007fa6c97a5000 32k 4K 8 0 2 6 process stack: dhcp6-pd-client-cp-process
00007fa6c979c000 32k 4K 8 0 2 6 process stack: dhcp6-client-cp-process
00007fa6c9793000 32k 4K 8 0 2 6 process stack: send-dhcp6-client-message-process
00007fa6c978a000 32k 4K 8 0 2 6 process stack: dhcp6-pd-reply-publisher-process
00007fa6c9781000 32k 4K 8 0 2 6 process stack: dhcp6-reply-publisher-process
00007fa6c9770000 64k 4K 16 0 2 14 process stack: dhcp-client-process
00007fa6c9767000 32k 4K 8 0 2 6 process stack: cnat-scanner-process
00007fa6c975e000 32k 4K 8 0 2 6 process stack: avf-process
00007fa6c9755000 32k 4K 8 0 2 6 process stack: acl-plugin-fa-cleaner-process
00007fa6c974c000 32k 4K 8 0 2 6 process stack: statseg-collector-process
00007fa6c970b000 256k 4K 64 0 3 61 process stack: api-rx-from-ring
00007fa6c9702000 32k 4K 8 0 2 6 process stack: rd-cp-process
00007fa6c96f9000 32k 4K 8 0 2 6 process stack: ip6-ra-process
00007fa6c96f0000 32k 4K 8 0 2 6 process stack: ip6-rs-process
00007fa6c96e7000 32k 4K 8 0 2 6 process stack: ip6-mld-process
00007fa6c96de000 32k 4K 8 0 2 6 process stack: fib-walk
00007fa6c96d5000 32k 4K 8 0 1 7 process stack: session-queue-process
00007fa6c96cc000 32k 4K 8 0 2 6 process stack: virtio-send-interrupt-process
00007fa6c96c3000 32k 4K 8 0 2 6 process stack: vhost-user-process
00007fa6c96ba000 32k 4K 8 0 2 6 process stack: vhost-user-send-interrupt-process
00007fa6c96b1000 32k 4K 8 0 2 6 process stack: flow-report-process
00007fa6c96a8000 32k 4K 8 0 2 6 process stack: bfd-process
00007fa6c969f000 32k 4K 8 0 2 6 process stack: ip-neighbor-event
00007fa6c9696000 32k 4K 8 0 2 6 process stack: ip6-neighbor-age-process
00007fa6c968d000 32k 4K 8 0 3 5 process stack: ip4-neighbor-age-process
00007fa6c9684000 32k 4K 8 0 2 6 process stack: ip6-sv-reassembly-expire-walk
00007fa6c967b000 32k 4K 8 0 2 6 process stack: ip6-full-reassembly-expire-walk
00007fa6c9672000 32k 4K 8 0 2 6 process stack: ip4-sv-reassembly-expire-walk
00007fa6c9669000 32k 4K 8 0 2 6 process stack: ip4-full-reassembly-expire-walk
00007fa6c9660000 32k 4K 8 0 2 6 process stack: bond-process
00007fa6c9657000 32k 4K 8 0 2 6 process stack: l2fib-mac-age-scanner-process
00007fa6c964e000 32k 4K 8 0 2 6 process stack: l2-arp-term-publisher
00007fa6c9645000 32k 4K 8 0 2 6 process stack: vpe-link-state-process
00007fa6c9604000 256k 4K 64 0 3 61 process stack: startup-config-process
00007fa6c95c3000 256k 4K 64 0 2 62 process stack: unix-cli-stdin
0000002000001000 64m 10 4K 16384 0 17 0 session: evt-qs-segment
00007fa6c75c1000 32.00m 4K 8193 0 4 8189 tls segment
00007fa6a75bf000 512.00m 4K 131073 0 4 131069 upf-proxy-server segment
00007fa69f5bd000 128.00m 4K 32769 0 4 32765 upf-proxy-active-open segment
00007fa69b799000 1.00m 4K 257 0 86 171 upf-pfcp-server segment
00007fa69b744000 256k 4K 64 0 3 61 process stack: unix-cli-127.0.0.1:38520
00007fa69b733000 64k 4K 16 0 2 14 process stack: unix-cli-new-session
00007fa69b6f2000 256k 4K 64 0 2 62 process stack: unix-cli-127.0.0.1:39170
Output of show memory main-heap on the same instance:
Thread 0 vpp_main
base 0x7fa6ed0d0000, size 3g, locked, unmap-on-destroy, name 'main heap'
page stats: page-size 4K, total 786432, mapped 417243, not-mapped 369176, unknown 13
numa 0: 11430 pages, 44.65m bytes
numa 1: 405813 pages, 1.55g bytes
total: 2.99G, used: 1.46G, free: 1.54G, trimmable: 1.48G
I suggest that we run show memory main-heap again a couple of days later and see whether the reported heap usage is increasing. If it isn't, the growth can be explained by internal heap usage patterns and will likely not cause any problems (VPP doesn't release the pages allocated for its heap when parts of the heap are freed internally, but that memory is reused).
If heap usage keeps growing, we might have to resort to memory tracing.
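For reference, a memory-tracing session could look roughly like the sketch below. This is an assumption about the CLI syntax (it may differ between VPP versions), so please verify it against our build before relying on it:
# enable allocation tracing on the main heap (assumed syntax, check the CLI help first)
vppctl memory-trace on main-heap
# ...let the instance run while the usage grows...
# dump heap stats together with the traced allocation call sites
vppctl show memory main-heap verbose
# turn tracing off again once the dump has been captured
vppctl memory-trace off main-heap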
@ivan4th side question: is main-heap usage available as a metric exposed via vpp/upg-agent?
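In the meantime, a crude way to track it would be to poll the heap summary from the CLI. This is only a sketch, assuming shell access to vppctl on the node; the log path and interval are made up for illustration and this is not part of upg-agent:
# append a timestamped main-heap summary line once a minute (hypothetical helper)
while true; do
  echo "$(date -u +%FT%TZ) $(vppctl show memory main-heap | grep 'total:')"
  sleep 60
done >> /tmp/upg-main-heap.log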
@ivan4th new show memory main-heap:
Thread 0 vpp_main
base 0x7fa6ed0d0000, size 3g, locked, unmap-on-destroy, name 'main heap'
page stats: page-size 4K, total 786432, mapped 426532, not-mapped 359887, unknown 13
numa 0: 11430 pages, 44.65m bytes
numa 1: 415102 pages, 1.58g bytes
total: 2.99G, used: 1.53G, free: 1.47G, trimmable: 1.44G
Since upg-vpp:1.9.0 has been released and is running stably, I am closing this issue.