
Too much memory allocated in FRR version 7

Open netfreak98 opened this issue 6 years ago • 17 comments

Our FRR (stable v6) installation uses more and more memory over time, with an unchanged configuration, on Debian 9 Stretch:

bash-4.4# systemctl status frr
● frr.service - FRRouting
   Loaded: loaded (/etc/systemd/system/frr.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2019-02-07 12:09:34 CET; 1 months 3 days ago
     Docs: https://frrouting.readthedocs.io/en/latest/setup.html
    Tasks: 9 (limit: 11059)
   Memory: 13.8G
   CGroup: /system.slice/frr.service
           ├─44634 /usr/lib/frr/watchfrr -d zebra bgpd staticd
           ├─44650 /usr/lib/frr/zebra -d
           ├─44653 /usr/lib/frr/bgpd -d
           └─44660 /usr/lib/frr/staticd -d
vRouter# show memory 
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  2824 KiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  2201 KiB
  Free small blocks:     2464 bytes
  Free ordinary blocks:  623 KiB
  Ordinary blocks:       7
  Small blocks:          65
  Holding blocks:        0
(see system documentation for 'mallinfo' for meaning)
--- qmem libfrr ---
Buffer                        :          7      24                 168
Buffer data                   :          1    4120                4120
Host config                   :          2  (variably sized)        48
Command Tokens                :       3162      72              228096
Command Token Text            :       2362  (variably sized)     77344
Command Token Help            :       2362  (variably sized)     56976
Command Argument              :          2  (variably sized)        48
Command Argument Name         :        534  (variably sized)     12928
FRR POSIX Thread              :          6  (variably sized)       432
POSIX synchronization primitives:          6  (variably sized)       288
Graph                         :         24       8                 576
Graph Node                    :       3744      32              150128
Hash                          :        871  (variably sized)     42024
Hash Bucket                   :       1614      32               65856
Hash Index                    :        436  (variably sized)    378224
Hook entry                    :         12      48                 672
Interface                     :        112     248               27840
Connected                     :         95      40                4200
Link List                     :        386      40               15520
Link Node                     :        567      24               13608
Logging                       :          1      80                  88
Temporary memory              :        120  (variably sized)     42192
Nexthop                       :        198     112               23856
NetNS Context                 :          2  (variably sized)       128
NetNS Name                    :          1      18                  24
Priority queue                :          4      32                 160
Priority queue data           :          4     256                1056
Prefix                        :         97      48                5544
Stream                        :          7  (variably sized)    114968
Stream FIFO                   :          6      64                 432
Route table                   :        123      48                6888
Route node                    :        796  (variably sized)     84416
Thread                        :         24     176                4416
Thread master                 :         15  (variably sized)     66920
Thread Poll Info              :          8    8192               65600
Thread stats                  :         18      64                1312
Vector                        :       7543      16              181384
Vector index                  :       7543  (variably sized)    246920
VRF                           :          1     184                 184
VRF bit-map                   :          4       8                  96
VTY                           :          6  (variably sized)     19440
Work queue                    :          2  (variably sized)       224
Work queue name string        :          1      22                  24
--- qmem Label Manager ---
--- qmem zebra ---
ZEBRA VRF                     :          1     656                 664
Route Entry                   :        198      80               17632
RIB destination               :         12      48                 672
RIB table info                :          4      16                  96
Nexthop tracking object       :          4     200                 800
Zebra Name Space              :          1     312                 312
VNI hash                      :          7      48                 440
VNI remote VTEP               :         29      24                 696
VNI MAC                       :         24      40                 960
VNI Neighbor                  :         12      56                 672
--- qmem Table Manager ---

Memory statistics for bgpd:
System allocator statistics:
  Total heap allocated:  1844 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  1798 MiB
  Free small blocks:     1760 bytes
  Free ordinary blocks:  46 MiB
  Ordinary blocks:       64167
  Small blocks:          52
  Holding blocks:        0
(see system documentation for 'mallinfo' for meaning)
--- qmem libfrr ---
Buffer                        :          6      24                 144
Buffer data                   :          1    4120                4120
Host config                   :          2  (variably sized)        48
Command Tokens                :       9807      72              708088
Command Token Text            :       7240  (variably sized)    249712
Command Token Help            :       7240  (variably sized)    175184
Command Argument              :          2  (variably sized)        48
Command Argument Name         :       1530  (variably sized)     37056
FRR POSIX Thread              :          4  (variably sized)       288
POSIX synchronization primitives:          4  (variably sized)       192
Graph                         :         35       8                 840
Graph Node                    :      11657      32              469288
Hash                          :       4300  (variably sized)    206512
Hash Bucket                   :      12762      32              511120
Hash Index                    :       2151  (variably sized)   4394808
Hook entry                    :          2      48                 112
Interface                     :        112     248               27792
Connected                     :         95      40                3800
Link List                     :        279      40               11240
Link Node                     :       2310      24               55520
Logging                       :          1      80                  88
Temporary memory              :         11  (variably sized)       312
Nexthop                       :          4     112                 480
Priority queue                :          3      32                 120
Priority queue data           :          3     256                 792
Prefix                        :         97      48                5432
Ring buffer                   :         12  (variably sized)    246096
Skip List                     :          2      56                 112
Skip Node                     :          4     160                 672
Socket union                  :          8      28                 320
Stream                        :         21  (variably sized)    148344
Stream FIFO                   :         12      64                 896
Route table                   :       1904      48              106672
Thread                        :         30     176                5520
Thread master                 :         11  (variably sized)     50184
Thread Poll Info              :          6    8192               49200
Thread stats                  :         24      64                1744
Vector                        :      23391      16              562792
Vector index                  :      23391  (variably sized)    741624
VRF                           :          1     184                 184
VRF bit-map                   :          3       8                  72
VTY                           :          6  (variably sized)     19440
Work queue                    :          3     144                 456
Work queue name string        :          3  (variably sized)        72
Zclient                       :          2    2984                5968
Redistribution instance IDs   :          6       2                 144
--- qmem bgpd ---
BGP instance                  :          2  (variably sized)      4832
BGP listen socket details     :          2      48                 112
BGP peer                      :          8   21008              168128
BGP peer hostname             :         11  (variably sized)       264
Peer group                    :          1      64                  72
BGP Peer group hostname       :          1       7                  24
BGP peer af                   :          8      80                 704
BGP update group              :          2     104                 208
BGP update subgroup           :          2     240                 496
BGP packet                    :          2      56                 112
BGP attribute                 :       3499     232              812888
BGP aspath                    :          1      40                  40
BGP aspath str                :          1       1                  24
BGP table                     :       1891      40               75672
BGP node                      :       6621     160             1112328
BGP route                     :       9378     112             1127616
BGP ancillary route info      :       2558     216              552528
BGP connected                 :          4       4                  96
BGP synchronise               :        128      72                9296
BGP adj out                   :         18      72                1296
extcommunity                  :        559      32               22376
extcommunity val              :        559  (variably sized)     13416
extcommunity str              :        545  (variably sized)     74120
community-list handler        :          1      96                 104
Cluster list                  :          1      24                  24
Cluster list val              :          1       4                  24
BGP nexthop                   :          4      72                 288
BGP own address               :          2       8                  48
BGP own tunnel-ip address     :          1       8                  24
BGP EVPN Information          :          7      96                 728
BGP EVPN Import RT            :          7      16                 168
BGP PBR Context               :          1      16                  24
BGP Label FIFO                :          1      48                  56
--- qmem rfapi ---
NVE Configuration             :          1    2648                2648
RFAPI Generic                 :          1     296                 296
RFAPI Import Table            :          1     208                 216

Memory statistics for watchfrr:
System allocator statistics:
  Total heap allocated:  264 KiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  211 KiB
  Free small blocks:     1376 bytes
  Free ordinary blocks:  53 KiB
  Ordinary blocks:       2
  Small blocks:          40
  Holding blocks:        0
(see system documentation for 'mallinfo' for meaning)
--- qmem libfrr ---
Buffer                        :          2      24                  48
Buffer data                   :          1    4120                4120
Host config                   :          1       6                  24
Command Tokens                :        369      72               26632
Command Token Text            :        303  (variably sized)      9960
Command Token Help            :        303  (variably sized)      7336
Command Argument              :          2  (variably sized)        48
Command Argument Name         :         38  (variably sized)       912
Graph                         :          6       8                 144
Graph Node                    :        468      32               18768
Hash                          :         18  (variably sized)       864
Hash Bucket                   :        126      32                5056
Hash Index                    :          9  (variably sized)      3912
Hook entry                    :          2      48                 112
Link List                     :          5      40                 200
Link Node                     :         12      24                 288
Logging                       :          1      80                  88
Temporary memory              :          5  (variably sized)       120
Priority queue                :          1      32                  40
Priority queue data           :          1     256                 264
Thread                        :          9     176                1656
Thread master                 :          3  (variably sized)     16712
Thread Poll Info              :          2    8192               16400
Thread stats                  :         10      64                 736
Vector                        :        953      16               22936
Vector index                  :        953  (variably sized)     31032
VTY                           :          3  (variably sized)      9720
--- qmem watchfrr ---
watchfrr daemon entry         :          3     136                 408

Memory statistics for staticd:
System allocator statistics:
  Total heap allocated:  660 KiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  640 KiB
  Free small blocks:     1536 bytes
  Free ordinary blocks:  20 KiB
  Ordinary blocks:       1
  Small blocks:          43
  Holding blocks:        0
(see system documentation for 'mallinfo' for meaning)
--- qmem libfrr ---
Buffer                        :          5      24                 120
Buffer data                   :          1    4120                4120
Host config                   :          2  (variably sized)        48
Command Tokens                :       1137      72               82088
Command Token Text            :        896  (variably sized)     28544
Command Token Help            :        896  (variably sized)     21616
Command Argument              :          2  (variably sized)        48
Command Argument Name         :        211  (variably sized)      5224
Graph                         :         11       8                 264
Graph Node                    :       1339      32               53768
Hash                          :        202  (variably sized)      9776
Hash Bucket                   :        340      32               14048
Hash Index                    :        101  (variably sized)     38248
Hook entry                    :          2      48                 112
Interface                     :        112     248               27792
Connected                     :         95      40                3896
Link List                     :        233      40                9544
Link Node                     :        203      24                4888
Logging                       :          1      80                  88
Temporary memory              :          6  (variably sized)       384
Priority queue                :          1      32                  40
Priority queue data           :          1     256                 264
Prefix                        :         97      48                5528
Stream                        :          2   16416               32848
Route table                   :          4      48                 256
Thread                        :          6     176                1104
Thread master                 :          3  (variably sized)     16712
Thread Poll Info              :          2    8192               16400
Thread stats                  :          6      64                 448
Vector                        :       2705      16               65096
Vector index                  :       2705  (variably sized)     87480
VRF                           :          1     184                 184
VRF bit-map                   :          3       8                  72
VTY                           :          6  (variably sized)     19440
Zclient                       :          1    2984                2984
Redistribution instance IDs   :          3       2                  72
--- qmem staticd ---

And after daemon restart:

bash-4.4# systemctl status frr
● frr.service - FRRouting
   Loaded: loaded (/etc/systemd/system/frr.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-03-13 15:45:50 CET; 11s ago
     Docs: https://frrouting.readthedocs.io/en/latest/setup.html
  Process: 42145 ExecStop=/usr/lib/frr/frrinit.sh stop (code=exited, status=0/SUCCESS)
  Process: 42175 ExecStart=/usr/lib/frr/frrinit.sh start (code=exited, status=0/SUCCESS)
    Tasks: 9 (limit: 11059)
   Memory: 20.4M
   CGroup: /system.slice/frr.service
           ├─42186 /usr/lib/frr/watchfrr -d zebra bgpd staticd
           ├─42202 /usr/lib/frr/zebra -d
           ├─42205 /usr/lib/frr/bgpd -d
           └─42212 /usr/lib/frr/staticd -d
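
One way to read the `show memory` output above is to add up the per-type qmem rows and compare that sum with the `Total heap allocated` line and with the RSS of the process. The following is a minimal Python sketch, assuming the last column of each row is the total number of bytes for that type; the name `sum_qmem` and the regular expression are illustrative and not part of FRR.

```python
import re

# Matches qmem rows such as
#   "BGP node                      :       6621     160             1112328"
#   "Host config                   :          2  (variably sized)        48"
# Assumption: the last column is the total bytes held by that type.
QMEM_ROW = re.compile(
    r"^(?P<name>.+?)\s*:\s+(?P<count>\d+)\s+"
    r"(?P<size>\d+|\(variably sized\))\s+(?P<total>\d+)\s*$"
)


def sum_qmem(show_memory_text):
    """Return (per_type_totals, grand_total_bytes) for qmem rows in the text."""
    totals = {}
    for line in show_memory_text.splitlines():
        m = QMEM_ROW.match(line)
        if m:
            name = m.group("name")
            totals[name] = totals.get(name, 0) + int(m.group("total"))
    return totals, sum(totals.values())


# Three rows copied from the bgpd section of the output above.
sample = """\
Memory statistics for bgpd:
  Total heap allocated:  1844 MiB
--- qmem bgpd ---
BGP attribute                 :       3499     232              812888
BGP node                      :       6621     160             1112328
BGP route                     :       9378     112             1127616
"""

per_type, total = sum_qmem(sample)
print(total)
```

If the qmem sum is much smaller than the heap total or the RSS, the growth is probably happening outside FRR's own per-type accounting, although this sketch alone cannot say where.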

netfreak98 avatar Mar 13 '19 14:03 netfreak98

bash-4.4# ps -o rss= -p `pidof bgpd` | awk '{print $1/1024/1024, "GB"}'
13.8068 GB
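
The `ps` one-liner above relies on `rss` being reported in kilobytes, so dividing by 1024 twice gives GiB. The same figure can be read from the `VmRSS` line of `/proc/<pid>/status` on Linux. Below is a small Python sketch of that conversion; the function name `rss_gib_from_status` and the sample input are made up for illustration.

```python
def rss_gib_from_status(status_text):
    """Extract VmRSS (reported in kB) from /proc/<pid>/status text, in GiB."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            kib = int(line.split()[1])
            return kib / 1024 / 1024
    return None  # no VmRSS line found


# Illustrative input, not taken from the routers in this thread.
sample_status = "Name:\tbgpd\nVmRSS:\t 2097152 kB\n"
print(rss_gib_from_status(sample_status))
```

On a live system the text would come from something like `open(f"/proc/{pid}/status").read()`, with `pid` being the bgpd process ID.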

netfreak98 avatar Mar 13 '19 14:03 netfreak98

We are only using the BGP EVPN extension:

vRouter#  show bgp l2vpn evpn summary 
BGP router identifier 10.42.64.38, local AS number 65000 vrf-id 0
BGP table version 0
RIB entries 3581, using 560 KiB of memory

netfreak98 avatar Mar 13 '19 14:03 netfreak98

Looks like we have a memleak somewhere. Can you provide your (redacted) configurations?

qlyoung avatar Mar 14 '19 15:03 qlyoung

Hi,

here we go:

frr version 6.0.2
frr defaults traditional
hostname vRouter
log syslog informational
no ip forwarding
no ipv6 forwarding
service integrated-vtysh-config
!
router bgp 65000
 bgp router-id 10.42.64.38
 coalesce-time 1000
 neighbor fabric peer-group
 neighbor fabric remote-as 65000
 neighbor fabric capability extended-nexthop
 neighbor 10.42.255.251 peer-group fabric
 neighbor 10.42.255.252 peer-group fabric
 neighbor 10.42.255.253 peer-group fabric
 neighbor 10.42.255.254 peer-group fabric
 !
 address-family l2vpn evpn
  neighbor fabric activate
  advertise-all-vni
 exit-address-family
!
line vty
!

netfreak98 avatar Mar 14 '19 15:03 netfreak98

@louberger have you seen anything like this in your tests? Do you test with EVPN enabled?

qlyoung avatar Mar 19 '19 15:03 qlyoung

Are there any updates?

netfreak98 avatar Apr 08 '19 10:04 netfreak98

6.0 and 7.0 both look unchanged in the LabN CI, which only covers BGP and BGP with L3VPNs.

louberger avatar Apr 08 '19 11:04 louberger

Are there any updates? Is there a new version available which fixes this problem? We recently switched to FRR version 7 from your deb repo but the problem still persists.

Thanks in advance!

netfreak98 avatar May 02 '19 07:05 netfreak98

bash-4.4# service frr status
● frr.service - FRRouting
   Loaded: loaded (/lib/systemd/system/frr.service; enabled; vendor preset: enab
   Active: active (running) since Thu 2019-05-02 02:50:02 CEST; 6h ago
     Docs: https://frrouting.readthedocs.io/en/latest/setup.html
  Process: 40684 ExecStop=/usr/lib/frr/frrinit.sh stop (code=exited, status=0/SU
  Process: 40712 ExecStart=/usr/lib/frr/frrinit.sh start (code=exited, status=0/
    Tasks: 10 (limit: 11059)
   Memory: 284.4M

Total number of neighbors 4


L2VPN EVPN Summary:
BGP router identifier 10.42.64.36, local AS number 65000 vrf-id 0
BGP table version 0
RIB entries 4885, using 763 KiB of memory
Peers 4, using 83 KiB of memory
Peer groups 1, using 64 bytes of memory

netfreak98 avatar May 02 '19 07:05 netfreak98

Hi,

this problem still persists in version 7.1. Can you please check? I would rather not restart FRR every week. Only BGP for some peers, EVPN, and OSPF are used.

root@edge02:/home/jkoenig# systemctl status frr.service
● frr.service - FRRouting
   Loaded: loaded (/lib/systemd/system/frr.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-09-23 11:04:45 CEST; 6 days ago
     Docs: https://frrouting.readthedocs.io/en/latest/setup.html
    Tasks: 15 (limit: 9830)
   Memory: 20.5G

netfreak98 avatar Oct 16 '19 07:10 netfreak98

root@edge01:/home/jkoenig# ps -o rss= -p `pidof bgpd` | awk '{print $1/1024/1024, "GB"}'
12.2273 GB
root@edge02:/home/jkoenig# ps -o rss= -p `pidof bgpd` | awk '{print $1/1024/1024, "GB"}'
19.8952 GB

netfreak98 avatar Oct 16 '19 07:10 netfreak98

@netfreak98 If you have the ability to recompile, give this patch a shot: jemalloc.txt. Make sure to use ./configure --with-jemalloc to enable the change.

We used to have issues with FRR seemingly leaking memory that were solved by jemalloc. YMMV; I haven't tried without jemalloc in quite some time. We don't use Debian, so I can't really offer any advice on getting this compiled on Debian.
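
After a rebuild like the one suggested above, it can be worth checking that the running daemon actually loaded jemalloc. A rough Python sketch follows, assuming a Linux host and a dynamically linked jemalloc (a statically linked build would not show up this way). The function name `maps_mention_jemalloc` and the library path in the sample are hypothetical.

```python
def maps_mention_jemalloc(maps_text):
    """True if any mapped file path in /proc/<pid>/maps text contains 'jemalloc'."""
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) == 6 and "jemalloc" in fields[5]:
            return True
    return False


# Illustrative maps lines; the paths are assumptions, not output from this thread.
sample_maps = (
    "55d000000000-55d000001000 r-xp 00000000 08:01 100 /usr/lib/frr/bgpd\n"
    "7f0000000000-7f0000001000 r-xp 00000000 08:01 123 "
    "/usr/lib/x86_64-linux-gnu/libjemalloc.so.2\n"
    "7ffd00000000-7ffd00021000 rw-p 00000000 00:00 0 [stack]\n"
)
print(maps_mention_jemalloc(sample_maps))
```

On a real router the input would be the contents of `/proc/<pid>/maps` for the bgpd process.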

devicenull avatar Oct 28 '19 15:10 devicenull

@frrbot autoclose in 1 week.

ton31337 avatar Apr 21 '22 20:04 ton31337

This issue will be automatically closed in the specified period unless there is further activity.

frrbot[bot] avatar Apr 21 '22 20:04 frrbot[bot]

Lately, with FRR 8.2.2, I have been experiencing this; 8.1 was quite stable in memory usage.

aalmenar avatar May 10 '22 09:05 aalmenar

@aalmenar do you know how high the memory consumption was, and after how long a service runtime? Also, how large is the environment where this was running? Our setup has around 4000 servers offering VXLAN.

mrickl avatar Jul 06 '22 15:07 mrickl

Please share the configurations and the show memory output (at least before and after). Without the configuration, it's hard to guess where the issue is.
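
A before/after pair of show memory snapshots is easiest to compare per type. Here is a minimal Python sketch of such a comparison, assuming each snapshot covers a single daemon and uses the row layout shown earlier in this thread; the names `parse` and `growth` and the sample numbers are hypothetical.

```python
import re

# One qmem row: name, count, size (or "(variably sized)"), total bytes.
ROW = re.compile(r"^(.+?)\s*:\s+(\d+)\s+(?:\d+|\(variably sized\))\s+(\d+)\s*$")


def parse(text):
    """Map each qmem type name to (count, total_bytes) for one daemon."""
    out = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            out[m.group(1)] = (int(m.group(2)), int(m.group(3)))
    return out


def growth(before_text, after_text):
    """List (name, count_delta, bytes_delta), largest byte growth first."""
    before, after = parse(before_text), parse(after_text)
    rows = []
    for name, (count, total) in after.items():
        b_count, b_total = before.get(name, (0, 0))
        rows.append((name, count - b_count, total - b_total))
    return sorted(rows, key=lambda r: r[2], reverse=True)


# Hypothetical snapshots for illustration only.
before = "BGP route : 100 112 12000\nBGP node : 50 160 8400\n"
after = "BGP route : 900 112 108000\nBGP node : 50 160 8400\n"
for row in growth(before, after):
    print(row)
```

Types whose byte delta keeps growing between snapshots are the first candidates to look at; if no type grows while RSS does, the growth is not visible to this accounting.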

ton31337 avatar Aug 22 '22 12:08 ton31337