
Memory usage too high

Open wuwo1952368901 opened this issue 2 years ago • 11 comments

Skrouterd memory usage exceeds 50%. Total memory is 32G.

wuwo1952368901 avatar Apr 17 '23 09:04 wuwo1952368901

Can you describe your setup? E.g. how many sites and services?

grs avatar Apr 17 '23 09:04 grs

skupper service status

Services exposed through Skupper:
├─ service-cluster1 (tcp port 30400)
├─ service-cluster2 (tcp port 30400)
│  ╰─ Targets:
│     ╰─ app=service-cluster2 name=service-cluster2
╰─ service-cluster3 (tcp port 30400)

wuwo1952368901 avatar Apr 17 '23 09:04 wuwo1952368901

Can you describe your setup? E.g. how many sites and services?

Is that what you want to know?

skupper service status

Services exposed through Skupper:
├─ service-cluster1 (tcp port 30400)
├─ service-cluster2 (tcp port 30400)
│  ╰─ Targets:
│     ╰─ app=service-cluster2 name=service-cluster2
╰─ service-cluster3 (tcp port 30400)

wuwo1952368901 avatar Apr 19 '23 03:04 wuwo1952368901

Yes, thank you. Can you also please run kubectl exec -it <name-of-router-pod> -- skstat -m?

grs avatar Apr 19 '23 07:04 grs

Yes, thank you. Can you also please run kubectl exec -it <name-of-router-pod> -- skstat -m?

Memory Pools
  type                        size   batch  thread-max  total       in-threads  rebal-in     rebal-out
  ========================================================================================================
  endpoint_ref_t              56     64     128         64          64          0            0
  qd_bitmask_t                24     64     128         1,152       1,024       481,902      481,904
  qd_buffer_t                 536    64     128         4,736       2,304       41,562,434   41,562,472
  qd_composed_field_t         48     64     128         384         384         0            0
  qd_composite_t              112    64     128         384         384         0            0
  qd_connection_t             2,504  16     32          128         96          253          255
  qd_connector_t              456    64     128         64          64          0            0
  qd_deferred_call_t          32     64     128         384         256         2,855        2,857
  qd_delivery_state_t         48     64     128         576         320         3,378,028    3,378,032
  qd_hash_handle_t            16     64     128         448         448         0            0
  qd_hash_item_t              40     64     128         448         448         0            0
  qd_iterator_t               128    64     128         1,984       1,920       353,213      353,214
  qd_link_ref_t               24     64     128         1,536       1,472       1,019        1,020
  qd_link_t                   136    64     128         1,600       1,600       8            8
  qd_listener_t               352    64     128         64          64          0            0
  qd_log_entry_t              2,112  16     32          1,072       1,072       0            0
  qd_management_context_t     56     64     128         256         256         0            0
  qd_message_content_t        1,080  64     128         1,024       832         25,894       25,897
  qd_message_t                88     64     128         1,728       1,600       712,559      712,561
  qd_node_t                   56     64     128         64          64          0            0
  qd_parse_node_t             104    64     128         128         128         0            0
  qd_parse_tree_t             32     64     128         64          64          0            0
  qd_parsed_field_t           136    64     128         2,560       2,432       352,255      352,257
  qd_session_t                48     64     128         320         320         0            0
  qd_timer_t                  112    64     128         64          64          0            0
  qdr_action_t                136    64     128         896         448         363,995,698  363,995,705
  qdr_addr_endpoint_state_t   48     64     128         64          64          0            0
  qdr_address_config_t        72     64     128         64          64          0            0
  qdr_address_t               384    64     128         256         256         0            0
  qdr_address_watch_t         72     64     128         64          64          0            0
  qdr_connection_info_t       144    64     128         256         256         0            0
  qdr_connection_ref_t        24     64     128         64          64          0            0
  qdr_connection_t            560    64     128         256         256         0            0
  qdr_connection_work_t       56     64     128         512         384         32           34
  qdr_core_timer_t            40     64     128         64          64          0            0
  qdr_delivery_cleanup_t      32     64     128         704         448         1,451,976    1,451,980
  qdr_delivery_ref_t          24     64     128         704         320         116,276,387  116,276,393
  qdr_delivery_t              328    64     128         1,536       1,536       906,962      906,962
  qdr_error_t                 24     64     128         64          64          0            0
  qdr_field_t                 32     64     128         448         320         288,885      288,887
  qdr_forward_deliver_info_t  32     64     128         64          64          0            0
  qdr_general_work_t          160    64     128         576         320         1,743,411    1,743,415
  qdr_link_ref_t              24     64     128         2,304       1,920       237,999,002  237,999,008
  qdr_link_t                  504    64     128         1,728       1,728       11           11
  qdr_link_work_t             48     64     128         1,152       1,088       1,436,527    1,436,528
  qdr_node_t                  88     64     128         64          64          0            0
  qdr_query_t                 344    64     128         320         320         0            0
  qdr_terminus_t              64     64     128         384         384         4            4
  qdrc_endpoint_t             24     64     128         64          64          0            0
  qdtm_router_t               16     64     128         192         192         0            0
  vflow_attribute_data_t      32     64     128         68,977,536  68,977,536  0            0
  vflow_record_t              160    64     128         13,795,584  13,795,584  0            0
  vflow_work_t                56     64     128         448         320         1,132,343    1,132,345

Memory Summary
  VmSize    Pooled
  ====================
  18.0 GiB  4.12 GiB

wuwo1952368901 avatar Apr 19 '23 08:04 wuwo1952368901

@kgiusti @ganeshmurthy any thoughts on this?

grs avatar Apr 19 '23 08:04 grs

There is a known vflow leak which has been fixed by @ted-ross - https://github.com/skupperproject/skupper-router/commit/11746972cbbdabe8fa1c813d8d2c29983777d2d8

Looking at the output of skstat -m, it looks like the same leak fixed by the above commit.
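
To see why the output points at vflow, it can help to estimate each pool's footprint as size × total. The sketch below is only illustrative: `pool_bytes` is a hypothetical helper, not part of skstat, and it assumes that each pool occupies roughly size × total bytes. Fed three rows copied from the table above, it puts the two vflow pools at about 2.06 GiB each, which together account for nearly all of the 4.12 GiB reported as Pooled.

```python
def pool_bytes(skstat_text):
    """Estimate bytes held by each pool in a `skstat -m` Memory Pools table.

    Assumption: a pool's footprint is roughly size * total.
    """
    usage = {}
    for line in skstat_text.splitlines():
        fields = line.split()
        # Data rows have 8 columns:
        # type, size, batch, thread-max, total, in-threads, rebal-in, rebal-out
        if len(fields) != 8 or not fields[0].endswith("_t"):
            continue  # skips the header, the separator and blank lines
        size = int(fields[1].replace(",", ""))
        total = int(fields[4].replace(",", ""))
        usage[fields[0]] = size * total
    return usage


# Three rows copied from the first skstat -m output in this thread.
SAMPLE = """\
  type                        size   batch  thread-max  total       in-threads  rebal-in     rebal-out
  qd_buffer_t                 536    64     128         4,736       2,304       41,562,434   41,562,472
  vflow_attribute_data_t      32     64     128         68,977,536  68,977,536  0            0
  vflow_record_t              160    64     128         13,795,584  13,795,584  0            0
"""

usage = pool_bytes(SAMPLE)
# Print the pools from largest to smallest estimated footprint.
for name, nbytes in sorted(usage.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:28s} {nbytes / 2**30:6.2f} GiB")
```

Running the same estimate on a later snapshot shows whether the vflow pools are still growing.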

The commit is available in the 1.4.0-rc1 release of skupper (2.4.0-rc1 release of skupper-router).

This leak occurs when two routers are misconfigured with connectors in both directions, which is unnecessary. Keeping a single connector from one router to the other and removing the connector in the opposite direction should resolve the issue without applying the above fix.
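
As a rough illustration of the misconfiguration described here, the following sketch flags router pairs that are connected in both directions. Everything in it is hypothetical: the router names, the `(from, to)` tuple representation and the `bidirectional_pairs` helper are assumptions made for the example, and how the connector list is obtained from a real deployment is left out.

```python
def bidirectional_pairs(connectors):
    """Return router pairs that have a connector in each direction."""
    links = set(connectors)
    return {
        tuple(sorted((a, b)))
        for a, b in links
        if a != b and (b, a) in links
    }


# Hypothetical topology: (router that owns the connector, router it connects to).
connectors = [
    ("cluster1", "cluster2"),
    ("cluster2", "cluster1"),  # redundant: cluster1 already connects to cluster2
    ("cluster1", "cluster3"),
]

redundant = bidirectional_pairs(connectors)
print(redundant)
```

Any pair reported this way needs only one of its two connectors.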

ganeshmurthy avatar Apr 19 '23 12:04 ganeshmurthy

Memory not released https://github.com/skupperproject/skupper/issues/1470

wuwo1952368901 avatar May 20 '24 01:05 wuwo1952368901

skupper service status:

Services exposed through Skupper:
├─ cluster-1-0:port (tcp)
├─ cluster-1-1:port (tcp)
├─ cluster-1-2:port (tcp)
├─ cluster-1-3:port (tcp)
├─ cluster-1-4:port (tcp)
├─ cluster-1-5:port (tcp)
├─ cluster-1-6:port (tcp)
├─ cluster-1-7:port (tcp)
├─ cluster-2-0:port (tcp)
├─ cluster-2-1:port (tcp)
├─ cluster-2-2:port (tcp)
├─ cluster-2-3:port (tcp)
├─ cluster-2-4:port (tcp)
├─ cluster-2-5:port (tcp)
├─ cluster-2-6:port (tcp)
├─ cluster-2-7:port (tcp)
├─ cluster-3-0:port (tcp)
├─ cluster-3-1:port (tcp)
├─ cluster-3-2:port (tcp)
├─ cluster-3-3:port (tcp)
├─ cluster-3-4:port (tcp)
├─ cluster-3-5:port (tcp)
├─ cluster-3-6:port (tcp)
╰─ cluster-3-7:port (tcp)

kubectl exec -it -- skstat -m :

Defaulted container "router" out of: router, config-sync
2024-05-20 01:13:59.815665 UTC
skupper-router-c97944bdd-t2wsl

Memory Pools
  type                        size   batch  thread-max  total    in-threads  rebal-in     rebal-out
  =====================================================================================================
  endpoint_ref_t              56     64     128         64       64          0            0
  qd_bitmask_t                24     64     128         39,104   1,856       1,863,261    1,863,843
  qd_buffer_t                 4,096  16     32          43,280   3,520       123,641,351  123,643,836
  qd_composed_field_t         48     64     128         448      448         0            0
  qd_composite_t              112    64     128         448      448         0            0
  qd_connection_t             2,664  16     32          368      368         0            0
  qd_connector_t              752    64     128         64       64          0            0
  qd_deferred_call_t          32     64     128         384      256         95           97
  qd_delivery_state_t         48     64     128         6,464    320         6,986,645    6,986,741
  qd_hash_handle_t            16     64     128         576      576         0            0
  qd_hash_item_t              40     64     128         576      576         0            0
  qd_iterator_t               128    64     128         114,752  3,328       548,692      550,433
  qd_link_ref_t               24     64     128         76,608   2,496       2,188        3,346
  qd_link_t                   136    64     128         8,896    5,184       74           132
  qd_listener_t               360    64     128         64       64          0            0
  qd_log_entry_t              2,112  16     32          1,072    1,072       0            0
  qd_management_context_t     56     64     128         192      192         0            0
  qd_message_content_t        1,408  64     128         38,592   1,472       61,745       62,325
  qd_message_t                104    64     128         77,120   2,752       3,383,903    3,385,065
  qd_node_t                   56     64     128         64       64          0            0
  qd_parse_node_t             104    64     128         128      128         0            0
  qd_parse_tree_t             24     64     128         64       64          0            0
  qd_parsed_field_t           136    64     128         190,656  4,992       960,830      963,731
  qd_pn_free_link_session_t   32     64     128         256      256         0            0
  qd_session_t                48     64     128         640      640         0            0
  qd_timer_t                  112    64     128         128      128         0            0
  qdr_action_t                136    64     128         7,040    448         620,619,567  620,619,670
  qdr_addr_endpoint_state_t   48     64     128         64       64          0            0
  qdr_address_config_t        72     64     128         64       64          0            0
  qdr_address_t               392    64     128         512      512         0            0
  qdr_address_watch_t         72     64     128         64       64          0            0
  qdr_connection_info_t       144    64     128         512      512         0            0
  qdr_connection_ref_t        24     64     128         64       64          0            0
  qdr_connection_t            584    64     128         512      512         0            0
  qdr_connection_work_t       56     64     128         576      384         251          254
  qdr_core_timer_t            40     64     128         64       64          0            0
  qdr_delivery_cleanup_t      32     64     128         7,424    448         5,447,015    5,447,124
  qdr_delivery_ref_t          24     64     128         5,888    448         196,485,353  196,485,438
  qdr_delivery_t              400    64     128         77,056   2,496       2,124,613    2,125,778
  qdr_error_t                 24     64     128         64       64          0            0
  qdr_field_t                 32     64     128         448      320         152,627      152,629
  qdr_forward_deliver_info_t  32     64     128         64       64          0            0
  qdr_general_work_t          160    64     128         576      384         3,962,661    3,962,664
  qdr_link_ref_t              24     64     128         10,304   5,760       404,434,697  404,434,768
  qdr_link_t                  528    64     128         8,960    5,056       94           155
  qdr_link_work_t             48     64     128         4,224    1,536       5,015,716    5,015,758
  qdr_node_t                  88     64     128         64       64          0            0
  qdr_query_t                 344    64     128         256      256         0            0
  qdr_terminus_t              64     64     128         512      320         44           47
  qdrc_endpoint_t             24     64     128         128      128         0            0
  qdtm_router_t               16     64     128         64       64          0            0
  vflow_attribute_data_t      32     64     128         128      128         0            0
  vflow_record_t              160    64     128         384      384         0            0
  vflow_work_t                56     64     128         448      320         345,525      345,527

Memory Summary
  VmSize    RSS       Pooled
  =============================
  3.09 GiB  2.65 GiB  311 MiB

wuwo1952368901 avatar May 20 '24 01:05 wuwo1952368901

@grs @ganeshmurthy @fgiorgetti

wuwo1952368901 avatar May 23 '24 09:05 wuwo1952368901

Please see this comment - https://github.com/skupperproject/skupper/issues/1470#issuecomment-2135177583

ganeshmurthy avatar May 28 '24 13:05 ganeshmurthy