
High Memory Utilization at compute nodes of a cluster

Open anilkumarnaik opened this issue 2 years ago • 4 comments

Description of problem:

GlusterFS is used on an 8-server cluster. All of the nodes show high memory utilization even though there is not much load on the servers. htop and the free command show glusterfs processes and almost 398 GB of memory in use. Is this a bug in glusterfs, or a configuration issue?

[root@node4 ~]# ps -ef | grep glusterfs
root 5275 1 0 Jan28 ? 00:02:08 /usr/sbin/glusterfsd -s node4 --volfile-id bak_dtb8.node4.exp6-gluster-back-dtb8 -p /var/run/gluster/vols/bak_dtb8/node4-exp6-gluster-back-dtb8.pid -S /var/run/gluster/5e93ec3752d772b8.socket --brick-name /exp6/gluster/back/dtb8 -l /var/log/glusterfs/bricks/exp6-gluster-back-dtb8.log --xlator-option *-posix.glusterd-uuid=a3268882-5c48-4c8e-ab41-a3187b22c3ff --process-name brick --brick-port 49152 --xlator-option bak_dtb8-server.listen-port=49152
root 5298 1 0 Jan28 ? 00:02:37 /usr/sbin/glusterfsd -s node4 --volfile-id dpr6_test.node4.exp6-gluster-home-dpr6_test -p /var/run/gluster/vols/dpr6_test/node4-exp6-gluster-home-dpr6_test.pid -S /var/run/gluster/3746764c78d1757e.socket --brick-name /exp6/gluster/home/dpr6_test -l /var/log/glusterfs/bricks/exp6-gluster-home-dpr6_test.log --xlator-option *-posix.glusterd-uuid=a3268882-5c48-4c8e-ab41-a3187b22c3ff --process-name brick --brick-port 49153 --xlator-option dpr6_test-server.listen-port=49153
root 5316 1 0 Jan28 ? 01:49:13 /usr/sbin/glusterfsd -s node4 --volfile-id gback.node4.data6-gback -p /var/run/gluster/vols/gback/node4-data6-gback.pid -S /var/run/gluster/23c898aa8b077858.socket --brick-name /data6/gback -l /var/log/glusterfs/bricks/data6-gback.log --xlator-option *-posix.glusterd-uuid=a3268882-5c48-4c8e-ab41-a3187b22c3ff --process-name brick --brick-port 49154 --xlator-option gback-server.listen-port=49154
root 5473 1 0 Jan28 ? 00:27:12 /usr/sbin/glusterfsd -s node4 --volfile-id ghome.node4.exp6-ghome -p /var/run/gluster/vols/ghome/node4-exp6-ghome.pid -S /var/run/gluster/7aed45550597470a.socket --brick-name /exp6/ghome -l /var/log/glusterfs/bricks/exp6-ghome.log --xlator-option *-posix.glusterd-uuid=a3268882-5c48-4c8e-ab41-a3187b22c3ff --process-name brick --brick-port 49155 --xlator-option ghome-server.listen-port=49155
root 5532 1 2 Jan28 ? 13:12:40 /usr/sbin/glusterfsd -s node4 --volfile-id home_dpr6.node4.exp6-gluster-home-dpr6 -p /var/run/gluster/vols/home_dpr6/node4-exp6-gluster-home-dpr6.pid -S /var/run/gluster/2d2faddfc9352b7f.socket --brick-name /exp6/gluster/home/dpr6 -l /var/log/glusterfs/bricks/exp6-gluster-home-dpr6.log --xlator-option *-posix.glusterd-uuid=a3268882-5c48-4c8e-ab41-a3187b22c3ff --process-name brick --brick-port 49156 --xlator-option home_dpr6-server.listen-port=49156
root 5554 1 0 Jan28 ? 02:04:44 /usr/sbin/glusterfsd -s node4 --volfile-id opt_dpr6.node4.exp6-gluster-opt-dpr6 -p /var/run/gluster/vols/opt_dpr6/node4-exp6-gluster-opt-dpr6.pid -S /var/run/gluster/258119f938d3a007.socket --brick-name /exp6/gluster/opt/dpr6 -l /var/log/glusterfs/bricks/exp6-gluster-opt-dpr6.log --xlator-option *-posix.glusterd-uuid=a3268882-5c48-4c8e-ab41-a3187b22c3ff --process-name brick --brick-port 49157 --xlator-option opt_dpr6-server.listen-port=49157
root 5599 1 0 Jan28 ? 00:03:33 /usr/sbin/glusterfs -s localhost --volfile-id shd/dpr6_test -p /var/run/gluster/shd/dpr6_test/dpr6_test-shd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/4a2cfda28c49d65d.socket --xlator-option replicate.node-uuid=a3268882-5c48-4c8e-ab41-a3187b22c3ff --process-name glustershd --client-pid=-6
root 17566 55086 0 16:03 pts/0 00:00:00 grep --color=auto glusterfs
root 55193 1 0 15:55 ? 00:00:00 /usr/sbin/glusterfs --acl --negative-timeout=60 --process-name fuse --volfile-server=localhost --volfile-id=/ghome /home_vendor
root 55315 1 0 15:55 ? 00:00:00 /usr/sbin/glusterfs --acl --negative-timeout=60 --process-name fuse --volfile-server=localhost --volfile-id=/opt_dpr6 /gopt
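For reference, ps -ef does not show memory usage; a resident-set-size (RSS) listing such as the following (generic commands, not taken from this report) helps identify which process actually holds the memory:

# Top memory consumers by RSS (KiB), largest first
ps -eo pid,rss,vsz,comm --sort=-rss | head -n 15

# RSS of a specific glusterfs/glusterfsd process (replace <PID>)
grep VmRSS /proc/<PID>/status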

MemTotal:       528313124 kB
MemFree:        179278824 kB
MemAvailable:   180184636 kB
Buffers:        3392 kB
Cached:         870740 kB
SwapCached:     0 kB
Active:         1107796 kB
Inactive:       728460 kB
Active(anon):   963376 kB
Inactive(anon): 33560 kB
Active(file):   144420 kB
Inactive(file): 694900 kB
Unevictable:    448 kB
Mlocked:        448 kB
SwapTotal:      67107836 kB
SwapFree:       67107836 kB
Dirty:          0 kB
Writeback:      0 kB
AnonPages:      962760 kB
Mapped:         54812 kB
Shmem:          34344 kB
Slab:           167624436 kB
SReclaimable:   2503044 kB
SUnreclaim:     165121392 kB
KernelStack:    26992 kB
PageTables:     12116 kB
NFS_Unstable:   0 kB
Bounce:         0 kB
WritebackTmp:   0 kB
CommitLimit:    331264396 kB
Committed_AS:   5694024 kB
VmallocTotal:   34359738367 kB
VmallocUsed:    11735768 kB
VmallocChunk:   34347233724 kB
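Worth noting: in the meminfo above, most of the consumed memory is reported under Slab/SUnreclaim (roughly 165 GB of unreclaimable kernel slab), not under process anonymous pages. A per-cache slab breakdown (a generic check, run as root) can show which kernel caches hold it:

# Slab caches sorted by cache size, printed once
slabtop -o -s c | head -n 20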

The exact command to reproduce the issue:

The full output of the command that failed:

Expected results:

Mandatory info:
- The output of the gluster volume info command:

Volume Name: bak_dtb8
Type: Distribute
Status: Started
Snapshot Count: 0
Number of Bricks: 8
Transport-type: tcp
Bricks:
Brick1: master1:/exp1/gluster/back/dtb8
Brick2: master2:/exp2/gluster/back/dtb8
Brick3: node1:/exp3/gluster/back/dtb8
Brick4: node2:/exp4/gluster/back/dtb8
Brick5: node3:/exp5/gluster/back/dtb8
Brick6: node4:/exp6/gluster/back/dtb8
Brick7: node5:/exp7/gluster/back/dtb8
Brick8: node6:/exp8/gluster/back/dtb8
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
storage.fips-mode-rchecksum: on

- The output of the gluster volume status command:

Status of volume: bak_dtb8
Gluster process                            TCP Port  RDMA Port  Online  Pid
Brick master1:/exp1/gluster/back/dtb8      49152     0          Y       4136
Brick master2:/exp2/gluster/back/dtb8      49152     0          Y       4266
Brick node1:/exp3/gluster/back/dtb8        49152     0          Y       4605
Brick node2:/exp4/gluster/back/dtb8        49152     0          Y       5313
Brick node3:/exp5/gluster/back/dtb8        49152     0          Y       5318
Brick node4:/exp6/gluster/back/dtb8        49152     0          Y       5275
Brick node5:/exp7/gluster/back/dtb8        49152     0          Y       6796
Brick node6:/exp8/gluster/back/dtb8        49152     0          Y       7034

Task Status of Volume bak_dtb8

There are no active volume tasks

Additional info:

- The operating system / glusterfs version: CentOS 7.9

Note: Please hide any confidential data which you don't want to share in public, like IP addresses, file names, hostnames, or any other configuration.

anilkumarnaik avatar Feb 16 '23 10:02 anilkumarnaik

Which process is consuming this memory? Could you provide a statedump of the glusterfs process that is consuming it?

pranithk avatar Feb 20 '23 06:02 pranithk

https://docs.gluster.org/en/main/Troubleshooting/statedump/ has steps to collect the data.
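For quick reference, the usual statedump commands are sketched below; the names in angle brackets are placeholders, and exact behaviour depends on the GlusterFS version, so follow the linked documentation for details:

# Directory where statedump files are written (typically /var/run/gluster)
gluster --print-statedumpdir

# Trigger a statedump of all brick processes of a volume
gluster volume statedump <VOLNAME>

# Trigger a statedump of a fuse client (glusterfs mount process)
kill -SIGUSR1 <PID-of-glusterfs-fuse-process>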

pranithk avatar Feb 20 '23 06:02 pranithk

Thank you for your contributions. We noticed that this issue has not had any activity in the last ~6 months. We are marking this issue as stale because it has not had recent activity. It will be closed in 2 weeks if no one responds with a comment here.

stale[bot] avatar Oct 15 '23 13:10 stale[bot]

I have the same problem, and the statedump shows roughly 43.74 GB attributed to a single allocation type:

[mount/fuse.fuse - usage-type gf_common_mt_inode_ctx memusage]
size=46967794112
num_allocs=112903389
max_size=46967794464
max_num_allocs=112903390
total_allocs=232308088

Any suggestions?
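Not an official answer, but since the usage type above is gf_common_mt_inode_ctx on the fuse mount, a commonly suggested mitigation is to cap the inode LRU tables; option names and availability depend on the GlusterFS version, so verify against your release's documentation before applying:

# Fuse client: limit the inode LRU cache at mount time
# ('lru-limit' mount option, available in recent GlusterFS releases)
mount -t glusterfs -o lru-limit=65536 <server>:/<volume> /mnt/<volume>

# Server side: cap the brick processes' inode LRU table
gluster volume set <volume> network.inode-lru-limit 65536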

harleyxu-xhl avatar Jan 26 '24 03:01 harleyxu-xhl