
iobuf arenas: remove arenas that are smaller than 128K - they just take up memory

Open mykaul opened this issue 3 years ago • 13 comments

Following https://github.com/gluster/glusterfs/commit/491e9ec38bbfedf3caed52535b718efdc2dd6117, which changed the way we use arenas (we now use them only for requests larger than 128K), we should remove the smaller arenas.

  1. We probably need some more arenas between 128K and the next one, 256K (see the sketch after this list).
  2. I'm unsure whether just 2 arenas of 1MB make sense.
  3. There's a significant hit to the debug build after this change, due to memory invalidation - a quick hack was to use the smaller arenas, and that won't be feasible after this change...
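
For illustration, a hypothetical trimmed configuration with a couple of intermediate sizes between 128K and 256K could look like the sketch below. The struct mirrors the { pagesize, num_pages } layout of gf_iobuf_init_config shown in the patch below; the specific sizes and page counts are assumptions, not a tested proposal.

/* Standalone sketch, not the actual glusterfs table: drop everything below
 * 128K and add a few intermediate sizes between 128K and 256K. */
#include <stddef.h>

struct arena_config {   /* mirrors iobuf_init_config: { pagesize, num_pages } */
    size_t pagesize;
    int num_pages;
};

static const struct arena_config trimmed_config[] = {
    {128 * 1024, 64},      /* default size for callers not using _get2() */
    {160 * 1024, 32},      /* hypothetical intermediate sizes ...        */
    {192 * 1024, 16},      /* ... between 128K and 256K (point 1 above)  */
    {256 * 1024, 8},
    {1 * 1024 * 1024, 2},
};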

mykaul avatar Oct 13 '21 12:10 mykaul

Quick fix:

diff --git a/libglusterfs/src/iobuf.c b/libglusterfs/src/iobuf.c
index 445d6cc8c4..06174bfe3e 100644
--- a/libglusterfs/src/iobuf.c
+++ b/libglusterfs/src/iobuf.c
@@ -23,8 +23,10 @@
 /* Make sure this array is sorted based on pagesize */
 static const struct iobuf_init_config gf_iobuf_init_config[] = {
     /* { pagesize, num_pages }, */
-    {128, 1024},     {512, 512},       {2 * 1024, 512}, {8 * 1024, 128},
-    {32 * 1024, 64}, {128 * 1024, 32}, {256 * 1024, 8}, {1 * 1024 * 1024, 2},
+    {128 * 1024, 64},
+    {133 * 1024, 64},
+    {256 * 1024, 8},
+    {1 * 1024 * 1024, 2},
 };
 
 static int
@@ -560,7 +562,7 @@ iobuf_get2(struct iobuf_pool *iobuf_pool, size_t page_size)
        page size is less than equal to 128KB, the data is available
        on the link https://github.com/gluster/glusterfs/issues/2771
     */
-    if (page_size <= USE_IOBUF_POOL_IF_SIZE_GREATER_THAN) {
+    if (page_size < USE_IOBUF_POOL_IF_SIZE_GREATER_THAN) {
         iobuf = iobuf_get_from_small(page_size);
         if (!iobuf)
             gf_smsg(THIS->name, GF_LOG_WARNING, 0, LG_MSG_IOBUF_NOT_FOUND,
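
For clarity, here is a standalone illustration of what the second hunk changes: with the original <=, a request of exactly 128K takes the iobuf_get_from_small() path, while with the patched < it is served from the pre-allocated pool. The 128K threshold value is assumed from the comment in the hunk; the code below is illustrative, not glusterfs code.

/* Standalone illustration of the boundary change in iobuf_get2(); the
 * threshold value is assumed to be 128K. */
#include <stdio.h>
#include <stddef.h>

#define USE_IOBUF_POOL_IF_SIZE_GREATER_THAN (128 * 1024) /* assumed value */

static const char *path_after_patch(size_t page_size)
{
    if (page_size < USE_IOBUF_POOL_IF_SIZE_GREATER_THAN)
        return "standard allocation (iobuf_get_from_small)";
    return "pre-allocated arena pool";
}

int main(void)
{
    printf("127K -> %s\n", path_after_patch(127 * 1024));
    printf("128K -> %s\n", path_after_patch(128 * 1024)); /* changed by the patch */
    printf("256K -> %s\n", path_after_patch(256 * 1024));
    return 0;
}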

mykaul avatar Oct 26 '21 07:10 mykaul

Why do we want to ignore standard allocation for 128K?

mohit84 avatar Oct 26 '21 08:10 mohit84

128K happens to be the default size for all callers that do not use _get2(), and is therefore a quite frequent allocation request.

  • The impact (with debug enabled) due to the memory invalidation is fairly large.
  • It is also unclear to me whether tcmalloc has a real advantage at 128K over statically pre-allocated 128K iobufs. Notice that I have pre-allocated 64 of them, and added another 64 at 133K, which fits 128K + 4K + 140 bytes - also a popular request size here and there. 64 is of course not just a magic number: at some point I'd like to try converting those lists to a 64-bit bitmap array (see the sketch after this list), and I've seen on my laptop (NVMe) that it's a reasonable number.
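
As a rough illustration of the bitmap idea in the last bullet, here is a minimal standalone sketch that tracks 64 pre-allocated slots in a single 64-bit word. The helper names (arena_bitmap_get/arena_bitmap_put) are hypothetical, not glusterfs APIs, and a real version would still need the pool lock or atomics around the updates.

/* Standalone sketch: track 64 pre-allocated iobuf slots with one 64-bit
 * bitmap instead of a linked list. Bit i set = slot i is in use. */
#include <stdint.h>
#include <stdio.h>

/* Returns a free slot index in [0, 63], or -1 if the arena is exhausted. */
static int arena_bitmap_get(uint64_t *bitmap)
{
    if (*bitmap == UINT64_MAX)
        return -1;                        /* no free slot left */
    int slot = __builtin_ctzll(~*bitmap); /* lowest clear bit = first free slot */
    *bitmap |= (1ULL << slot);            /* mark it in use */
    return slot;
}

/* Marks a previously acquired slot as free again. */
static void arena_bitmap_put(uint64_t *bitmap, int slot)
{
    *bitmap &= ~(1ULL << slot);
}

int main(void)
{
    uint64_t bitmap = 0;                  /* all 64 slots free */
    int a = arena_bitmap_get(&bitmap);
    int b = arena_bitmap_get(&bitmap);
    printf("got slots %d and %d\n", a, b);
    arena_bitmap_put(&bitmap, a);
    printf("after freeing %d, next get returns %d\n", a, arena_bitmap_get(&bitmap));
    return 0;
}

Finding a free slot is then a single __builtin_ctzll() (a GCC/Clang builtin) on the inverted word instead of walking a list.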

mykaul avatar Oct 26 '21 08:10 mykaul

I think we need to test it. AFAIK the current iobuf approach is not perfect since we are using a single list; maybe after switching to a bitmap array the performance will not be hurt.

mohit84 avatar Oct 26 '21 11:10 mohit84

I have the bitmap code somewhere; I can resurrect it once you are done with your outstanding PRs. I'm also trying to think of a clever way to use a bitmap with fewer than 64 slots (for the 1MB arena - or perhaps we don't need that 1MB anyway?) without keeping a counter. I could probably initialize it with fewer slots available or something.
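
One possible way to get the "fewer slots, no counter" behavior described above is to pre-set the bits of the slots that don't exist, so the usual all-ones test still means "full". A small standalone sketch (hypothetical code, not glusterfs):

/* Standalone sketch: a 1MB arena with only N (< 64) slots, still tracked in
 * one 64-bit bitmap and without a separate counter. Bits for slots that do
 * not exist are pre-set to 1, so "bitmap == UINT64_MAX" still means full. */
#include <stdint.h>
#include <stdio.h>

static uint64_t arena_bitmap_init(unsigned nslots) /* nslots in 1..64 */
{
    if (nslots >= 64)
        return 0;                     /* all 64 slots available */
    return ~((1ULL << nslots) - 1);   /* high bits permanently "in use" */
}

int main(void)
{
    uint64_t bitmap = arena_bitmap_init(2);  /* e.g. only 2 x 1MB buffers */
    printf("initial bitmap: 0x%016llx\n", (unsigned long long)bitmap);
    /* After two allocations the word is all ones, so the same "is it full?"
     * check used for 64-slot arenas works unchanged - no counter needed. */
    return 0;
}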

mykaul avatar Oct 26 '21 11:10 mykaul

Ack !!

mohit84 avatar Oct 26 '21 11:10 mohit84

Thank you for your contributions. We noticed that this issue has not had any activity in the last ~6 months. We are marking this issue as stale; it will be closed in 2 weeks if no one responds with a comment here.

stale[bot] avatar Jun 04 '22 01:06 stale[bot]

Not sure how to unstale it, but I think it's still an easy one that I'd like to tackle.

mykaul avatar Jun 09 '22 13:06 mykaul

IIRC, earlier we were getting a crash in the SSL API.

mohit84 avatar Jun 13 '22 01:06 mohit84

I thought we had fixed that by moving to tcmalloc minimal?

  • It can't hurt to remove those iobuf sizes that are <= 128K, since I assume they are not utilized.
  • I'd like to see if we can remove the "standard allocation" arena and just call the regular allocator (see the sketch after this list).
  • I personally think the only useful arena is 136K - 128K + 4K + some extra. That will come later; I'll try to split this into multiple patches and see where we are at.
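
A minimal sketch of the direction in the second bullet - small requests bypass the arenas entirely and go straight to the regular allocator - is shown below. get_buffer()/pool_get() and the 128K cut-off are illustrative assumptions, not the glusterfs API.

/* Standalone sketch: requests at or below the threshold skip the iobuf
 * arenas and use the regular allocator; only larger requests come from a
 * pre-allocated pool. Names and threshold are illustrative. */
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

#define ARENA_THRESHOLD (128 * 1024)   /* assumed 128K cut-off */

/* Stand-in for the pre-allocated arena path; it just malloc()s here so the
 * sketch compiles on its own. */
static void *pool_get(size_t size) { return malloc(size); }

static void *get_buffer(size_t size)
{
    if (size <= ARENA_THRESHOLD)
        return malloc(size);           /* regular allocator, no iobuf arena */
    return pool_get(size);             /* arena pool only for big buffers */
}

int main(void)
{
    void *small = get_buffer(4 * 1024);
    void *big = get_buffer(256 * 1024);
    printf("small=%p big=%p\n", small, big);
    free(small);
    free(big);
    return 0;
}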

mykaul avatar Jun 13 '22 16:06 mykaul

Thank you for your contributions. We noticed that this issue has not had any activity in the last ~6 months. We are marking this issue as stale; it will be closed in 2 weeks if no one responds with a comment here.

stale[bot] avatar Jan 17 '23 04:01 stale[bot]

I still think this is memory that goes completely unused. I'm unsure why my PR is not getting through - there's a crash in glfs that I can't seem to reproduce.

mykaul avatar Jan 17 '23 06:01 mykaul

Thank you for your contributions. We noticed that this issue has not had any activity in the last ~6 months. We are marking this issue as stale; it will be closed in 2 weeks if no one responds with a comment here.

stale[bot] avatar Sep 17 '23 06:09 stale[bot]