envoy Configurable heap memory release rate

Expose config option for tcmalloc [memory background release rate](https://github.com/google/tcmalloc/blob/bf4db7e4c85b142ecdd37f8bf388075131c387e9/tcmalloc/malloc_extension.h#L637C15-L637C39, that eases tuning of tcmalloc in Envoy. Gperf tcmalloc is not yet supported in this change, as gperf tcmalloc memory release does not function the same way as tcmalloc does and introduced test flakiness.

Commit Message: Additional Description: Risk Level: Testing: Unit tests Docs Changes: API docs Release Notes: Platform Specific Features:

Feb 26 '24 22:02 nezdolik

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/). envoyproxy/api-shepherds assignee is @mattklein123 CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

:cat:

Caused by: https://github.com/envoyproxy/envoy/pull/32589 was opened by nezdolik.

see: more, trace.

Feb 26 '24 22:02 repokitteh-read-only[bot]

@mattklein123 would you mind taking another pass for api check?

Mar 06 '24 22:03 nezdolik

/wait

for CI

Mar 26 '24 13:03 KBaichoo

/wait

for CI

Apr 02 '24 05:04 soulxu

Once/if this patch lands the ci in this PR will be fixed after merging main.

Apr 05 '24 11:04 nezdolik

/wait

Apr 05 '24 11:04 nezdolik

@all above issue is really critical and blocking us in production cases , can you quickly approve and provide fix asap? thanks

Apr 08 '24 05:04 vaibhavpatil123

+1 to proceeding with this, it is very useful.

Apr 12 '24 20:04 briansonnenberg

@KBaichoo since switching to gperf tcmalloc for coverage builds turned out to be non feasible due to current code structure, i took time to investigate the original test flakiness in integration tests for gper tcmalloc. Ported the problematic test to gperf tcmalloc unit test suit and debugged the execution to get better understanding of how memory release algorithm works in gperf. It operates on a span level granularity and for each memory release call (please release N bytes, which is translated by gperf to K tcmalloc pages, N <=K) gperf tcmalloc will attempt to find suitable spans (the ones that are free and have at least K pages in total) in free list and use it. It also tracks the extra amount of bytes released during previous release call (which is ~(spans size - requested bytes)) and if this amount exceeds the currently requested amount, no memory release will occur. Hence in integration tests for gperf tcmalloc we should not expect memory to be released every time the timer is fired for memory releasing, we should instead make sure that memory has been released at least once and that the amount of the memory released is >= the requested amount. There is another aspect of "flakiness" in gperf memory releasing, e.g. when span was attempted to be released via syscall (e.g. madvise()), gperf tcmalloc will not always bump the stats for unmapped bytes, under certain conditions it will instead increase the page heap free bytes. So in test we should verify that either unmapped bytes have increased or pageheap free bytes have increased. Regular (google) tcmalloc on the other hand is more predictable/stable in terms of tests outputs and is easier to test. I have fixed the test for original patch (that supported both tcmalloc and gperf tcmalloc) and ran it locally with large enough number of iterations to make sure the flakiness is gone. The fixed tests looks like this:

TEST_F(MemoryReleaseTest, ReleaseRateAboveZeroCustomIntervalMemoryReleased) {
  size_t initial_allocated_bytes = Stats::totalCurrentlyAllocated();
  auto a = std::make_unique<uint32_t[]>(40 * MB);
  auto b = std::make_unique<uint32_t[]>(40 * MB);
  if (Stats::totalCurrentlyAllocated() <= initial_allocated_bytes) {
    GTEST_SKIP() << "Skipping test, cannot measure memory usage precisely on this platform.";
  }
  #if defined(GPERFTOOLS_TCMALLOC)
  size_t initial_free_bytes = Stats::totalPageHeapFree();
  #endif
  auto initial_unmapped_bytes = Stats::totalPageHeapUnmapped();
  EXPECT_LOG_CONTAINS(
      "info",
      "Configured tcmalloc with background release rate: 16777216 bytes per 2000 milliseconds",
      initialiseAllocatorManager(16 * MB /*bytes per second*/, 2));
  EXPECT_EQ(16 * MB, AllocatorManagerPeer::bytesToRelease(*allocator_manager_));
  EXPECT_EQ(std::chrono::milliseconds(2000),
            AllocatorManagerPeer::memoryReleaseInterval(*allocator_manager_));
  a.reset();
  step(std::chrono::milliseconds(2000));
  Stats::dumpStatsToLog();
  b.reset();
  step(std::chrono::milliseconds(4000));
  Stats::dumpStatsToLog();
#if defined(TCMALLOC)
  TestUtility::waitForCounterGe(stats_, "memory_release_test.tcmalloc.released_by_timer", 2UL,
                                time_system_);
  auto final_released_bytes = Stats::totalPageHeapUnmapped();
  EXPECT_LT(initial_unmapped_bytes, final_released_bytes);
#elif defined(GPERFTOOLS_TCMALLOC)
  // Gperf tcmalloc releases memory on a span granularity. If requested amount of memory to release is less than or equal
  // to the extra amount of memory released during previous release request, no memory will be released for the current request.
  TestUtility::waitForCounterGe(stats_, "memory_release_test.tcmalloc.released_by_timer", 0UL,
                                time_system_);
  auto final_released_bytes = Stats::totalPageHeapUnmapped();
  auto final_free_bytes = Stats::totalPageHeapFree();
  EXPECT_TRUE((initial_unmapped_bytes < final_released_bytes) || (initial_free_bytes < final_free_bytes));
#else
  EXPECT_LE(initial_unmapped_bytes, final_released_bytes);
#endif

}

If we decide to support memory releasing for both tcmallocs in Envoy, I would like to be explicit in docs that gperf tcmalloc algorithm is sophisticated and users should not necessarily expect memory to be released each timer fire period. E.g. to avoid the situation when we will get lots of user questions where we need to explain why gperf tcmalloc has acted the specific way. Am ready to resubmit the original patch with fixed test and user docs.

May 04 '24 19:05 nezdolik

After discussing offline with @KBaichoo we agreed to proceed with only supporting Google tcmalloc for now due to multiple reasons:

Google tcmalloc is more commonly used compared to gperf tcmalloc and better maintained
There is no known prod use case for gperf tcmalloc in Envoy
Complexity of gperf allocation algorithm, e.g. hard to write deterministic tests for memory release and lack of good project documentation and support, e.g. https://github.com/gperftools/gperftools/issues/1382

May 08 '24 13:05 nezdolik

/retest

May 09 '24 21:05 nezdolik

/retest

May 09 '24 21:05 nezdolik

There is no known prod use case for gperf tcmalloc in Envoy

All Istio derivatives used gperftools until very recently actually - but I flipped it to tcmalloc for all new versions.

May 09 '24 21:05 kyessenov

CC @envoyproxy/coverage-shephards: FYI only for changes made to (test/per_file_coverage.sh). envoyproxy/coverage-shephards assignee is @alyssawilk

:cat:

Caused by: https://github.com/envoyproxy/envoy/pull/32589 was synchronize by nezdolik.

see: more, trace.

May 09 '24 21:05 repokitteh-read-only[bot]

@KBaichoo @alyssawilk this is ready for another review.

May 10 '24 15:05 nezdolik

/retest

May 20 '24 20:05 nezdolik

@alyssawilk yeap, I will update to throwing an error instead of crashing.

May 24 '24 21:05 nezdolik

/retest

May 28 '24 13:05 alyssawilk

@alyssawilk the test failure looks legit due to updated log message, i will update it soonish today (after all the meetings end)

May 28 '24 13:05 nezdolik

Which envoy release will this feature be merged? I found that from the latest v1.30 release the changes were reverted:

Jul 07 '24 21:07 anvo1115

@anvo1115 memory releasing for tcmalloc allocator (not gperf) is supported in 1.30, please take a look: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/bootstrap/v3/bootstrap.proto.html#envoy-v3-api-msg-config-bootstrap-v3-memoryallocatormanager

Jul 08 '24 09:07 nezdolik