Unable to get heap profile
Hi, I've been trying to get Envoy heap profiling to work, with no luck so far. I've tested the following:
- Using the admin endpoint /heap_dump. I tried this with three binaries: our in-house build of Envoy, the envoyproxy/envoy:v1.33-latest container, and istio/proxyv2:1.25.1-debug. In each case the dump is only ~2KB and contains little information:
root@2dc295247198:/# ls envoy.heap -l
-rw-r--r-- 1 root root 2011 Mar 27 13:28 envoy.heap
root@2dc295247198:/# curl http://localhost:9901/memory
{
"allocated": "32911672",
"heap_size": "54525952",
"pageheap_unmapped": "0",
"pageheap_free": "4161536",
"total_thread_cache": "15860608",
"total_physical_bytes": "60297694"
}
root@2dc295247198:/# curl http://localhost:9901/heap_dump -o envoy.heap
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2094    0  2094    0     0   2384      0 --:--:-- --:--:-- --:--:--  2382
root@2dc295247198:/# ls -l envoy.heap
-rw-r--r-- 1 root root 2094 Mar 27 13:29 envoy.heap
root@2dc295247198:/# go tool pprof /usr/local/bin/envoy envoy.heap
File: envoy
Type: space
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) text
Showing nodes accounting for 20.94MB, 100% of 20.94MB total
flat flat% sum% cum cum%
20.94MB 100% 100% 20.94MB 100% [envoy]
0 0% 100% 20.94MB 100% [libc.so.6]
(pprof)
- Using gperftools and tcmalloc. I tried building our in-house version with gperftools tcmalloc, following https://github.com/envoyproxy/envoy/blob/main/bazel/PPROF.md#collecting-the-profile . The build command was as follows (I tried several combinations of build options from https://github.com/envoyproxy/envoy/pull/21160):
CC=clang CXX=clang++ /usr/local/bin/bazel build -c dbg --copt=-g --strip=never --linkopt=-Wl,--no-rosegment --extra_toolchains=@local_jdk//:all --cxxopt -D_GLIBCXX_USE_CXX11_ABI=1 --cxxopt -DENVOY_IGNORE_GLIBCXX_USE_CXX11_ABI_ERROR=1 --define tcmalloc=gperftools envoy
I launched Envoy with the gperftools environment variables:
HEAPPROFILE=/tmp/envoy.heap HEAPPROFILESIGNAL=12 envoy-static -c ~/envoy-min.yaml --concurrency 2 2>&1
I was able to trigger dumps this way, but pprof seems to have trouble locating symbols:
coder [ ~ ]$ env | grep 'PPROF_BINARY_PATH'
PPROF_BINARY_PATH=/home/coder/envoy-build/.bazel_envoy_cache/coder/da311d67ca475f55784fc7b1dd8a320c/execroot/envoy/bazel-out/k8-dbg/bin/source/exe/
coder [ ~ ]$ ls -l /tmp/envoy.heap.0057.heap
-rw-rw-r-- 1 coder coder 1048564 Mar 27 01:40 /tmp/envoy.heap.0057.heap
coder [ ~ ]$ go tool pprof -nodefraction=0 -nodecount=99999 /home/coder/envoy-build/.bazel_envoy_cache/coder/da311d67ca475f55784fc7b1dd8a320c/execroot/envoy/bazel-out/k8-dbg/bin/source/exe/envoy-static /tmp/envoy.heap.0057.heap
Some binary filenames not available. Symbolization may be incomplete.
Try setting PPROF_BINARY_PATH to the search path for local binaries.
File: envoy-static
Type: inuse_space
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) text
0 0% 100% 0.70kB 0.0063% DeleteHook
0 0% 100% 0.87kB 0.0078% DoAllocWithArena
0 0% 100% 13.44kB 0.12% MaybeDumpProfileLocked
0 0% 100% 1.11kB 0.01% NewHook
0 0% 100% 0.31kB 0.0028% RecordAlloc
0 0% 100% 4.54kB 0.041% __copy_move_a2
0 0% 100% 4kB 0.036% __equal_aux1
0 0% 100% 1.28kB 0.012% __memcmp
0 0% 100% 1.40kB 0.013% allocate_full_cpp_throw_oom
0 0% 100% 1.27kB 0.011% capture
0 0% 100% 4.90kB 0.044% copy
0 0% 100% 12kB 0.11% epoll_dispatch
0 0% 100% 1.50kB 0.014% epoll_init
0 0% 100% 23.75kB 0.21% event_add
0 0% 100% 24.75kB 0.22% event_add_nolock_
0 0% 100% 4255.92kB 38.35% event_base_loop
0 0% 100% 5.86kB 0.053% event_base_new
0 0% 100% 5.86kB 0.053% event_base_new_with_config
0 0% 100% 4337.35kB 39.09% event_persist_closure
0 0% 100% 4290.53kB 38.66% event_process_active
0 0% 100% 4299.90kB 38.75% event_process_active_single_queue
0 0% 100% 24.50kB 0.22% evmap_io_add_
0 0% 100% 9.25kB 0.083% evmap_make_space
0 0% 100% 0.25kB 0.0023% evmap_signal_add_
0 0% 100% 1kB 0.009% evthread_make_base_notifiable
0 0% 100% 1kB 0.009% evthread_make_base_notifiable_nolock_
0 0% 100% 2.19kB 0.02% invoke_hooks_and_free
(pprof)
This is slightly better than the first attempt, but important functions in Envoy's HTTP stack are still not shown. It looks like a symbol issue, because only libevent functions are resolved correctly.
In these tests I'm running ab in the background to generate load, with the following minimal Envoy config:
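The gperftools workflow above can be scripted so that each run answers the symbolization question directly. This is only a sketch: `latest_heap_dump` and `count_envoy_frames` are hypothetical helper names, and it assumes that dump files follow the `<HEAPPROFILE>.NNNN.heap` naming seen above and that symbolized Envoy frames appear with an `Envoy::` prefix in `pprof -text` output.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of Envoy or gperftools.

# Print the most recently modified dump for a HEAPPROFILE prefix,
# e.g. /tmp/envoy.heap -> /tmp/envoy.heap.0057.heap.
# Prints nothing if no dump exists for that prefix.
latest_heap_dump() {
    ls -t "$1".*.heap 2>/dev/null | head -n 1
}

# Read `pprof -text` output on stdin and print the number of lines
# that mention an Envoy:: symbol (0 if there are none).
count_envoy_frames() {
    grep -c 'Envoy::' || true
}

# Example use (PID and binary path are placeholders):
#   kill -12 <envoy-pid>    # matches HEAPPROFILESIGNAL=12 from the launch command
#   go tool pprof -text /path/to/envoy-static "$(latest_heap_dump /tmp/envoy.heap)" | count_envoy_frames
```

A count of 0 on a process that is serving traffic would be consistent with the symbol problem described here, while a non-zero count suggests that Envoy frames are being resolved.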
admin:
  access_log_path: /dev/null
  address:
    socket_address:
      address: 127.0.0.1
      port_value: 9901
static_resources: {}
I also tried adding a listener and an upstream to exercise more code paths; the heap profile results are the same:
cat /tmp/envoy-more.yaml
domains: ["*"]
routes:
- match:
    path: "/config_dump"
  route:
    cluster: admin_cluster
http_filters:
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
http_protocol_options:
  accept_http_10: true
clusters:
- name: admin_cluster
  connect_timeout: 0.25s
  type: LOGICAL_DNS
  dns_lookup_family: V4_ONLY
  load_assignment:
    cluster_name: admin_cluster
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: 127.0.0.1
              port_value: 9901
Let me know if I'm missing something; we'd like to enable heap profiling in production as well.
I will try to take a look this weekend.
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
Bump.
@wbpcode @gu0keno0 we've been wrestling with the same problem and haven't found a solution. Any updates?
Hi, I got some free time this weekend and took a look. Everything seems to work fine. I rebuilt Envoy in my local dev container and bootstrapped it with a simple demo YAML.
Are you sure your binary contains symbols, i.e. that it is the unstripped version?
# pprof ./bazel-bin/source/exe/envoy-static /tmp/envoy.heap
File: envoy-static
Build ID: e0e8c299558362ca0a9869b89f689f9f53952461
Type: space
Time: 2025-06-15 14:48:15 UTC
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) text
Showing nodes accounting for 6.10MB, 100% of 6.10MB total
Showing top 10 nodes out of 31
flat flat% sum% cum cum%
6.10MB 100% 100% 6.10MB 100% std::__1::basic_string::__append_default_init[abi:ne180100]
0 0% 100% 1.97MB 32.31% Envoy::Config::TypedFactory::configTypes
0 0% 100% 6.10MB 100% Envoy::MainCommon::MainCommon
0 0% 100% 6.10MB 100% Envoy::MainCommon::main
0 0% 100% 6.10MB 100% Envoy::MainCommonBase::MainCommonBase
0 0% 100% 4.13MB 67.69% Envoy::ProcessWide::ProcessWide
0 0% 100% 1.97MB 32.31% Envoy::Registry::FactoryRegistry::buildFactoriesByType
0 0% 100% 1.97MB 32.31% Envoy::Registry::FactoryRegistry::registeredTypes
0 0% 100% 1.97MB 32.31% Envoy::Registry::FactoryRegistryProxyImpl::registeredTypes
0 0% 100% 1.97MB 32.31% Envoy::Server::InstanceBase::initialize
(pprof)
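One way to answer the question about symbols is to inspect the binary that is passed to pprof. The sketch below classifies the one-line description printed by the `file` utility. `is_unstripped` is a hypothetical helper name, and the sketch assumes that `file` describes ELF binaries that still carry a symbol table with the phrase `not stripped`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if a `file` description says the
# binary still has its symbols, and fails (exit 1) otherwise.
is_unstripped() {
    case "$1" in
        *"not stripped"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use (the path is an assumption; adjust it to your build output):
#   is_unstripped "$(file -L ./bazel-bin/source/exe/envoy-static)" && echo "symbols present"
```

If the check fails for the binary given to pprof, that would point to a stripped binary as a likely cause of the missing function names.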
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.