
memory leak from otlp http exporter

Open farzinlize opened this issue 3 months ago • 9 comments

Describe your environment
I'm using OPENTELEMETRY_VERSION "1.22.0".

Steps to reproduce
The problem is that when the application finishes, valgrind reports many memory leaks related to the OTLP HTTP exporter (examples below), and calling Shutdown on the TracerProvider does not change that. The main function is just a unit test that calls the initialize and terminate telemetry functions I wrote, like this:

initial_telemetry_ut();
// do some work
terminate_telemetry_ut(); // for cleanup but it fails to release allocated memories completely

Here is a summary of those two functions:

  • initial_telemetry_ut summary:
otlp_export::OtlpHttpExporterOptions options;
options.url = "http://localhost:4318/v1/traces";
auto exporter = opentelemetry::exporter::otlp::OtlpHttpExporterFactory::Create(options);

auto processor = opentelemetry::sdk::trace::BatchSpanProcessorFactory::Create(std::move(exporter), {});

auto provider = opentelemetry::sdk::trace::TracerProviderFactory::Create(
            std::move(processor), resources, std::move(the_sampler)
        );

opentelemetry::nostd::shared_ptr<opentelemetry::trace::TracerProvider> api_provider = std::shared_ptr<opentelemetry::trace::TracerProvider>(
        provider.release()
    );
g_provider = api_provider; // g_provider is a global variable of type opentelemetry::nostd::shared_ptr<opentelemetry::trace::TracerProvider>

opentelemetry::trace::Provider::SetTracerProvider(api_provider);
  • terminate_telemetry_ut summary:
std::shared_ptr<otl_trace::TracerProvider> none;
trace_api::Provider::SetTracerProvider(none);

g_provider = nullptr; // the same global variable from above

Finally, here are some examples of errors from the valgrind report:

...
==22678== 648 bytes in 9 blocks are still reachable in loss record 124 of 130
==22678==    at 0x4846FA3: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==22678==    by 0x4BC2DD2: google::protobuf::EncodedDescriptorDatabase::DescriptorIndex::AddSymbol(google::protobuf::stringpiece_internal::StringPiece) (in /usr/lib/x86_64-linux-gnu/libprotobuf.so.32.0.12)
==22678==    by 0x4BC43B1: google::protobuf::EncodedDescriptorDatabase::Add(void const*, int) (in /usr/lib/x86_64-linux-gnu/libprotobuf.so.32.0.12)
==22678==    by 0x4B66C76: google::protobuf::DescriptorPool::InternalAddGeneratedFile(void const*, int) (in /usr/lib/x86_64-linux-gnu/libprotobuf.so.32.0.12)
==22678==    by 0x4BDBB77: ??? (in /usr/lib/x86_64-linux-gnu/libprotobuf.so.32.0.12)
==22678==    by 0x4AE02AA: ??? (in /usr/lib/x86_64-linux-gnu/libprotobuf.so.32.0.12)
==22678==    by 0x400571E: call_init.part.0 (dl-init.c:74)
==22678==    by 0x4005823: call_init (dl-init.c:120)
==22678==    by 0x4005823: _dl_init (dl-init.c:121)
==22678==    by 0x401F59F: ??? (in /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
...
==22678== 22,000 bytes in 125 blocks are still reachable in loss record 129 of 130
==22678==    at 0x484D953: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==22678==    by 0x60E2823: asn1_array2tree (in /usr/lib/x86_64-linux-gnu/libtasn1.so.6.6.3)
==22678==    by 0x5472434: ??? (in /usr/lib/x86_64-linux-gnu/libgnutls.so.30.37.1)
==22678==    by 0x5438EDF: ??? (in /usr/lib/x86_64-linux-gnu/libgnutls.so.30.37.1)
==22678==    by 0x400571E: call_init.part.0 (dl-init.c:74)
==22678==    by 0x4005823: call_init (dl-init.c:120)
==22678==    by 0x4005823: _dl_init (dl-init.c:121)
==22678==    by 0x401F59F: ??? (in /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
==22678== 
==22678== 83,072 bytes in 472 blocks are still reachable in loss record 130 of 130
==22678==    at 0x484D953: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==22678==    by 0x60E2823: asn1_array2tree (in /usr/lib/x86_64-linux-gnu/libtasn1.so.6.6.3)
==22678==    by 0x5472339: ??? (in /usr/lib/x86_64-linux-gnu/libgnutls.so.30.37.1)
==22678==    by 0x5438EDF: ??? (in /usr/lib/x86_64-linux-gnu/libgnutls.so.30.37.1)
==22678==    by 0x400571E: call_init.part.0 (dl-init.c:74)
==22678==    by 0x4005823: call_init (dl-init.c:120)
==22678==    by 0x4005823: _dl_init (dl-init.c:121)
==22678==    by 0x401F59F: ??? (in /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
==22678== 
==22678== LEAK SUMMARY:
==22678==    definitely lost: 0 bytes in 0 blocks
==22678==    indirectly lost: 0 bytes in 0 blocks
==22678==      possibly lost: 0 bytes in 0 blocks
==22678==    still reachable: 120,567 bytes in 890 blocks
==22678==         suppressed: 0 bytes in 0 blocks
==22678== 
==22678== For lists of detected and suppressed errors, rerun with: -s
==22678== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

What is the expected behavior?
I expect to see no leaks in the valgrind results.


farzinlize avatar Sep 15 '25 10:09 farzinlize

I should mention that I tried casting the g_provider global variable to the SDK TracerProvider in order to call ForceFlush and Shutdown on the provider, and both calls returned true, which indicates the provider was shut down successfully. Yet the exact same problems and leaks are reported by valgrind, so I set that aside. Even the amount of leaked memory stays the same no matter what I do.
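For reference, the cast looked roughly like this (a sketch of the approach, not the exact code; sdk_provider is just an illustrative name):

#include <opentelemetry/sdk/trace/tracer_provider.h>

// g_provider holds the SDK-backed provider created in initial_telemetry_ut,
// so a downcast to the SDK type exposes ForceFlush()/Shutdown().
auto *sdk_provider =
    static_cast<opentelemetry::sdk::trace::TracerProvider *>(g_provider.get());
if (sdk_provider != nullptr)
{
  bool flushed   = sdk_provider->ForceFlush();  // returned true
  bool shut_down = sdk_provider->Shutdown();    // also returned true
}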

farzinlize avatar Sep 15 '25 10:09 farzinlize

@farzinlize if you are implementing the ThreadInstrumentation interface, call google::protobuf::ShutdownProtobufLibrary() in OnEnd(), or at the end of your main().

I had memory leaks from protobuf too and in my case that was enough.
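For example, a minimal sketch of the end-of-main() option, assuming the unit-test entry point from the issue description:

#include <google/protobuf/stubs/common.h>

int main()
{
  initial_telemetry_ut();
  // ... do some work ...
  terminate_telemetry_ut();  // shut down / release the TracerProvider first

  // Only after every protobuf user in the process is done:
  google::protobuf::ShutdownProtobufLibrary();
  return 0;
}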

gparlamas avatar Sep 15 '25 11:09 gparlamas

@gparlamas thanks for your input on the matter. Your suggestion did eliminate the errors originating from protobuf, but another group of memory leak errors from libgnutls remains.

farzinlize avatar Sep 16 '25 06:09 farzinlize

I should add that I didn't initialize those libraries myself, so I would expect opentelemetry-cpp to shut down its own dependencies. This also makes me suspect that I'm not terminating the opentelemetry library completely.

farzinlize avatar Sep 16 '25 06:09 farzinlize

Protobuf's symbol pool may be used by many other components, so we cannot shut it down directly. Could you please call google::protobuf::ShutdownProtobufLibrary() after shutting down otel-cpp and any other components that may use protobuf?

owent avatar Sep 16 '25 06:09 owent

@owent Yes, that works for me, although I would recommend something like a flag or option, set when these objects are created, telling the otel-cpp library to shut down everything it depends on. That way the user could choose whether libraries such as protobuf or libgnutls stay initialized after otel-cpp is terminated.

P.S.: I tried calling gnutls_global_deinit(); and the rest of the memory leak errors were also eliminated.

farzinlize avatar Sep 16 '25 07:09 farzinlize

OK, my problem is eventually solved, but now I must link gnutls and protobuf separately into my own code just to be able to call the shutdown/deinit functions of those libraries from my program, even though I only wanted to use otel-cpp. The CMake commands are noted below:

# ----- link gnutls and protobuf for opentelemetry cleanup only -----
# enable pkg-config for gnutls
find_package(PkgConfig REQUIRED)
pkg_check_modules(GNUTLS REQUIRED IMPORTED_TARGET gnutls)
# protobuf
include(FindProtobuf)
find_package(Protobuf REQUIRED)

include_directories(${PROTOBUF_INCLUDE_DIR})
link_libraries(${PROTOBUF_LIBRARY} PkgConfig::GNUTLS)

These are the functions I called at the end of my terminate_telemetry_ut to eliminate the memory leak reports:

google::protobuf::ShutdownProtobufLibrary();
gnutls_global_deinit();

New headers I added to my code:

#include <google/protobuf/stubs/common.h>
#include <gnutls/gnutls.h>

farzinlize avatar Sep 16 '25 07:09 farzinlize

Regarding protobuf, the only thing deleted by ShutdownProtobufLibrary() is the ShutdownData, which is allocated only when one of the OnShutdown*() functions is called (see protobuf/message_lite.cc). I wonder who calls any of these functions? I didn't find it in opentelemetry-cpp. Anyway, this is not the shown memory leak. It shows something on the PHP side.

Reneg973 avatar Oct 28 '25 02:10 Reneg973

Regarding protobuf, the only thing deleted by ShutdownProtobufLibrary() is the ShutdownData, which is allocated only when one of the OnShutdown*() functions is called (see protobuf/message_lite.cc). I wonder who calls any of these functions? I didn't find it in opentelemetry-cpp. Anyway, this is not the shown memory leak. It shows something on the PHP side.

It should be called by the app after all components that depend on protobuf are shut down. otel-cpp should not call it, because it may not be the last component in an app that depends on protobuf.

owent avatar Oct 29 '25 11:10 owent