opentelemetry-cpp

SetTracerProvider with a new NoopProvider wrapped in a type-inferred nostd::shared_ptr causes segfaults in Release builds

Open johanpel opened this issue 3 years ago • 2 comments

Environment

GCC 9.4.0
Compiler flags: -O3 -DNDEBUG (CMake Release) or -O2 -g -DNDEBUG (CMake RelWithDebInfo)
Default CMake options, linking only to opentelemetry_trace
OpenTelemetry C++ 1.4.1

Steps to reproduce

#include <opentelemetry/trace/provider.h>

int main(int argc, char** argv) {
  opentelemetry::trace::Provider::SetTracerProvider(
      opentelemetry::nostd::shared_ptr(new opentelemetry::trace::NoopTracerProvider));
  auto t = opentelemetry::trace::Provider::GetTracerProvider()->GetTracer("test");
  t->StartSpan("test");
}

Expected behavior

No segfault.

Actual behavior

Results in a segfault with stacktrace:

std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release shared_ptr_base.h:148
std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release shared_ptr_base.h:148
std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count shared_ptr_base.h:730
std::__shared_ptr<opentelemetry::v1::trace::TracerProvider, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr shared_ptr_base.h:1169
std::shared_ptr<opentelemetry::v1::trace::TracerProvider>::~shared_ptr shared_ptr.h:103
opentelemetry::v1::nostd::shared_ptr<opentelemetry::v1::trace::TracerProvider>::shared_ptr_wrapper::~shared_ptr_wrapper shared_ptr.h:43
opentelemetry::v1::nostd::shared_ptr<opentelemetry::v1::trace::TracerProvider>::operator= shared_ptr.h:115
opentelemetry::v1::trace::Provider::SetTracerProvider provider.h:40
main noop_provider.cpp:4
__libc_start_call_main 0x00007ffff7b4fd90
__libc_start_main_impl 0x00007ffff7b4fe40
_start 0x00005555555567a5

Additional context

I eventually found that explicitly specifying the nostd::shared_ptr template argument fixes the problem:

#include <opentelemetry/trace/provider.h>

int main(int argc, char** argv) {
  opentelemetry::trace::Provider::SetTracerProvider(
      opentelemetry::nostd::shared_ptr<opentelemetry::trace::TracerProvider>(
          new opentelemetry::trace::NoopTracerProvider));
  auto t = opentelemetry::trace::Provider::GetTracerProvider()->GetTracer("test");
  t->StartSpan("test");
}
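The only difference between the two snippets is the template argument of the smart pointer. With C++17 class template argument deduction, the argument is deduced from the pointer passed in, so the type-inferred form builds a pointer to the derived type (NoopTracerProvider) that must then be converted to a pointer to TracerProvider when it is passed on. The sketch below only illustrates that type difference. Handle, Provider, NoopProvider and SetProvider are hypothetical stand-ins, not the nostd::shared_ptr implementation, and the sketch does not reproduce or explain the crash itself.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for TracerProvider / NoopTracerProvider.
struct Provider {
  virtual ~Provider() = default;
  virtual std::string Name() const { return "base"; }
};
struct NoopProvider : Provider {
  std::string Name() const override { return "noop"; }
};

// Hypothetical minimal wrapper (NOT nostd::shared_ptr). It has a T*
// constructor, so Handle(ptr) deduces T from the pointer type.
template <class T>
class Handle {
 public:
  explicit Handle(T* p) : ptr_(p) {}

  // Converting constructor: Handle<Derived> -> Handle<Base>.
  template <class U,
            class = std::enable_if_t<std::is_convertible_v<U*, T*>>>
  Handle(Handle<U>&& other) noexcept : ptr_(std::move(other.ptr_)) {}

  T* operator->() const { return ptr_.get(); }

 private:
  template <class U>
  friend class Handle;
  std::shared_ptr<T> ptr_;
};

// Hypothetical analogue of SetTracerProvider: takes a handle to the base type.
inline std::string SetProvider(Handle<Provider> p) { return p->Name(); }

// Type-inferred form: T is deduced as the derived type.
static_assert(std::is_same_v<decltype(Handle(new NoopProvider)),
                             Handle<NoopProvider>>);
// Explicit form: T is the base type from the start.
static_assert(std::is_same_v<decltype(Handle<Provider>(new NoopProvider)),
                             Handle<Provider>>);

// Builds Handle<NoopProvider>, then converts it to Handle<Provider>.
inline std::string ViaInferred() { return SetProvider(Handle(new NoopProvider)); }

// Builds Handle<Provider> directly; no conversion between handle types.
inline std::string ViaExplicit() {
  return SetProvider(Handle<Provider>(new NoopProvider));
}

// Both paths are well-defined in this stand-in and reach the same object type.
inline void CheckBoth() {
  assert(ViaInferred() == "noop");
  assert(ViaExplicit() == "noop");
}
```

In this stand-in both forms behave identically. The point is only that the type-inferred form goes through an extra converting constructor, which is the code path the explicit template argument avoids.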

In a larger system that had been using the default SDK TracerProvider, resetting the provider like this causes a deadlock in SetTracerProvider rather than a segfault. I haven't been able to produce a minimal example of the deadlock yet, but it also disappears with the above fix.

This bug does not occur when building only with -g (CMake Debug).

johanpel, Jun 30 '22 12:06

This issue was marked as stale due to lack of activity.

github-actions[bot], Sep 17 '22 02:09

Does this problem still exist in the latest version (1.6.0)? In earlier versions, you needed to call Shutdown before destroying a provider.

owent, Sep 17 '22 15:09

@johanpel - Can you please confirm whether the issue is resolved and can be closed?

lalitb, Sep 26 '22 20:09

My apologies for the delay. I have retried this with the same setup, changing only the library version to OpenTelemetry C++ 1.6.1. The problem still occurs. This is the stack trace after compiling with -O2 -g -DNDEBUG:

std::_Sp_counted_ptr<opentelemetry::v1::trace::NoopTracerProvider *, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr provider.h:53
std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<opentelemetry::v1::trace::NoopTracerProvider *> shared_ptr_base.h:625
std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<opentelemetry::v1::trace::NoopTracerProvider *> shared_ptr_base.h:636
std::__shared_ptr<opentelemetry::v1::trace::NoopTracerProvider, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<opentelemetry::v1::trace::NoopTracerProvider, void> shared_ptr_base.h:1125
std::shared_ptr<opentelemetry::v1::trace::NoopTracerProvider>::shared_ptr<opentelemetry::v1::trace::NoopTracerProvider, void> shared_ptr.h:139
opentelemetry::v1::nostd::shared_ptr<opentelemetry::v1::trace::NoopTracerProvider>::shared_ptr shared_ptr.h:78
main noop_provider.cpp:5
__libc_start_call_main 0x00007ffff7b52d90
__libc_start_main_impl 0x00007ffff7b52e40
_start 0x0000555555556775

However, when I switch to GCC 11.2.0, the segfault is gone.

johanpel, Oct 07 '22 10:10