
thread_local TracerProvider with BatchSpanProcessorFactory causes thread to hang

Open Haydoggo opened this issue 9 months ago • 6 comments

Describe your environment: OpenTelemetry-cpp 1.19.0, ISO C++14, Legacy MSVC.

Steps to reproduce: Create a thread_local std::unique_ptr to a TracerProvider that uses a BatchSpanProcessor, and let the thread end without explicitly freeing the provider.

MRP:

#include <iostream>
#include <memory>   // std::unique_ptr
#include <thread>
#include <utility>  // std::move
#include <opentelemetry/exporters/ostream/span_exporter_factory.h>
#include <opentelemetry/sdk/trace/batch_span_processor_factory.h>
#include <opentelemetry/sdk/trace/tracer_provider_factory.h>

namespace trace = opentelemetry::sdk::trace;

// One provider per thread; destroyed implicitly when the owning thread exits.
thread_local std::unique_ptr<trace::TracerProvider> provider;

void workerThread() {
    auto exporter = opentelemetry::exporter::trace::OStreamSpanExporterFactory::Create(std::cout);
    trace::BatchSpanProcessorOptions processorOptions;
    auto processor = trace::BatchSpanProcessorFactory::Create(std::move(exporter), processorOptions);
    provider = trace::TracerProviderFactory::Create(std::move(processor));
    std::cout << "Provider created\n";
    // No provider.reset() here: cleanup is left to thread_local destruction.
}

int main()
{
    auto thread = std::thread(workerThread);
    thread.join();
    std::cout << "Thread joined\n"; // Execution never reaches this point
}

What is the expected behavior? The thread should cleanly shut down the tracer provider and batch span processor on exit.

What is the actual behavior? The thread appears to deadlock during the implicit BatchSpanProcessor::Shutdown call.
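To make the shape of the problem easier to discuss, here is a minimal standard-library sketch of the same ownership pattern: a thread_local unique_ptr to an object whose destructor joins a background thread. FakeBatchProcessor and RunThreadLocalDemo are hypothetical names, not opentelemetry-cpp code, and the sketch assumes (not confirmed in this thread) that the hang is related to a background thread being joined during thread-exit cleanup.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <thread>

// Hypothetical stand-in for a processor that owns a background thread.
// This is NOT opentelemetry-cpp code; it only mimics the ownership shape.
class FakeBatchProcessor {
public:
    explicit FakeBatchProcessor(std::atomic<bool>& destroyed)
        : destroyed_(destroyed), worker_([this] { Run(); }) {}

    ~FakeBatchProcessor() {
        stop_.store(true);
        assert(worker_.joinable());
        worker_.join();          // waits for the background thread to exit
        destroyed_.store(true);  // reached only if join() returned
    }

private:
    void Run() {
        // Idle loop standing in for periodic export work.
        while (!stop_.load()) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool>& destroyed_;
    std::atomic<bool> stop_{false};
    std::thread worker_;  // declared last so the flags exist before Run() starts
};

thread_local std::unique_ptr<FakeBatchProcessor> tls_processor;

// Returns true if the thread_local destructor completed during thread exit.
bool RunThreadLocalDemo() {
    std::atomic<bool> destroyed{false};
    std::thread t([&destroyed] {
        tls_processor = std::make_unique<FakeBatchProcessor>(destroyed);
        // No explicit reset(): destruction is left to thread-exit cleanup.
    });
    t.join();  // on an affected platform, this call would be the one that never returns
    return destroyed.load();
}
```

Calling RunThreadLocalDemo() from main is enough to exercise it. Whether it completes depends on the platform and runtime; if this stand-in also hangs under the same MSVC setup, that would suggest the issue is in how thread-exit cleanup interacts with joining a thread rather than in BatchSpanProcessor specifically.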

Additional context: I'm trying to add some telemetry gathering to an application via a DLL wrapping opentelemetry-cpp. The nature of the application means that it uses multiple threads that functionally act as their own processes, so each thread has its own TracerProvider.

It's possible for these threads to end without notice being given to the DLL, so we don't get a chance to clean up the TracerProvider (i.e. call provider.reset() in the above code).
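For code paths where the DLL does control the scope in which the thread's work runs, the explicit cleanup mentioned above could be wrapped in a small scope guard, so the provider is destroyed inside the thread body instead of during thread-exit cleanup. This is only a sketch: FakeProvider, ProviderResetGuard, and RunGuardedWorker are hypothetical names, and FakeProvider stands in for a real TracerProvider.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a provider; it records when it is destroyed.
struct FakeProvider {
    explicit FakeProvider(std::vector<std::string>& log) : log_(log) {}
    ~FakeProvider() { log_.push_back("provider destroyed"); }
    std::vector<std::string>& log_;
};

thread_local std::unique_ptr<FakeProvider> tls_provider;

// Resets the thread-local provider when the enclosing scope ends, so
// destruction happens in the thread body rather than in thread-exit cleanup.
struct ProviderResetGuard {
    ProviderResetGuard() = default;
    ProviderResetGuard(const ProviderResetGuard&) = delete;
    ProviderResetGuard& operator=(const ProviderResetGuard&) = delete;
    ~ProviderResetGuard() { tls_provider.reset(); }
};

// Runs a worker thread and returns the order in which events happened.
std::vector<std::string> RunGuardedWorker() {
    std::vector<std::string> log;
    std::thread t([&log] {
        {
            ProviderResetGuard guard;
            tls_provider = std::make_unique<FakeProvider>(log);
            log.push_back("work done");
        }  // guard destructor resets tls_provider here
        assert(tls_provider == nullptr);
        log.push_back("thread body finished");
    });
    t.join();
    return log;
}
```

The guard only helps where a well-defined scope exists on the thread; it does not cover threads that end without the DLL being called again, which is the case described above.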

Haydoggo, Mar 24 '25 01:03

Hi @marcalff, I see you've marked this issue as "needs more information". I'd be happy to supply more info, just let me know what you need :)

Haydoggo, Mar 25 '25 22:03

Why is thread_local needed here? What is it trying to solve?

malkia, Mar 28 '25 19:03

I don't have access to the logic that starts or stops threads in the application I'm building an extension dll for, so I use thread_local to prevent leaking TracerProviders and their exporters and processors etc.

In my context, a thread represents an independent application, so I'm using a TracerProvider per thread to capture application-level metadata such as "application name", "user", etc., without having to append this immutable data to the metadata of every span.

In any case, I thought it might be useful to report this unstable behaviour.

Haydoggo, May 11 '25 23:05

I'm just wondering what OpenTelemetry's own background threads see in this case?

Let's say you need a provider per thread of your own (let's call these threads "worker1", "worker2", ...). How do you communicate to OpenTelemetry's own threads that this is the provider they have to use when dealing with traces coming from "worker1"? What should they see?

If OpenTelemetry always ran as part of your thread, then this would work, but that's not the case in general (there might be configurations where it is, but I'm not sure). Something has to do the work in the background.

malkia, May 12 '25 01:05

As each of my "applications" runs on a separate thread, they each get their own independent thread_local provider, so any spans started on a given thread are started by a tracer belonging to that thread's local provider. As such, I don't have any central controller of providers that needs to decide which provider is used for any trace.

My understanding is that each provider owns its own means for processing and exporting traces, so I'm not sure what you mean by OpenTelemetry having to run in the background?

Haydoggo, May 28 '25 03:05

To investigate, see if related to #3448. See also metrics processor.

marcalff, Jun 02 '25 20:06