opentelemetry-cpp icon indicating copy to clipboard operation
opentelemetry-cpp copied to clipboard

[SDK] BatchLogRecordProcessor::ForceFlush hangs for 10 seconds

Open ggaudeau opened this issue 11 months ago • 6 comments

Platform : Ubuntu 23.10 Build: opentelemetry 1.11.0 with WITH_ASYNC_EXPORT_PREVIEW enabled Deps: curl 7.85.0

Here are the lines of code used to initialize the logger provider:

opentelemetry::exporter::otlp::OtlpHttpLogRecordExporterOptions opts;

auto exporter =
    opentelemetry::exporter::otlp::OtlpHttpLogRecordExporterFactory::Create(
        opts);

opentelemetry::sdk::logs::BatchLogRecordProcessorOptions batchOptions;

auto processor =
    opentelemetry::sdk::logs::BatchLogRecordProcessorFactory::Create(
        std::move(exporter),
        batchOptions);

auto resource = opentelemetry::sdk::resource::Resource::Create(attributes);

std::shared_ptr<opentelemetry::logs::LoggerProvider> provider =
    opentelemetry::sdk::logs::LoggerProviderFactory::Create(
        std::move(processor),
        resource);

opentelemetry::logs::Provider::SetLoggerProvider(provider);

Before quitting my application I do a ForceFlush with a timeout equal to 1 second.

auto loggerProvider =
            reinterpret_cast<opentelemetry::sdk::logs::LoggerProvider*>(
                provider);

loggerProvider->ForceFlush(std::chrono::seconds(1));

std::shared_ptr<opentelemetry::logs::LoggerProvider> none;
opentelemetry::logs::Provider::SetLoggerProvider(none);

When the network connection is not good, it can happen that my application hangs for 10 seconds, which corresponds to the timeout of the curl request which I don't want to change.

I'd like to know why the timeout is not taken into account.

ggaudeau avatar Mar 08 '24 14:03 ggaudeau

ForceFlush will wait for 1 second, but opentelemetry::logs::Provider::SetLoggerProvider(none) will make internal exporters shutdown can wait for all pending requests to finish. Could you please try to call loggerProvider->Shutdown(std::chrono::seconds(1)); instead of loggerProvider->ForceFlush(std::chrono::seconds(1));. This function will try to cancel all pending requests when got progress callback called by libcurl.

owent avatar Mar 10 '24 06:03 owent

Being in asynchronous mode, the Shutdown will cancel the sending of the last logs, which I'd like to avoid.

image

To be more accurate, when I set a one-second timeout and there are network problems, the app stays five seconds at line 119 and almost five seconds again in block 129-138

Basically, I switched to asynchronous mode because the processor was stuck in the Export method when I asked it to ForceFlush or Shutdown

ggaudeau avatar Mar 11 '24 10:03 ggaudeau

I am using Charles application to simulate throttling.

ggaudeau avatar Mar 11 '24 10:03 ggaudeau

SIG meeting 11.03.2024: need a decision on which timeout should be used for shutdown, and if we need to flush the last buffer. would be great to check the implementation in the other repos and have a consistent solution as other repos.

esigo avatar Mar 11 '24 20:03 esigo

This issue was marked as stale due to lack of activity.

github-actions[bot] avatar May 11 '24 01:05 github-actions[bot]

#2584 may also solve this problem.

owent avatar May 15 '24 05:05 owent