opentelemetry-python

tracer_provider.shutdown() does not provide a configurable timeout

Open Salazar-99 opened this issue 6 months ago • 1 comments

Is your feature request related to a problem?

I have tests that validate the instantiation and configuration of my app's global tracer provider. See the minimal example below:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
exporter = OTLPSpanExporter(**kwargs)
processor = BatchSpanProcessor(exporter)
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

When I run the tests that verify this code with uv run pytest, the process never exits and I start getting errors about the endpoint being unavailable. I added a fixture that shuts down the tracer provider after the tests:

import pytest
from opentelemetry import trace


@pytest.fixture(scope="session", autouse=True)
def cleanup_otel():
    """Ensure the OpenTelemetry tracer provider is shut down after the tests."""
    yield
    # Shut down the tracer provider to stop its background threads.
    tracer_provider = trace.get_tracer_provider()
    # The default proxy provider has no shutdown(), so check before calling.
    if hasattr(tracer_provider, "shutdown"):
        tracer_provider.shutdown()

With this fixture the run does exit, but it takes 60 seconds even though the tests themselves pass in about 1 second.

Describe the solution you'd like

I'd like to be able to set a configurable timeout for the shutdown as follows:

tracer_provider.shutdown(timeout=1)  # proposed: force the tracer provider to shut down after 1 second

This would make unit testing my OTel configurations much faster.

Describe alternatives you've considered

No response

Additional Context

This seems somewhat related to another issue: https://github.com/open-telemetry/opentelemetry-python/issues/3309.

It also seems that the log and metrics exporters support a configurable timeout: https://opentelemetry-python.readthedocs.io/en/latest/_modules/opentelemetry/exporter/otlp/proto/grpc/metric_exporter.html#OTLPMetricExporter.shutdown

Would you like to implement a fix?

I am willing to help with some direction

Salazar-99 avatar Jun 07 '25 04:06 Salazar-99

I think https://github.com/open-telemetry/opentelemetry-python/pull/4564 will mostly fix this. It changes the timeout to encompass retries, so unless you set the exporter timeout to a huge value, export will quickly time out instead of hanging in the retry loop.
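To illustrate what "timeout encompasses retries" means, here is a minimal standard-library sketch. It is not the code from the PR: `export_with_deadline`, `send`, and the backoff values are hypothetical names chosen for the example. The point is that one deadline bounds the whole retry loop, rather than each attempt getting its own timeout.

```python
import time


def export_with_deadline(send, timeout_s, base_backoff_s=0.01):
    """Call send() until it succeeds or the overall deadline would be exceeded.

    send is a hypothetical callable that returns True on success.
    """
    deadline = time.monotonic() + timeout_s
    attempt = 0
    while True:
        if send():
            return True
        # Exponential backoff between attempts.
        backoff = base_backoff_s * (2 ** attempt)
        attempt += 1
        # Give up instead of sleeping past the single, shared deadline.
        if time.monotonic() + backoff >= deadline:
            return False
        time.sleep(backoff)
```

With an unreachable endpoint, a loop like this returns after roughly `timeout_s` seconds regardless of how many retries that allows.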

If not, I also have https://github.com/open-telemetry/opentelemetry-python/pull/4638 out for review. That PR forces the BatchSpan/LogRecord processors to shut down after 30 seconds, and also updates the exporters to break out of the sleep in retry backoffs.
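Breaking out of a backoff sleep can be sketched with `threading.Event`, assuming the exporter keeps a shutdown flag. This is an illustration only, not the PR's implementation; `RetryingExporter` and its parameters are hypothetical.

```python
import threading


class RetryingExporter:
    """Hypothetical exporter with an interruptible retry backoff."""

    def __init__(self, send, max_attempts=5, base_backoff_s=1.0):
        self._send = send  # callable returning True on success
        self._max_attempts = max_attempts
        self._base_backoff_s = base_backoff_s
        self._shutdown = threading.Event()

    def export(self):
        for attempt in range(self._max_attempts):
            if self._shutdown.is_set():
                return False
            if self._send():
                return True
            # Event.wait() returns True as soon as shutdown() sets the event,
            # whereas time.sleep() would always block for the full backoff.
            if self._shutdown.wait(self._base_backoff_s * (2 ** attempt)):
                return False
        return False

    def shutdown(self):
        # Wakes any export() call currently waiting in a backoff.
        self._shutdown.set()
```

Calling `shutdown()` from another thread makes a pending `export()` return almost immediately, even in the middle of a long backoff.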

I also think making the shutdown timeout configurable is worth doing, but it requires updating a number of interfaces, which could be tricky.

DylanRussell avatar Jun 16 '25 17:06 DylanRussell

To make the timeout configurable we'd have to update SpanProcessor.shutdown and TracerProvider.shutdown (and also the log equivalents if we want this for logs). I think we've decided that updating the interfaces is a breaking change, so we can only do it for logs.

We could, however, update the implementations we own without breaking anyone.
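A rough sketch of why that is non-breaking, using stand-in classes rather than the real SDK types (`Processor`, `OwnedProcessor`, `ThirdPartyProcessor`, and `timeout_s` are all hypothetical): the abstract interface keeps its signature, and only the implementation we own gains an optional keyword argument.

```python
import abc


class Processor(abc.ABC):
    """Stand-in for the public interface; its signature is left unchanged."""

    @abc.abstractmethod
    def shutdown(self) -> None:
        ...


class OwnedProcessor(Processor):
    """Implementation we own: gains an optional timeout with a default."""

    def __init__(self):
        self.used_timeout_s = None

    def shutdown(self, timeout_s: float = 30.0) -> None:
        # Existing callers that pass no argument get the default.
        self.used_timeout_s = timeout_s


class ThirdPartyProcessor(Processor):
    """Someone else's implementation: keeps working without any change."""

    def __init__(self):
        self.closed = False

    def shutdown(self) -> None:
        self.closed = True
```

Code that calls `shutdown()` with no arguments works against both classes, and only callers that know they hold an `OwnedProcessor` can pass `timeout_s`.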

I think it should be the SpanProcessor's job to shut down within a timeout, not the exporter's, so I don't think we need to plumb the timeout all the way through to the exporter.
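One way to picture the processor owning the deadline is the sketch below. It uses only the standard library, and `SlowExporter` and `BoundedProcessor` are hypothetical stand-ins, not SDK classes; the 30-second default is borrowed from the PR description above. The exporter has no timeout parameter at all, and the processor simply stops waiting once the deadline passes.

```python
import threading
import time


class SlowExporter:
    """Stand-in exporter whose shutdown blocks, e.g. on a final flush."""

    def __init__(self, flush_seconds):
        self.flush_seconds = flush_seconds

    def shutdown(self):
        time.sleep(self.flush_seconds)


class BoundedProcessor:
    """Hypothetical processor that enforces the shutdown deadline itself."""

    def __init__(self, exporter):
        self._exporter = exporter

    def shutdown(self, timeout_s=30.0):
        # Run the exporter's shutdown on a daemon thread so a slow exporter
        # cannot block the caller or keep the interpreter alive.
        worker = threading.Thread(target=self._exporter.shutdown, daemon=True)
        worker.start()
        worker.join(timeout_s)
        # True if the exporter finished in time, False if we stopped waiting.
        return not worker.is_alive()
```

The trade-off is that anything the exporter had not yet flushed when the deadline passes may be dropped, which is usually acceptable in tests but is a decision to make deliberately in production.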

DylanRussell avatar Jun 27 '25 14:06 DylanRussell