tracer_provider.shutdown() does not provide a configurable timeout
Is your feature request related to a problem?
I have tests that validate the instantiation and configuration of my app's global tracer provider. See the minimal example below:
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
provider = TracerProvider()
exporter = OTLPSpanExporter(**kwargs)
processor = BatchSpanProcessor(exporter)
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)
When I run the tests that verify this code with uv run pytest, they never exit, and I start to get errors about the endpoint being unavailable. I added a fixture that shuts down the tracer provider after the tests:
import pytest


@pytest.fixture(scope="session", autouse=True)
def cleanup_otel():
    """Ensure OpenTelemetry tracer provider is properly shut down after tests."""
    yield
    # Shut down the tracer provider to stop background threads
    tracer_provider = trace.get_tracer_provider()
    if hasattr(tracer_provider, 'shutdown'):
        tracer_provider.shutdown()
This takes 60 seconds to exit even though the tests pass in about 1 second.
Describe the solution you'd like
I'd like to be able to set a configurable timeout for the shutdown as follows:
tracer_provider.shutdown(timeout=1)  # forces the tracer provider to shut down after 1 second
This would make unit testing my OTel configurations much faster.
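Until something like that exists, a stopgap I could imagine is wrapping the call myself. This is only a minimal sketch: shutdown_with_timeout is a hypothetical helper, not an OpenTelemetry API, and it only assumes the provider exposes a shutdown() method. It bounds how long the caller waits; it does not cancel the shutdown that is still running in the background.

```python
import threading


def shutdown_with_timeout(provider, timeout: float) -> bool:
    """Run provider.shutdown() in a daemon thread and wait at most `timeout` seconds.

    Hypothetical helper, not part of OpenTelemetry.
    Returns True if shutdown finished in time, False if it is still running.
    """
    shutdown = getattr(provider, "shutdown", None)
    if shutdown is None:
        return True  # nothing to shut down (e.g. a no-op provider)
    worker = threading.Thread(target=shutdown, daemon=True)
    worker.start()
    worker.join(timeout)  # give up waiting after `timeout` seconds
    return not worker.is_alive()
```

In the fixture this would replace the direct tracer_provider.shutdown() call. I have not verified that it lets the process exit sooner, since the SDK may still run its own shutdown hook at interpreter exit.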
Describe alternatives you've considered
No response
Additional Context
This seems to be somewhat related to this other issue https://github.com/open-telemetry/opentelemetry-python/issues/3309.
It also seems that the Log and Metrics exporters support a configurable timeout: https://opentelemetry-python.readthedocs.io/en/latest/_modules/opentelemetry/exporter/otlp/proto/grpc/metric_exporter.html#OTLPMetricExporter.shutdown
Would you like to implement a fix?
I am willing to help with some direction
I think https://github.com/open-telemetry/opentelemetry-python/pull/4564 will mostly fix this. It changes the timeout to encompass retries, so unless you are setting the exporter timeout to a huge value, export will quickly time out instead of hanging in the retry loop.
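To illustrate what "the timeout encompasses retries" means, here is a rough sketch. It is not the code from that PR: export_with_deadline and the send callable are hypothetical names, and the backoff values are arbitrary. The point is that one deadline covers every attempt plus the sleeps between them, rather than each attempt getting its own timeout.

```python
import time


def export_with_deadline(send, timeout: float, initial_backoff: float = 1.0) -> bool:
    """Retry `send()` until it succeeds or `timeout` seconds have elapsed in total.

    `send` is a hypothetical callable that returns True on success.
    """
    deadline = time.monotonic() + timeout
    backoff = initial_backoff
    while True:
        if send():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0 or backoff > remaining:
            return False  # the next retry would not fit in the overall budget
        time.sleep(backoff)
        backoff *= 2  # exponential backoff between attempts
```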
If not, I also have https://github.com/open-telemetry/opentelemetry-python/pull/4638 out for review. This PR forces the BatchSpan/LogRecord processors to shut down after 30 seconds, and also updates exporters to break out of the sleep in retry backoffs.
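For the "break out of the sleep" part, the general technique is to wait on a threading.Event instead of calling time.sleep. The sketch below shows the idea only; RetryingWorker is a hypothetical stand-in and not the exporter code from the PR.

```python
import threading


class RetryingWorker:
    """Hypothetical stand-in for an exporter that retries with backoff."""

    def __init__(self, send, backoff: float = 5.0, max_attempts: int = 6):
        self._send = send  # callable returning True on success
        self._backoff = backoff
        self._max_attempts = max_attempts
        self._shutdown = threading.Event()

    def export(self) -> bool:
        for _ in range(self._max_attempts):
            if self._shutdown.is_set():
                return False
            if self._send():
                return True
            # Event.wait returns True as soon as shutdown() sets the event,
            # so the backoff ends early instead of sleeping the full interval.
            if self._shutdown.wait(self._backoff):
                return False
        return False

    def shutdown(self) -> None:
        self._shutdown.set()
```

With a plain time.sleep, a shutdown request would have to wait out the current backoff interval before the retry loop noticed it.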
I also think making the shutdown timeout configurable is worth doing, but it requires updating a bunch of interfaces, which could be tricky.
To make the timeout configurable, we'd have to update SpanProcessor.shutdown and TracerProvider.shutdown (and also the Log equivalents if we want this for logs). I think we've decided that updating the interfaces is a breaking change, so we can only do it for logs.
However, we could update the implementations we own without breaking people.
I think it should be the SpanProcessor's job to shut down within a timeout, and not the exporter's job, so I don't think we need to plumb the timeout all the way through to the exporter.
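Roughly what I have in mind, as a sketch with hypothetical stand-in classes (Processor and BatchProcessor are not the real SDK classes, and the 30 second default is only illustrative): the abstract interface keeps its signature, the concrete implementation gains an optional keyword, and the processor enforces the bound itself without passing it to the exporter.

```python
import threading
from abc import ABC, abstractmethod


class Processor(ABC):
    """Hypothetical stand-in for the public interface; its signature is unchanged."""

    @abstractmethod
    def shutdown(self) -> None: ...


class BatchProcessor(Processor):
    """Hypothetical concrete processor that bounds its own shutdown."""

    def __init__(self, exporter):
        self._exporter = exporter
        self._done = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self) -> None:
        self._done.wait()        # idle until shutdown is requested
        self._exporter.export()  # final flush; may block on a dead endpoint

    def shutdown(self, timeout: float = 30.0) -> None:
        # The extra keyword has a default, so existing shutdown() calls still work.
        self._done.set()
        self._worker.join(timeout)  # stop waiting after `timeout` seconds
        self._exporter.shutdown()   # the exporter never sees the timeout
```

Callers that only know the interface keep calling shutdown(), while callers that hold the concrete type can pass timeout=1 in tests.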