BatchSpanProcessor does not trigger export timeout
Describe the bug
Despite introducing a delay longer than the timeout in both the exporter (via a wrapper class) and my custom-built collector, the export timeout has no effect on the export call.
Steps to reproduce
This is the setup I used to test the timeout on the exporter.
@Bean
public OpenTelemetrySdk autoconfiguredSdk() {
    return AutoConfiguredOpenTelemetrySdk.builder()
        .addPropertiesSupplier(() -> properties)
        .addSpanExporterCustomizer((spanExporter, configProperties) -> new CustomSpanExporter(spanExporter))
        .setResultAsGlobal()
        .build()
        .getOpenTelemetrySdk();
}
@AllArgsConstructor
public class CustomSpanExporter implements SpanExporter {
    private final SpanExporter delegate;

    @Override
    public CompletableResultCode export(Collection<SpanData> spans) {
        try {
            // Block for longer than otel.bsp.export.timeout (30 seconds by default).
            Thread.sleep(35000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Sleep interrupted", e);
        }
        return delegate.export(spans);
    }

    @Override
    public CompletableResultCode flush() {
        return delegate.flush();
    }

    @Override
    public CompletableResultCode shutdown() {
        return delegate.shutdown();
    }

    @Override
    public void close() {
        delegate.close();
    }
}
For the collector, I created a collector server (in my case, a gRPC one) that introduces a delay immediately after receiving the request.
By default, otel.exporter.otlp.timeout is 10 seconds, which is lower than the default for otel.bsp.export.timeout, so the HTTP request times out before the export timeout is reached. I have therefore also tested with a lower export timeout, and in that case the request still completes normally.
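For reference, the two timeouts could be supplied through the same kind of properties map that is passed to addPropertiesSupplier in the setup above. This is a minimal sketch with illustrative values (a 5 second export timeout, lower than the 10 second OTLP timeout); the class name TimeoutProperties and the values are assumptions, not the configuration actually used in the test.

```java
import java.util.Map;

class TimeoutProperties {
    // Hypothetical values for illustration; a duration without a unit is read as milliseconds.
    public static final Map<String, String> PROPERTIES = Map.of(
        "otel.bsp.export.timeout", "5000",      // BatchSpanProcessor export timeout (default 30000)
        "otel.exporter.otlp.timeout", "10000"); // OTLP exporter request timeout (default 10000)

    public static void main(String[] args) {
        PROPERTIES.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

A map like this could be returned from the supplier, for example .addPropertiesSupplier(() -> TimeoutProperties.PROPERTIES).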
What did you expect to see?
I expected to have my export interrupted after the value set for otel.bsp.export.timeout (by default, 30 seconds).
What did you see instead?
Instead, the export progressed as normal.
What version and what artifacts are you using?
io.opentelemetry.instrumentation:opentelemetry-spring-boot-starter:jar:2.13.1
io.opentelemetry:opentelemetry-sdk:jar:1.47.0
My pom.xml:
<dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-sdk-extension-autoconfigure</artifactId>
</dependency>
Environment
Oracle Linux Server 8.9
openjdk version "17.0.14" 2025-01-21 LTS
OpenJDK Runtime Environment (Red_Hat-17.0.14.0.7-3.0.1) (build 17.0.14+7-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-17.0.14.0.7-3.0.1) (build 17.0.14+7-LTS, mixed mode, sharing)
Additional context
If this is the expected behaviour, any help on how to test this property would be appreciated.
I looked into this issue and the reason the timeout never triggers is that BatchSpanProcessor calls spanExporter.export() synchronously on the worker thread. Since the exporter blocks (e.g., Thread.sleep(35000)), the worker thread never reaches the result.join(exportTimeout) call, so the timeout logic cannot run.
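The ordering problem can be reproduced without the SDK. The sketch below is a simplified stand-in, not the actual BatchSpanProcessor code: it uses CompletableFuture in place of CompletableResultCode, and the names ExportTimeoutDemo, timedOut, blockingExport and pendingExport are hypothetical. It shows that a timeout applied to the returned result only fires when export() returns promptly with a result that is still pending.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

class ExportTimeoutDemo {
    /** Mimics the worker: call export() first, then wait on the returned result with a timeout. */
    public static boolean timedOut(Supplier<CompletableFuture<Void>> export, long timeoutMillis) {
        CompletableFuture<Void> result = export.get(); // synchronous call, no timeout applies here
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS); // the timeout only covers this wait
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Blocks inside export() itself, like the Thread.sleep in CustomSpanExporter. */
    public static CompletableFuture<Void> blockingExport(long sleepMillis) {
        try {
            Thread.sleep(sleepMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return CompletableFuture.completedFuture(null);
    }

    /** Returns immediately with a result that never completes. */
    public static CompletableFuture<Void> pendingExport() {
        return new CompletableFuture<>();
    }

    public static void main(String[] args) {
        System.out.println("blocking export timed out: " + timedOut(() -> blockingExport(300), 100));
        System.out.println("pending export timed out: " + timedOut(ExportTimeoutDemo::pendingExport, 100));
    }
}
```

In this model the blocking export never times out even though it sleeps three times longer than the timeout, while the pending export does. By analogy, a test exporter that returns a not-yet-completed result immediately may be a way to exercise otel.bsp.export.timeout.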
To enforce the timeout, the exporter call needs to run in a separate thread (via ExecutorService), so the worker can do future.get(timeout) and cancel/interrupt it when the timeout expires.
If this approach is acceptable, I can prepare a patch.
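A rough sketch of that idea, using only JDK classes: the export runs on a separate executor so the caller can bound the wait and interrupt the task. BoundedExport and exportWithTimeout are hypothetical names, and a Callable<Boolean> stands in for the exporter call; this is not a patch against the SDK.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BoundedExport {
    /** Runs the export on the executor and gives up after timeoutMillis; returns true on success. */
    public static boolean exportWithTimeout(
            ExecutorService executor, Callable<Boolean> export, long timeoutMillis) {
        Future<Boolean> future = executor.submit(export);
        try {
            return Boolean.TRUE.equals(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the export thread if it is still running
            return false;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false; // the export itself threw
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            boolean slow = exportWithTimeout(executor, () -> { Thread.sleep(300); return true; }, 100);
            boolean fast = exportWithTimeout(executor, () -> true, 100);
            System.out.println("slow export succeeded: " + slow);
            System.out.println("fast export succeeded: " + fast);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Note that cancel(true) only helps if the export responds to interruption, as Thread.sleep does; an export stuck in non-interruptible work would keep occupying the executor thread after the caller has moved on.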
I'd be willing to look at a patch for this, but I don't think it's a terribly important bug, TBH. Blocking the export thread seems like a bad idea, and isn't a particularly realistic test case. Is this actually impacting real production use-cases?
Got it, thanks for the clarification. And no — this isn’t impacting a real production setup. I was testing the timeout behavior and noticed it never triggers.