Crash in http exporter, for tcp half open connections

Open • VivekSubr opened this issue 6 months ago • 1 comment

Steps to reproduce: Abruptly bring down the otel collector in a high-load setup.

What is the expected behavior? The client handles the export failure.

What is the actual behavior? The export hangs for 30+ seconds. We can see one connection stuck half-open (still ESTABLISHED on the client side) and one stuck in connect (SYN_SENT). Note that the server has been gone for some time at this point.

netstat -nap | grep 4318
tcp        0      0 192.168.67.174:51510    192.168.10.101:4318     ESTABLISHED 23/smfcc
tcp        0      1 192.168.67.174:43912    192.168.10.101:4318     SYN_SENT    -

Additional context: We suspect this crash is due to otel not providing a way to set keepalives on client connections (CURLOPT_TCP_KEEPALIVE). We've set the timeout (CURLOPT_TIMEOUT) to 5s, and ideally CURLOPT_TCP_KEEPALIVE should be set to 1s to go with it, but otel doesn't have an option to set that.

Request that a config for TCP_KEEPALIVE be provided.
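
For illustration, here is a minimal sketch of what such a setting would amount to at the socket level, assuming Linux. In libcurl, CURLOPT_TCP_KEEPALIVE is an on/off switch, and the timing comes from CURLOPT_TCP_KEEPIDLE and CURLOPT_TCP_KEEPINTVL, which map to the socket options below. The helper names `enable_keepalive` and `keepalive_enabled` are hypothetical and are not part of opentelemetry-cpp or libcurl.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper, for illustration only (Linux-specific options).
// Turns on TCP keepalive for `fd`:
//   idle_s     - seconds of idleness before the first probe (TCP_KEEPIDLE)
//   interval_s - seconds between probes (TCP_KEEPINTVL)
//   probes     - unanswered probes before the connection is dropped (TCP_KEEPCNT)
// Returns true only if every option was applied.
bool enable_keepalive(int fd, int idle_s, int interval_s, int probes)
{
  int on = 1;
  if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0)
    return false;
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) != 0)
    return false;
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof(interval_s)) != 0)
    return false;
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) != 0)
    return false;
  return true;
}

// Hypothetical helper: reads back whether SO_KEEPALIVE is set on `fd`.
bool keepalive_enabled(int fd)
{
  int value = 0;
  socklen_t len = sizeof(value);
  if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &value, &len) != 0)
    return false;
  return value != 0;
}
```

With `enable_keepalive(fd, 1, 1, 3)`, a dead peer on an idle established connection would be detected after roughly 1 + 3 × 1 seconds. Keepalive probes only apply to established connections, so the socket stuck in SYN_SENT would likely need a connect timeout (CURLOPT_CONNECTTIMEOUT in libcurl) rather than keepalive.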

VivekSubr • Apr 01 '25 04:04