opentelemetry-cpp
Crash in HTTP exporter for TCP half-open connections
Steps to reproduce
Suddenly bring down the OTel collector in a high-load setup.
What is the expected behavior?
Clients handle export failures.
What is the actual behavior?
The export hangs for 30+ seconds. We can see one connection stuck half-open (ESTABLISHED) and one stuck in connect (SYN_SENT). Note that the server has been gone for some time at this point.
netstat -nap | grep 4318
tcp 0 0 192.168.67.174:51510 192.168.10.101:4318 ESTABLISHED 23/smfcc
tcp 0 1 192.168.67.174:43912 192.168.10.101:4318 SYN_SENT -
Additional context
We suspect this crash is due to OTel not providing a way to enable keepalives on client connections (CURLOPT_TCP_KEEPALIVE). We have set the timeout (CURLOPT_TIMEOUT) to 5s, and ideally keepalive would be enabled with a probe time of about 1s to match, but OTel has no option to set that.
Request that a config for TCP_KEEPALIVE be provided.
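For illustration, here is a rough sketch of what the requested keepalive setting amounts to at the socket level on Linux. It is not opentelemetry-cpp or libcurl code: `enable_keepalive` and its parameters are hypothetical names. In libcurl, `CURLOPT_TCP_KEEPALIVE` is an on/off switch, and the timing is controlled separately through `CURLOPT_TCP_KEEPIDLE` and `CURLOPT_TCP_KEEPINTVL`.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper (not part of opentelemetry-cpp or libcurl).
// Enables TCP keepalive on an existing socket using Linux socket options.
//   idle_secs:     idle time before the first probe is sent
//   interval_secs: delay between successive probes
//   probe_count:   unanswered probes before the kernel drops the connection
// Returns true only if the kernel accepted every option.
bool enable_keepalive(int fd, int idle_secs, int interval_secs, int probe_count)
{
  int on = 1;
  if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0)
    return false;
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs, sizeof(idle_secs)) != 0)
    return false;
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_secs, sizeof(interval_secs)) != 0)
    return false;
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probe_count, sizeof(probe_count)) != 0)
    return false;
  return true;
}
```

Calling `enable_keepalive(fd, 1, 1, 3)` would ask the kernel to start probing after 1s of idle time and to give up after three unanswered probes sent 1s apart. That should flag a dead peer on an idle connection in roughly 4s, inside the 5s timeout mentioned above. Keepalive applies only to established connections. The socket stuck in SYN_SENT is more likely governed by a connect timeout such as `CURLOPT_CONNECTTIMEOUT`.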