opentelemetry-cpp-contrib

Nginx instrumented with CPP otel contrib rejects requests without User-Agent field.

Open skowront opened this issue 1 year ago • 4 comments

Situation A NGINX 1.26.x or 1.25.x and NO OTEL cpp-contrib added. Requests made with curl, python httpclient are accepted. Requests made with dotnet 8.0 httpclient are accepted.

Situation B NGINX 1.26.x or 1.25.x and OTEL cpp-contrib added. Requests made with curl, python httpclient are accepted. Requests made with dotnet 8.0 httpclient are REJECTED.

The following log is produced by .net:

An error occurred while sending the request.

The response ended prematurely. (ResponseEnded)

at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)

at System.Net.Http.HttpConnection.<SendAsync>d__57.MoveNext() in //src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/HttpConnection.cs:line 862
at System.Net.Http.HttpConnectionPool.<SendWithVersionDetectionAndRetryAsync>d__89.MoveNext() in //src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/HttpConnectionPool.cs:line 1116
at System.Threading.Tasks.ValueTask`1.get_Result() in //src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/ValueTask.cs:line 812
at System.Net.Http.RedirectHandler.<SendAsync>d__4.MoveNext() in //src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/RedirectHandler.cs:line 30
at System.Net.Http.HttpClient.<GetStringAsyncCore>d__41.MoveNext() in /_/src/libraries/System.Net.Http/src/System/Net/Http/HttpClient.cs:line 188
at CSOTel.Traffic.CLI.Program.<Main>d__1.MoveNext() in C:\Users\tomek\source\repos\CSOTel\CSOTel.Traffic.CLI\Program.cs:line 35

The following is produced by nginx with otel cpp contrib:

2024/08/26 19:25:39 [error] 49#49: *10 mod_opentelemetry: startMonitoringRequest: Starting Request Monitoring for: / HTTP/1.1 Host, client: 10.0.2.2, server: www.cso.lab, request: "GET / HTTP/1.1", host: "cso.lab"
2024/08/26 19:25:39 [error] 49#49: *10 mod_opentelemetry: startMonitoringRequest: WebServer Context: NginxWebServerNetworkCSOTel.NginxWebServerNginxId, client: 10.0.2.2, server: www.cso.lab, request: "GET / HTTP/1.1", host: "cso.lab"
2024/08/26 19:25:39 [alert] 1#1: worker process 49 exited on signal 11 (core dumped)

Meanwhile, curl works perfectly fine and nginx serves the request. The difference is that curl automatically adds a User-Agent header, but dotnet httpclient doesn't - and why should it?

curl -vk https://cso.lab/

  • Trying 192.168.56.1:443...
  • Connected to cso.lab (192.168.56.1) port 443
  • ALPN: curl offers h2,http/1.1
  • TLSv1.3 (OUT), TLS handshake, Client hello (1):
  • TLSv1.3 (IN), TLS handshake, Server hello (2):
  • TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
  • TLSv1.3 (IN), TLS handshake, Certificate (11):
  • TLSv1.3 (IN), TLS handshake, CERT verify (15):
  • TLSv1.3 (IN), TLS handshake, Finished (20):
  • TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
  • TLSv1.3 (OUT), TLS handshake, Finished (20):
  • SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
  • ALPN: server accepted http/1.1
  • Server certificate:
  • subject: CN=cso.lab
  • start date: Aug 17 11:53:50 2024 GMT
  • expire date: Aug 15 11:53:50 2034 GMT
  • issuer: CN=cso.lab
  • SSL certificate verify result: self-signed certificate (18), continuing anyway.
  • using HTTP/1.1

> GET / HTTP/1.1
> Host: cso.lab
> User-Agent: curl/8.4.0
> Accept: */*

  • TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
  • TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
  • old SSL session ID is stale, removing

< HTTP/1.1 200 OK
< Server: nginx/1.26.0
< Date: Mon, 26 Aug 2024 19:26:36 GMT
< Content-Type: text/html
< Content-Length: 2408
< Connection: keep-alive
< Last-Modified: Sun, 25 Aug 2024 15:13:43 GMT
< ETag: "66cb4a27-968"
< Accept-Ranges: bytes
<

FIX/SOLUTION/WORKAROUND The workaround is to add a User-Agent header to the dotnet httpclient. Any value works, but the key must be present. Otherwise nginx will reject the request.

NOTE This happens ONLY when nginx is instrumented with this cpp-contrib library! So it's clearly an issue with this solution - probably a null pointer is dereferenced somewhere underneath (the worker exits on signal 11, i.e. a segmentation fault), and not even a TRACE is sent to the OTEL collector, because the worker process is killed.
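For anyone who wants to reproduce this without a dotnet client, here is a minimal Python sketch that sends the same kind of request with and without a User-Agent header. The function names `build_request` and `send_request` are made up for illustration, the host `cso.lab` is the one from my setup above, and certificate verification is turned off only because that server uses a self-signed certificate (same as `curl -k`). Treat it as a rough repro under those assumptions, not as a statement about what the module does internally.

```python
import socket
import ssl


def build_request(host, path="/", user_agent=None):
    """Build a raw HTTP/1.1 GET request; omit User-Agent when user_agent is None."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    if user_agent is not None:
        lines.append(f"User-Agent: {user_agent}")
    # The two trailing empty strings produce the blank line that ends the headers.
    lines += ["Accept: */*", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def send_request(host, port=443, user_agent=None, timeout=5.0):
    """Send the request over TLS and return the raw response bytes.

    Assumption: the server uses a self-signed certificate, so verification
    is disabled here. Do not do this against servers you do not control.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(host, user_agent=user_agent))
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)


# Example usage (needs a reachable server, so it is left commented out):
# print(send_request("cso.lab", user_agent="repro/1.0")[:200])  # with header
# print(send_request("cso.lab")[:200])                          # without header
```

If the behavior matches what I described, the call with a User-Agent should return a normal response, while the call without it should come back empty or raise a connection error, with the signal 11 line showing up in the nginx error log.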

skowront avatar Aug 26 '24 19:08 skowront

We have experienced the same. Any request without the User-Agent header set is rejected. Even worse, it kills the worker process. With this misconfiguration it is easy enough to cause a DoS.

fede843 avatar Oct 01 '24 10:10 fede843

Which instrumentation are you using? The webserver module or nginx instrumentation?

seemk avatar Dec 04 '24 11:12 seemk

We were using the web server. Had to migrate to something else due to this.

fede843 avatar Dec 04 '24 23:12 fede843

Same.

skowront avatar Dec 09 '24 09:12 skowront