haproxy icon indicating copy to clipboard operation
haproxy copied to clipboard

HAProxy / Logs missing for interrupted client connections to the backend when HTTP2 is used on frontend

Open amsnek opened this issue 2 years ago • 7 comments

Detailed Description of the Problem

Testing with network inspection/interception showed that for intercepted (rejected/interrutped) connections by a network component between HAProxy and the backend Server -> HAProxy did not produce a Log when the Client (curl/chrome) used HTTP2 on the Frontend. When the H2 was disabled on the frontend -> the expected log was produced (502 / SH---). Running HAProxy in debug mode yielded no indication for that behavior. Since the only difference is H1/H2 and it should be all HTX internally, I suspect a bug?

no Log line when client terminates on frontend using http2:

curl --http2 --http2-prior-knowledge "https://test.domain.tld/test.asp?xss=%3Cscript%3Ealert("testing_curl")%3C/script%3E"
00000015:FE-TESTING.accept(0006)=000e from [10.10.10.10:49446] ALPN=h2
00000015:FE-TESTING.clireq[000e:ffffffff]: GET https://test.domain.tld/test.asp?xss=%3Cscript%3Ealert(testing_curl)%3C/script%3E HTTP/2.0
00000015:FE-TESTING.clihdr[000e:ffffffff]: host: test.domain.tld
00000015:FE-TESTING.clihdr[000e:ffffffff]: user-agent: curl/7.81.0
00000015:FE-TESTING.clihdr[000e:ffffffff]: accept: */*
00000015:BE-TESTING.srvcls[000e:000f]
00000015:BE-TESTING.clicls[000e:000f]
00000015:BE-TESTING.closed[000e:000f]
... repeated by curl 
curl: (56) Connection died, tried 5 times before giving up

testing with a browser (Chrome) -> ERR_HTTP2_SERVER_REFUSED_STREAM

howerver the expected log line appears when the client uses http/1.1 on the frontend: expected Log Line when using http/1.1

curl  "https://test.domain.tld/test.asp?xss=%3Cscript%3Ealert("testing_curl")%3C/script%3E"
00000000:FE-TESTING.accept(0004)=000b from [10.10.10.10:56692] ALPN=http/1.1
00000000:FE-TESTING.clireq[000b:ffffffff]: GET /test.asp?xss=%3Cscript%3Ealert(testing_curl)%3C/script%3E HTTP/1.1
00000000:FE-TESTING.clihdr[000b:ffffffff]: host: test.domain.tld
00000000:FE-TESTING.clihdr[000b:ffffffff]: user-agent: curl/8.0.1
00000000:FE-TESTING.clihdr[000b:ffffffff]: accept: */*
00000000:BE-TESTING.srvcls[000b:000c]
00000000:BE-TESTING.clicls[000b:000c]
00000000:BE-TESTING.closed[000b:000c]
10.10.10.10:56692 [08/Sep/2023:16:39:09.460] FE-TESTING~ BE-TESTING/test 0/0/6/-1/7 502 209 - - SH-- 1/1/0/0/0 0/0 "GET /test.asp?xss=%3Cscript%3Ealert(testing_curl)%3C/script%3E HTTP/1.1" 0/0000000000000000/0/0/0 test.domain.tld/TLSv1.2/ECDHE-ECDSA-AES128-GCM-SHA256

Expected Behavior

HAProxy logging with 502/SH-- or similiar when the connection to the backend is rejected/terminated independent of HTTP version used on the frontend.

Steps to Reproduce the Behavior

without an easy way to reproduce the interception component in between frontend and backend (firewall, WAF or similar), the setup cannot be easily reproduced -> will try to reproduce this behaviour with a less complicated setup.

Do you have any idea what may have caused this?

it seems using H2 on the frontend causes this behaviour, since it does not occur when the client terminates with H1 on the frontend (disabling h2 on frontend)

Do you have an idea how to solve the issue?

No response

What is your configuration?

# interception
global
        log 127.0.0.1 len 65535 local0 info
        chroot          /var/lib/haproxy
        pidfile         /var/run/haproxy.pid
        maxconn         80000
        daemon
        user            haproxy
        group           haproxy
        stats socket /var/run/haproxy.sock  mode 600 level admin expose-fd listeners
        stats timeout 30s
        ssl-default-bind-ciphersuites TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256
        ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets
        ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
        ssl-default-server-ciphersuites TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256
        ssl-default-server-options ssl-min-ver TLSv1.2 no-tls-tickets
        ssl-default-server-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
        ssl-mode-async
        tune.bufsize 131072
        tune.maxrewrite 65536
        tune.http.logurilen 2048

defaults
        mode            http
        log             global
        balance         leastconn
        option          redispatch

        timeout connect         10s
        timeout http-request    10s
        timeout http-keep-alive 30s
        timeout check           5s
        timeout queue           1m
        timeout server          1m
        timeout client          1m
        timeout client-fin      30s

frontend FE-TESTING
	# no log with h2
        bind 10.100.0.1:443 ssl crt /data/ssl/test.pem alpn h2,http/1.1 strict-sni
	# logging works as expected without h2
        #bind 10.100.0.1:443 ssl crt /data/ssl/test.pem alpn http/1.1 strict-sni
        mode http
        option forwardfor
        option http-keep-alive
        tcp-request inspect-delay 5s
        redirect scheme https if !{ ssl_fc }
        default_backend BE-TESTING

backend BE-TESTING
        mode http
        option http-keep-alive
        option abortonclose
        http-reuse safe
        server test 10.200.0.10:443 ssl verify none

Output of haproxy -vv

HAProxy version 2.6.15-446b02c 2023/08/09 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2027.
Known bugs: http://www.haproxy.org/bugs/bugs-2.6.15.html
Running on: Linux 4.18.0-477.21.1.el8_8.x86_64 #1 SMP Thu Jul 20 08:38:27 EDT 2023 x86_64
Build options :
  TARGET  = linux-glibc
  CPU     = generic
  CC      = cc
  CFLAGS  = -O2 -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment -fstack-protector-strong -D_FORTIFY_SOURCE=2 -Werror=format-security
  OPTIONS = USE_PCRE2=1 USE_PCRE2_JIT=1 USE_STATIC_PCRE2=1 USE_OPENSSL=1 USE_LUA=1 USE_TFO=1 USE_NS=1 USE_DEVICEATLAS=1 USE_SYSTEMD=1 USE_QUIC= USE_PROMEX=1
  DEBUG   = -DDEBUG_STRICT -DDEBUG_MEMORY_POOLS

Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H +DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE +LIBCRYPT +LINUX_SPLICE +LINUX_TPROXY +LUA -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OT -PCRE +PCRE2 +PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL +PROMEX -QUIC +RT +SLZ -STATIC_PCRE +STATIC_PCRE2 +SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=64, default=2).
Built with OpenSSL version : OpenSSL 3.0.10 1 Aug 2023
Running on OpenSSL version : OpenSSL 3.0.10 1 Aug 2023
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
OpenSSL providers loaded : default
Built with Lua version : Lua 5.4.6
Built with the Prometheus exporter as a service
Built with DeviceAtlas support (dummy library only).
Built with network namespace support.
Support for malloc_trim() is enabled.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.42 2022-12-11
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with gcc compiler version 8.5.0 20210514 (Red Hat 8.5.0-18)

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
         h2 : mode=HTTP  side=FE|BE  mux=H2    flags=HTX|HOL_RISK|NO_UPG
       fcgi : mode=HTTP  side=BE     mux=FCGI  flags=HTX|HOL_RISK|NO_UPG
  <default> : mode=HTTP  side=FE|BE  mux=H1    flags=HTX
         h1 : mode=HTTP  side=FE|BE  mux=H1    flags=HTX|NO_UPG
  <default> : mode=TCP   side=FE|BE  mux=PASS  flags=
       none : mode=TCP   side=FE|BE  mux=PASS  flags=NO_UPG

Available services : prometheus-exporter
Available filters :
        [CACHE] cache
        [COMP] compression
        [FCGI] fcgi-app
        [SPOE] spoe
        [TRACE] trace

Last Outputs and Backtraces

No response

Additional Information

No response

amsnek avatar Sep 08 '23 14:09 amsnek

First I was surprised by your report. But, I've tested and indeed there is no log. It is a side effect of an old behavior about Keep-Alive connection to HTTP servers that have an unexpected side effect with the H2 (unexpected for me at least).

With persistant connections to servers, it is not uncommon to try to reuse an idle connection for a request when the server decides to close it. If it is not the first request for the same client connection, the client is closed and no error is returned to let the client retries. No log is emitted here to not pollute the logs. The client must retries. Thus, at the stage, the client retries by opening a new connection. And if the same happens on server side, it is now the first request for this connection, so a 502 is returned and a log message is emitted. It is how it works since the age for HTTP/1.1 clients.

But, for H2 clients, when the server connection error is detected, it is not possible to know if it is the first request or not for the connection. H2 streams are multiplexed thus we consider there is no first request. We are always in the retriable case. In this case a RESET_STREAM frame is sent to the client with the REFUSED_STREAM error. In this case, the client retries, but with the same connection. Thus the same will be repeated in loop until the client gives up because there is still no first request. And when the client gives up, it is just a H2 connection closed with no stream alive. No log is emitted.

I must check with @wtarreau to see what we can do here. Everything works as expected but having no log is indeed annoying.

capflam avatar Oct 31 '23 10:10 capflam

Thanks for investigating this! Having no log in this Scenario makes tracking and Investigation such (failed) requests practically impossible without further elements in the chain (and it does work with H1).

amsnek avatar Nov 02 '23 07:11 amsnek

bumping in hope this issue doesnt get lost in the void 🙈

amsnek avatar Mar 28 '24 14:03 amsnek

I finally found some time to talk with Willy about this issue. In fact, I was wong about my explanations and there is a true bug. The log must only be omitted if the error is triggered for a reused server connection. The current sate is inherited from old versions where there was no connection sharing and true idle connection management.

I will try to emit a fix soon. But the idea is really to emit a 502 and a log message if a connection error is encountered on a fresh server connection and skip it it is happens on an reused one to let the client retries.

capflam avatar Apr 08 '24 16:04 capflam

The fix was pushed in 3.0-DEV. It will be backported as usual. Thanks !

capflam avatar Apr 10 '24 13:04 capflam

Most awesome! Thanks for all your efforts! Will close the issue then

amsnek avatar Apr 10 '24 15:04 amsnek

We will keep the issue open, waiting for all backports.

capflam avatar Apr 10 '24 15:04 capflam