http-server-close and httpclose has no effect on HAProxy as forward proxy

Open victornor opened this issue 1 year ago • 6 comments

Detailed Description of the Problem

I'm using HAProxy in http mode as a forward proxy, to load-balance requests across 10 different upstream http proxies. When a request succeeds, HAProxy keeps forwarding every later request on that connection through the same proxy, ignoring load balancing, http-server-close and httpclose.

This does not happen if the connection to the selected proxy fails.

Expected Behavior

When using option http-server-close or option httpclose, I expect HAProxy to forward each request on the same client connection to a different upstream proxy in my backend.

Steps to Reproduce the Behavior

  1. Use the default haproxy configuration.
  2. Add option http-server-close wherever applicable.
  3. Add 3 proxies to the backend (2 not working and 1 working in that order) with verify none. (mitmproxy will be fine here)
  4. Add roundrobin balancing to the backend
  5. Use haproxy as http proxy in firefox (or another browser) or use curl with --next to send multiple requests on the same connection (add -k for insecure if using mitmproxy as test).

curl --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io

You will notice that the first 2 requests fail (because backend proxies 1 and 2 are invalid) and every following request succeeds, because HAProxy for some reason continues to use the same proxy (the only one of the 3 that is actually working).

curl: (56) CONNECT tunnel failed, response 503
curl: (56) CONNECT tunnel failed, response 503
xxx.150.128.65
xxx.150.128.65
xxx.150.128.65
xxx.150.128.65
xxx.150.128.65
xxx.150.128.65
xxx.150.128.65

This matches the haproxy logs too:

127.0.0.1:35368 [06/Jul/2024:03:23:18.138] main app/app3 0/0/-1/-1/3006 503 217 - - SC-- 2/2/1/0/3 0/0 "CONNECT ip.oxylabs.io:443 HTTP/1.1"
127.0.0.1:35376 [06/Jul/2024:03:23:21.146] main app/app4 0/0/-1/-1/3005 503 217 - - SC-- 2/2/1/0/3 0/0 "CONNECT ip.oxylabs.io:443 HTTP/1.1"
127.0.0.1:35388 [06/Jul/2024:03:23:24.153] main app/app1 0/0/0/38/326 200 2861 - - ---- 2/2/1/0/0 0/0 "CONNECT ip.oxylabs.io:443 HTTP/1.1"
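
To see how requests were spread across servers, log lines like these can be tallied per server and status code. The following is only a sketch: the function name count_by_server is made up for illustration, and it assumes the layout shown above, where field 4 is "backend/server" and field 6 is the HTTP status. A syslog prefix in front of each line would shift those positions.

```shell
# Hypothetical helper: tally haproxy httplog lines per server and status.
# Assumption: field 4 is "backend/server" and field 6 is the status code,
# as in the log lines quoted in this issue (no syslog prefix).
count_by_server() {
    awk 'NF >= 6 { split($4, be, "/"); n[be[2] " " $6]++ }
         END { for (k in n) print k, n[k] }' "$@" | sort
}

# Example call (the log path is an assumption, adjust to your setup):
# count_by_server /var/log/haproxy.log
```

With a tunnel that stays open, only the CONNECT lines show up here, since the requests sent inside the tunnel are never logged individually.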

Do you have any idea what may have caused this?

No response

Do you have an idea how to solve the issue?

The only way for me to solve this is by closing the connection to HAProxy after every request, but I would like to be able to keep the client connection alive and use http-server-close to close the connection to the upstream proxy after every request.

I have tried HAProxy versions from 1.9 to 3.1 with no success.

What is your configuration?

#---------------------------------------------------------------------
# Example configuration.  See the full configuration manual online.
#
#   http://www.haproxy.org/download/2.5/doc/configuration.txt
#
#---------------------------------------------------------------------

global
    maxconn     20000
    log         127.0.0.1 local0
    user        haproxy
    chroot      /usr/share/haproxy
    pidfile     /run/haproxy.pid
    daemon

frontend  main
    bind :5000
    mode                 http
    log                  global
    option               httplog
    option               dontlognull
    option forwardfor    except 127.0.0.0/8
    maxconn              8000
    timeout              client  30s

    default_backend             app

backend app
    mode        http
    balance     roundrobin
    timeout     connect 5s
    timeout     server  30s
    timeout     queue   30s
    option http-server-close
 
    server  app3 127.0.0.1:5003 verify none
    server  app4 127.0.0.1:5004 verify none
    server  mitmproxy 127.0.0.1:8080 verify none

Output of haproxy -vv

HAProxy version 3.0.2 2024/06/14 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2029.
Known bugs: http://www.haproxy.org/bugs/bugs-3.0.2.html
Running on: Linux 6.6.34-1-MANJARO #1 SMP PREEMPT_DYNAMIC Wed Jun 19 19:00:06 UTC 2024 x86_64
Build options :
  TARGET  = linux-glibc
  CC      = cc
  CFLAGS  = -O2 -g -fwrapv -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=3 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -g -ffile-prefix-map=/build/haproxy/src=/usr/src/debug/haproxy -flto=auto -fwrapv
  OPTIONS = USE_GETADDRINFO=1 USE_OPENSSL=1 USE_LUA=1 USE_ZLIB=1 USE_SYSTEMD=1 USE_PROMEX=1 USE_PCRE2=1 USE_PCRE2_JIT=1
  DEBUG   = 

Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE -LIBATOMIC +LIBCRYPT +LINUX_CAP +LINUX_SPLICE +LINUX_TPROXY +LUA +MATH -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OPENSSL_AWSLC -OPENSSL_WOLFSSL -OT -PCRE +PCRE2 +PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL +PROMEX -PTHREAD_EMULATION -QUIC -QUIC_OPENSSL_COMPAT +RT +SHM_OPEN -SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 +SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL +ZLIB

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=32).
Built with OpenSSL version : OpenSSL 3.3.1 4 Jun 2024
Running on OpenSSL version : OpenSSL 3.3.1 4 Jun 2024
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
OpenSSL providers loaded : default
Built with Lua version : Lua 5.4.6
Built with the Prometheus exporter as a service
Built with network namespace support.
Built with zlib version : 1.3.1
Running on zlib version : 1.3.1
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.44 2024-06-07
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with gcc compiler version 14.1.1 20240522

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
         h2 : mode=HTTP  side=FE|BE  mux=H2    flags=HTX|HOL_RISK|NO_UPG
         h1 : mode=HTTP  side=FE|BE  mux=H1    flags=HTX|NO_UPG
  <default> : mode=HTTP  side=FE|BE  mux=H1    flags=HTX
       fcgi : mode=HTTP  side=BE     mux=FCGI  flags=HTX|HOL_RISK|NO_UPG
       none : mode=TCP   side=FE|BE  mux=PASS  flags=NO_UPG
  <default> : mode=TCP   side=FE|BE  mux=PASS  flags=

Available services : prometheus-exporter
Available filters :
	[BWLIM] bwlim-in
	[BWLIM] bwlim-out
	[CACHE] cache
	[COMP] compression
	[FCGI] fcgi-app
	[SPOE] spoe
	[TRACE] trace

Last Outputs and Backtraces

No response

Additional Information

No response

victornor avatar Jul 05 '24 22:07 victornor

Here, the tunnel will be established only once. You are using the same proxy, thus on the curl side, only one connection is established and only one CONNECT is performed. On the HAProxy side, once the tunnel is successfully established (when the 200-OK is received from the forward proxy), the connection is switched from HTTP to raw TCP. At this stage, HTTP options are no longer applicable. The tunnel will be kept open until the client or the server closes it. HAProxy will not be aware of the requests and responses exchanged. It will act as a TCP proxy. With your example, in your logs, excluding the 503s, you will have only one line.

So, here there is no other solution than closing the connection between each request on the client side.
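
A minimal sketch of that workaround, assuming the proxy address and test URL from the report: each request runs in its own curl process, so each one opens a new connection to HAProxy and sends its own CONNECT, which HAProxy can then balance. The function name fetch_on_new_connections and the CURL override variable are made up for illustration.

```shell
# Hypothetical helper: issue COUNT requests, each from a separate curl
# process, so every request uses a fresh connection and its own CONNECT.
# CURL may be overridden (e.g. for a dry run); it defaults to the real curl.
fetch_on_new_connections() {
    count="$1"; proxy="$2"; url="$3"
    i=1
    while [ "$i" -le "$count" ]; do
        "${CURL:-curl}" --silent --show-error --proxy "$proxy" -k "$url" \
            || echo "request $i failed" >&2
        i=$((i + 1))
    done
}

# Example call against the setup from this issue (not run here):
# fetch_on_new_connections 9 http://127.0.0.1:5000 https://ip.oxylabs.io
```

With the three-server backend from the report, requests that land on the two broken proxies would still be expected to fail with 503; the sketch only changes how often HAProxy gets to pick a server.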

capflam avatar Jul 08 '24 06:07 capflam

Or maybe you didn't intend to use CONNECT in the first place? CONNECT is solely used to establish a tunnel (i.e. to make sure the proxy doesn't see what you pass through it). I don't know if this is what you were looking for, but in a way you should see this as a VPN. You can easily imagine that the VPN gateway on your network cannot guess where one request ends and the next one starts. Here it's the same.

wtarreau avatar Jul 09 '24 06:07 wtarreau

Ideally I was hoping I could make haproxy just forward the request, instead of establishing a raw TCP connection to the client (curl etc.). But I don't think that's possible? That way I could keep the connection between client and haproxy alive, but balance the http requests amongst the backend servers.

victornor avatar Jul 09 '24 13:07 victornor

It's possible to forward the request to haproxy but then you must not use CONNECT. CONNECT specifically asks for a tunnel meant to hide requests. I'm not sure exactly what you're trying to do. If you're just trying to use haproxy as a forward proxy, just use it this way without tunnels. You'll need to use some configuration so that haproxy resolves host names to IP addresses (see the do_resolve() action) like any proxy would do, unless of course you chain haproxy directly to another forward proxy. But what you need to avoid for haproxy to see the requests, is to make a tunnel through it.
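
For plain http:// targets, the difference can be tried from the client side. The sketch below assumes the proxy address from the report and a placeholder target URL. With an http:// URL, curl sends a regular GET with the full URL to the proxy instead of CONNECT, so the proxy sees every request even when several are chained with --next on one connection. The function name fetch_plain_http and the CURL override variable are made up for illustration.

```shell
# Hypothetical helper: send COUNT plain-HTTP requests in one curl
# invocation, chained with --next as in the reproduction steps.
# No CONNECT is involved for http:// URLs, so no tunnel is created.
fetch_plain_http() {
    count="$1"; proxy="$2"; url="$3"
    # Rebuild the argument list: one "--proxy PROXY URL" group per request.
    set -- --proxy "$proxy" "$url"
    i=1
    while [ "$i" -lt "$count" ]; do
        set -- "$@" --next --proxy "$proxy" "$url"
        i=$((i + 1))
    done
    "${CURL:-curl}" "$@"
}

# Example call (the target URL is a placeholder, not run here):
# fetch_plain_http 3 http://127.0.0.1:5000 http://example.com/
```

In this mode option http-server-close should apply per request, since HAProxy stays in HTTP mode for the whole connection. This does not carry over to https:// URLs, where curl falls back to CONNECT.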

wtarreau avatar Jul 09 '24 14:07 wtarreau

So let's say I'm using curl. You're proposing that instead of doing curl --proxy "http://haproxy:port" https://somedomain.com/path/to/something I could do curl https://haproxy:port/somedomain.com/path/to/something?

And then haproxy could take the http GET request and forward it through one of my upstream (http) proxies in the backend? What would the haproxy configuration look like to do something like that?

Currently I have something like this:

backend app
    mode        http
    balance     roundrobin
    timeout     connect 5s
    timeout     server  30s
    timeout     queue   30s
    option http-server-close

    server  app3 127.0.0.1:5003 verify none
    server  app4 127.0.0.1:5004 verify none
    server  mitmproxy 127.0.0.1:8080 verify none

These are all regular http proxies.

Thanks

victornor avatar Jul 09 '24 15:07 victornor

Sorry, I did not notice you were doing HTTPS, since you were asking for haproxy to "see" the delimitation between requests and responses. Unfortunately, although some of us have long advocated for the "GET https://" approach, as we used to call it a while ago, there's no such standard, and right now the only way to transport https over a proxy is the CONNECT method, which creates a totally opaque tunnel.

So due to using https, you're doomed. The sole purpose of HTTPS is precisely to make sure that no other component between your client and the target server sees or knows what you're doing. That includes the proxies involved in the transfer. HTTPS means trusting nobody in the chain, not even the proxies. There's no notion of request or response there; it's an opaque stream of encrypted bytes that is being transported over haproxy.

wtarreau avatar Jul 10 '24 08:07 wtarreau

Hello! Since nothing else is expected at this point, I'm closing now.

wtarreau avatar Sep 05 '24 15:09 wtarreau