http-server-close and httpclose have no effect on HAProxy as forward proxy
Detailed Description of the Problem
I'm using HAProxy in HTTP mode as a forward proxy, to load-balance requests across 10 different upstream HTTP proxies. When a request succeeds, HAProxy keeps forwarding the connection through the same proxy, ignoring load balancing, http-server-close, and httpclose.
This does not happen if the connection to the selected proxy fails.
Expected Behavior
When using `option http-server-close` or `option httpclose`, I expect HAProxy to forward each request on the same client connection to a different upstream proxy in my backend.
Steps to Reproduce the Behavior
- Use the default HAProxy configuration.
- Add `option http-server-close` wherever applicable.
- Add 3 proxies to the backend (2 not working and 1 working, in that order) with `verify none` (mitmproxy will be fine here).
- Add `balance roundrobin` to the backend.
- Use HAProxy as an HTTP proxy in Firefox (or another browser), or use curl with `--next` to send multiple requests on the same connection (add `-k` for insecure if using mitmproxy for the test):
```
curl --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io \
  --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io \
  --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io \
  --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io \
  --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io \
  --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io \
  --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io \
  --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io \
  --next --proxy "http://127.0.0.1:5000" -k https://ip.oxylabs.io
```
You will notice that the first 2 requests fail (because backend proxies 1 and 2 are invalid) and every following request succeeds, because HAProxy for some reason keeps using the same proxy (the one out of three that actually works).
```
curl: (56) CONNECT tunnel failed, response 503
curl: (56) CONNECT tunnel failed, response 503
xxx.150.128.65 xxx.150.128.65 xxx.150.128.65 xxx.150.128.65 xxx.150.128.65 xxx.150.128.65 xxx.150.128.65
```
This matches the haproxy logs too:

```
127.0.0.1:35368 [06/Jul/2024:03:23:18.138] main app/app3 0/0/-1/-1/3006 503 217 - - SC-- 2/2/1/0/3 0/0 "CONNECT ip.oxylabs.io:443 HTTP/1.1"
127.0.0.1:35376 [06/Jul/2024:03:23:21.146] main app/app4 0/0/-1/-1/3005 503 217 - - SC-- 2/2/1/0/3 0/0 "CONNECT ip.oxylabs.io:443 HTTP/1.1"
127.0.0.1:35388 [06/Jul/2024:03:23:24.153] main app/app1 0/0/0/38/326 200 2861 - - ---- 2/2/1/0/0 0/0 "CONNECT ip.oxylabs.io:443 HTTP/1.1"
```
Do you have any idea what may have caused this?
No response
Do you have an idea how to solve the issue?
The only way for me to solve this is by closing the connection to HAProxy after every request, but I would like to keep the client connection alive and use http-server-close to close the connection to the upstream proxy after every request.
I have tried HAProxy versions from 1.9 to 3.1 with no success.
What is your configuration?
```
#---------------------------------------------------------------------
# Example configuration. See the full configuration manual online.
#
# http://www.haproxy.org/download/2.5/doc/configuration.txt
#
#---------------------------------------------------------------------
global
    maxconn 20000
    log 127.0.0.1 local0
    user haproxy
    chroot /usr/share/haproxy
    pidfile /run/haproxy.pid
    daemon

frontend main
    bind :5000
    mode http
    log global
    option httplog
    option dontlognull
    option forwardfor except 127.0.0.0/8
    maxconn 8000
    timeout client 30s
    default_backend app

backend app
    mode http
    balance roundrobin
    timeout connect 5s
    timeout server 30s
    timeout queue 30s
    option http-server-close
    server app3 127.0.0.1:5003 verify none
    server app4 127.0.0.1:5004 verify none
    server mitmproxy 127.0.0.1:8080 verify none
```
Output of haproxy -vv
```
HAProxy version 3.0.2 2024/06/14 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2029.
Known bugs: http://www.haproxy.org/bugs/bugs-3.0.2.html
Running on: Linux 6.6.34-1-MANJARO #1 SMP PREEMPT_DYNAMIC Wed Jun 19 19:00:06 UTC 2024 x86_64
Build options :
  TARGET  = linux-glibc
  CC      = cc
  CFLAGS  = -O2 -g -fwrapv -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=3 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -g -ffile-prefix-map=/build/haproxy/src=/usr/src/debug/haproxy -flto=auto -fwrapv
  OPTIONS = USE_GETADDRINFO=1 USE_OPENSSL=1 USE_LUA=1 USE_ZLIB=1 USE_SYSTEMD=1 USE_PROMEX=1 USE_PCRE2=1 USE_PCRE2_JIT=1
  DEBUG   =

Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE -LIBATOMIC +LIBCRYPT +LINUX_CAP +LINUX_SPLICE +LINUX_TPROXY +LUA +MATH -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OPENSSL_AWSLC -OPENSSL_WOLFSSL -OT -PCRE +PCRE2 +PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL +PROMEX -PTHREAD_EMULATION -QUIC -QUIC_OPENSSL_COMPAT +RT +SHM_OPEN -SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 +SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL +ZLIB

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=32).
Built with OpenSSL version : OpenSSL 3.3.1 4 Jun 2024
Running on OpenSSL version : OpenSSL 3.3.1 4 Jun 2024
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
OpenSSL providers loaded : default
Built with Lua version : Lua 5.4.6
Built with the Prometheus exporter as a service
Built with network namespace support.
Built with zlib version : 1.3.1
Running on zlib version : 1.3.1
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.44 2024-06-07
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with gcc compiler version 14.1.1 20240522

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
         h2 : mode=HTTP  side=FE|BE  mux=H2    flags=HTX|HOL_RISK|NO_UPG
         h1 : mode=HTTP  side=FE|BE  mux=H1    flags=HTX|NO_UPG
  <default> : mode=HTTP  side=FE|BE  mux=H1    flags=HTX
       fcgi : mode=HTTP  side=BE     mux=FCGI  flags=HTX|HOL_RISK|NO_UPG
       none : mode=TCP   side=FE|BE  mux=PASS  flags=NO_UPG
  <default> : mode=TCP   side=FE|BE  mux=PASS  flags=

Available services : prometheus-exporter

Available filters :
        [BWLIM] bwlim-in
        [BWLIM] bwlim-out
        [CACHE] cache
        [COMP] compression
        [FCGI] fcgi-app
        [SPOE] spoe
        [TRACE] trace
```
Last Outputs and Backtraces
No response
Additional Information
No response
Here, the tunnel is established only once. You are using the same proxy, so on the curl side only one connection is established and only one CONNECT is performed. On the HAProxy side, once the tunnel is successfully established (when the 200 OK is received from the forward proxy), the connection is switched from HTTP to raw TCP. At this stage, HTTP options are not applicable anymore. The tunnel is kept open until the client or the server closes it. HAProxy is not aware of the requests and responses exchanged inside it; it acts as a TCP proxy. With your example, excluding the 503s, you will have only one line in your logs.
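To illustrate, the on-wire exchange looks roughly like this (a simplified sketch; headers abbreviated):

```
CONNECT ip.oxylabs.io:443 HTTP/1.1     <- the only request HAProxy ever parses
Host: ip.oxylabs.io:443

HTTP/1.1 200 OK                        <- from the upstream proxy; tunnel is up

<from here on, opaque TLS bytes flow in both directions;
 every subsequent HTTPS request travels inside this one tunnel>
```

This is why only one log line and one balancing decision exist for the whole sequence of requests.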
So, here there is no other solution than closing the connection between each request on the client side.
Or maybe you didn't intend to use CONNECT in the first place? CONNECT is solely used to establish a tunnel (i.e. to make sure the proxy doesn't see what you pass through it). I don't know if this is what you were looking for, but in a sense you should see this as a VPN: the VPN gateway on your network cannot guess where one request ends and a new one starts. Here it's the same.
Ideally I was hoping I could make haproxy just forward the request, instead of establishing a raw TCP connection for the client (curl etc.). But I don't think that's possible? Then I could keep the connection between the client and haproxy alive, but balance the HTTP requests among the backend servers.
It's possible to forward the request to haproxy, but then you must not use CONNECT. CONNECT specifically asks for a tunnel meant to hide requests. I'm not sure exactly what you're trying to do. If you're just trying to use haproxy as a forward proxy, just use it this way without tunnels. You'll need some configuration so that haproxy resolves host names to IP addresses (see the do_resolve() action), like any proxy would do, unless of course you chain haproxy directly to another forward proxy. But what you need to avoid, for haproxy to be able to see the requests, is making a tunnel through it.
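For illustration, a minimal plain-HTTP forward-proxy sketch using do-resolve could look like the following (the resolver name `mydns`, the variable `txn.dstip`, and the addresses are made up for the example; this part is only needed when haproxy connects to origin servers itself rather than chaining to upstream proxies, which resolve names on their own):

```
resolvers mydns
    nameserver dns1 127.0.0.53:53

frontend fwd
    mode http
    bind :3128
    # resolve the Host header to an IPv4 address, store it in a variable
    http-request do-resolve(txn.dstip,mydns,ipv4) hdr(host),host_only
    default_backend out

backend out
    mode http
    # connect to whatever address was resolved for this request
    http-request set-dst var(txn.dstip)
    server clear 0.0.0.0:0
```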
So let's say I'm using curl. You're proposing that instead of doing

```
curl --proxy "http://haproxy:port" https://somedomain.com/path/to/something
```

I could do

```
curl https://haproxy:port/somedomain.com/path/to/something
```
And then haproxy could take the http GET request and forward it through one of my upstream (http) proxies in the backend? What would the haproxy configuration look like to do something like that?
Currently I have something like this:
```
backend app
    mode http
    balance roundrobin
    timeout connect 5s
    timeout server 30s
    timeout queue 30s
    option http-server-close
    server app3 127.0.0.1:5003 verify none
    server app4 127.0.0.1:5004 verify none
    server mitmproxy 127.0.0.1:8080 verify none
```
These are all regular http proxies.
Thanks
Sorry, I did not notice you were doing HTTPS, since you were asking for haproxy to "see" the delimitation between requests and responses. Unfortunately, despite some of us having advocated for the "GET https://" approach, as we used to call it a while ago, no such standard exists, and right now the only way to transport HTTPS over a proxy is the CONNECT method, which creates a totally opaque tunnel.
So, because you're using HTTPS, you're doomed. The sole purpose of HTTPS is precisely to make sure that no component between your client and the target server sees or knows what you're doing. That includes the proxies involved in the transfer. HTTPS means trusting nobody in the chain, not even the proxies. There's no notion of request or response there; it's an opaque stream of encrypted bytes being transported over haproxy.
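By contrast, plain `http://` URLs involve no CONNECT and no tunnel: each request reaches haproxy in full, so roundrobin and http-server-close apply per request. A sketch of what haproxy would see for each plain-HTTP request (hypothetical host, headers abbreviated):

```
GET http://example.com/ HTTP/1.1       <- fully visible to haproxy,
Host: example.com                         balanced as an individual request

HTTP/1.1 200 OK                        <- response also visible; the server-side
...                                       connection can then be closed per request
```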
Hello! Since nothing else is expected at this point, I'm closing now.