haproxy icon indicating copy to clipboard operation
haproxy copied to clipboard

Haproxy failed to process some gRPC responses

Open nikola-sh opened this issue 1 year ago • 10 comments

Detailed Description of the Problem

Haproxy between gRPC client and gRPC server fails to process some server responses and finishes H2 session. A session with this problem was captured by tcpdump: see real.dump.zip Port 3990 - haproxy Port 3991 - gRPC server GOAWAY frame is unexpected.

Some debugging shows it was sent by this code from mux_h2.c:

if (b_full(&h2c->dbuf) && h2c->dfl) {
		/* too large frames */
		h2c_error(h2c, H2_ERR_INTERNAL_ERROR);
		ret = -1;
	}

Expected Behavior

Response from gRPC server seems correct and should be handled by Haproxy.

Steps to Reproduce the Behavior

I tried to reproduce a problem by vtest. I didn't get exact behavior as in the dump but error from mux_h2.c

        if (b_full(&h2c->dbuf) && h2c->dfl) {
		/* too large frames */
		h2c_error(h2c, H2_ERR_INTERNAL_ERROR);
		ret = -1;
	}

was got.

varnishtest "A test to reveal h2 buffer bug"

feature ignore_unknown_macro

server s1 {
    rxpri
    
    stream 0 {
		txsettings
		rxsettings
		txsettings -ack
		rxsettings
		expect settings.ack == true
	} -run
    
    stream 1 {
        rxreq       
        txresp -noadd -idxHdr 8 -litHdr inc plain "content-type" plain "application/grpc" -litHdr inc plain "grpc-accept-encoding" plain "identity, deflate, gzip" -nostrend
        txdata -datalen 226 -nostrend
        txresp -noadd -litHdr inc plain "grpc-status" plain "0"
    } -run
    
    stream 3 {
        rxreq       
        txresp -noadd -idxHdr 8 -idxHdr 64 -idxHdr 63 -nostrend
        txdata -datalen 15964 -nostrend
        txresp -noadd -idxHdr 62
    } -run
    
    stream 5 {
        rxreq       
        txresp -noadd -idxHdr 8 -idxHdr 64 -idxHdr 63 -nostrend
        txdata -datalen 16384 -nostrend
        txresp -noadd -idxHdr 62
    } -run
} -start

haproxy h1 -conf {
    global
        log stderr local0 debug

    defaults
        log global
        mode http
        option httplog
        maxconn 1000
        timeout connect 5s
        timeout client  5s
        timeout server  5s

    frontend fe
        bind "fd@${fe}" proto h2
        default_backend be_servers

    backend be_servers
        server srv1 ${s1_addr}:${s1_port} proto h2
} -start

client c1 -connect ${h1_fe_sock} {
    txpri
    
    stream 0 {
		txsettings
		rxsettings
		txsettings -ack
		rxsettings
		expect settings.ack == true
	} -run
    
    stream 1 {
        txreq -method POST -url "/1111"
        
        rxhdrs 
        expect resp.status == 200
        expect resp.http.content-type == "application/grpc"
        expect resp.http.grpc-accept-encoding == "identity, deflate, gzip"
        
        rxdata -some 1
        
        rxhdrs
        expect resp.http.grpc-status == 0
    } -run
      
    stream 3 {
        txreq -method POST -url "/2222"
    
        rxhdrs 
        expect resp.status == 200
        expect resp.http.content-type == "application/grpc"
        expect resp.http.grpc-accept-encoding == "identity, deflate, gzip"
        
        rxdata -some 2
        
        rxhdrs
        expect resp.http.grpc-status == 0
    } -start
    
    stream 5 {
        txreq -method POST -url "/3333"
    
        rxhdrs 
        expect resp.status == 200
        expect resp.http.content-type == "application/grpc"
        expect resp.http.grpc-accept-encoding == "identity, deflate, gzip"
        
        rxdata -some 2
        
        rxhdrs
        expect resp.http.grpc-status == 0
    } -start
} -run

Output is here: vtest_output.txt

Do you have any idea what may have caused this?

No response

Do you have an idea how to solve the issue?

No response

What is your configuration?

Linux on WSL2 (Windows subsystem for Windows).

Output of haproxy -vv

HAProxy version 3.0-dev0 2023/12/05 - https://haproxy.org/
Status: development branch - not safe for use in production.
Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open
Running on: Linux 5.15.133.1-microsoft-standard-WSL2 #1 SMP Thu Oct 5 21:02:42 UTC 2023 x86_64
Build options :
  TARGET  = linux-glibc
  CPU     = generic
  CC      = cc
  CFLAGS  = -O2 -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment
  OPTIONS = USE_OPENSSL= USE_LUA= USE_SYSTEMD= USE_PCRE=
  DEBUG   = -DDEBUG_STRICT -DDEBUG_MEMORY_POOLS

Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE -LIBATOMIC +LIBCRYPT +LINUX_CAP +LINUX_SPLICE +LINUX_TPROXY -LUA -MATH -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER -OPENSSL -OPENSSL_AWSLC -OPENSSL_WOLFSSL -OT -PCRE -PCRE2 -PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL -PROMEX -PTHREAD_EMULATION -QUIC -QUIC_OPENSSL_COMPAT +RT +SHM_OPEN +SLZ -SSL -STATIC_PCRE -STATIC_PCRE2 -SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=8).
Built with network namespace support.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built without PCRE or PCRE2 support (using libc's regex instead)
Encrypted password support via crypt(3): yes
Built with gcc compiler version 11.4.0

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
         h2 : mode=HTTP  side=FE|BE  mux=H2    flags=HTX|HOL_RISK|NO_UPG
       fcgi : mode=HTTP  side=BE     mux=FCGI  flags=HTX|HOL_RISK|NO_UPG
  <default> : mode=HTTP  side=FE|BE  mux=H1    flags=HTX
         h1 : mode=HTTP  side=FE|BE  mux=H1    flags=HTX|NO_UPG
  <default> : mode=TCP   side=FE|BE  mux=PASS  flags=
       none : mode=TCP   side=FE|BE  mux=PASS  flags=NO_UPG

Available services : none

Available filters :
        [BWLIM] bwlim-in
        [BWLIM] bwlim-out
        [CACHE] cache
        [COMP] compression
        [FCGI] fcgi-app
        [SPOE] spoe
        [TRACE] trace

Last Outputs and Backtraces

No response

Additional Information

No response

nikola-sh avatar Dec 12 '23 14:12 nikola-sh

Indeed, there is an issue with the trailers parsing when some data are still present in the rxbuf. I suspect all stable versions are affected.

capflam avatar Dec 12 '23 16:12 capflam

I pushed a fix but I referenced the wrong issue in the commit message. Here is the fix: 682f73b4fa6d76aa0b5b743fe92777822884772d

capflam avatar Dec 13 '23 15:12 capflam

We will emit a 2.9.1 tomorrow. Have you any way to confirm the above patch fixes your issue by testing the 3.0-DEV ?

capflam avatar Dec 14 '23 09:12 capflam

Hello! Thank you for the fix. It resolves my issue.

nikola-sh avatar Dec 14 '23 11:12 nikola-sh

Thanks for the feedback.

capflam avatar Dec 14 '23 11:12 capflam

@capflam I don't know if it is related to the current issue or not but i get strange behavior with trailing headers:

varnishtest "A test to reveal h2 buffer bug"

feature ignore_unknown_macro

server s1 {
    rxpri
    
    stream 0 {
        txsettings
        rxsettings
        txsettings -ack
        rxsettings
        expect settings.ack == true
    } -run
    
    stream 1 {
        rxreq       
        txresp -nostrend
        txdata -datalen 16384 -nostrend
        txresp -noadd -litHdr inc plain "grpc-status" plain "0"
    } -run
} -start

haproxy h1 -conf {
    global
        log stderr local0 debug

    defaults
        log global
        mode http
        option httplog
        maxconn 1000
        timeout connect 5s
        timeout client  5s
        timeout server  5s

    frontend fe
        bind "fd@${fe}" proto h2
        default_backend be_servers

    backend be_servers
        server srv1 ${s1_addr}:${s1_port} proto h2
} -start

client c1 -connect ${h1_fe_sock} {
    txpri
    
    stream 0 {
        txsettings
        rxsettings
        txsettings -ack
        rxsettings
        expect settings.ack == true
    } -run
    
    stream 1 {
        txreq -method POST -url "/"
        rxhdrs 
        expect resp.status == 200
        rxdata -some 2
        rxhdrs
        expect resp.http.grpc-status == 0
    } -run
} -run

Trailing header is missing, but there is DATA frame with size 0 and END_STREAM flag

***  c1    rx: stream: 1, type: DATA (0), flags: 0x00, size: 15256
***  c1    rx: stream: 1, type: DATA (0), flags: 0x00, size: 1072
**   c1.1  === rxhdrs
***  c1    rx: stream: 1, type: DATA (0), flags: 0x01, size: 0
***  c1    flag: END_STREAM
**** c1.1  s1 - no data

nikola-sh avatar Dec 14 '23 12:12 nikola-sh

Just to be sure, it is an issue with your vtest script or something you observed with regular clients and servers on your platform ?

But it is not related with the trailers and my fix. What happens here is that the server closes quickly the connection after sending the response and HAProxy detects a connection error trying to send a WINDOW_UPDATE frame to the server. In this case, the response processing is aborted. I don't know if it is expected. The last 56 bytes for the payload and the trailers was received and wait to be parsed. I must check with Willy. However it is not a 2.9 regression.

Adding a barrier to sync the client and the server solves the issue with your vtest. And you should wait for 3 DATA frames on client side:

varnishtest "A test to reveal h2 buffer bug"

feature ignore_unknown_macro

barrier b1 cond 2 -cyclic

server s1 {
    rxpri

    stream 0 {
        txsettings
        rxsettings
        txsettings -ack
        rxsettings
        expect settings.ack == true
    } -run

    stream 1 {
        rxreq
        txresp -nostrend
        txdata -datalen 16384 -nostrend
        txresp -noadd -litHdr inc plain "grpc-status" plain "0"
    } -run
    barrier b1 sync
} -start

haproxy h1 -conf {
    global
        log stderr local0 debug
    defaults
        log global
        mode http
        option httplog
        maxconn 1000
        timeout connect 5s
        timeout client  5s
        timeout server  5s

    frontend fe
        bind "fd@${fe}" proto h2
        default_backend be_servers

    backend be_servers
        server srv1 ${s1_addr}:${s1_port} proto h2
} -start

client c1 -connect ${h1_fe_sock} {
    txpri

    stream 0 {
        txsettings
        rxsettings
        txsettings -ack
        rxsettings
        expect settings.ack == true
    } -run

    stream 1 {
        txreq -method POST -url "/"
        rxhdrs
        expect resp.status == 200
        rxdata -some 3
        rxhdrs
        expect resp.http.grpc-status == 0
    } -run
    barrier b1 sync
} -run

capflam avatar Dec 14 '23 13:12 capflam

Just to be sure, it is an issue with your vtest script or something you observed with regular clients and servers on your platform ?

There are no real incidents in production.

I must check with Willy

Should I create separate github issue for this problem?

nikola-sh avatar Dec 14 '23 13:12 nikola-sh

Should I create separate github issue for this problem?

Well, for now, no. Just wait Willy's answer on the subject. I will let you know. Thanks again. In the mean time, I will still emit a 2.9.1. It is a quite old problem, no reason to delay the release, especially because it may be quite hard to fix.

capflam avatar Dec 14 '23 13:12 capflam

I merged a series that should fix your last issue. It will not be backported immediately to be sure they don't introduced regressions. And probably only to 2.9.

capflam avatar Dec 19 '23 06:12 capflam