
🐛 QUIC-run tunnel hanging requests randomly

Open morpig opened this issue 1 year ago • 7 comments

Running cloudflared version 2023.8.2 (built 2023-08-31-1506 UTC) on both client and server.

Tunnel ID: a9ee974a-f873-44c0-9b7a-2fb276b0d775. The tunnel uses QUIC, with no firewalls/limiters/DPIs in the path. The UDP buffer has been increased with sysctl.
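For anyone comparing setups, here is a minimal sketch for checking the UDP buffer limits on a Linux host. It assumes the relevant keys are `net.core.rmem_max` and `net.core.wmem_max` (the post does not say which sysctl keys were changed). The 2,500,000-byte default threshold is only an illustrative value, not a confirmed requirement of cloudflared. The helper names are hypothetical.

```python
from pathlib import Path

# Assumption: these are the sysctl keys that were raised; the post does not name them.
KEYS = ("net.core.rmem_max", "net.core.wmem_max")


def read_sysctl(key: str, root: str = "/proc/sys") -> int:
    """Read an integer sysctl value from procfs, e.g. net.core.rmem_max."""
    path = Path(root, *key.split("."))
    return int(path.read_text().split()[0])


def check_udp_buffers(min_bytes: int = 2_500_000, root: str = "/proc/sys") -> dict:
    """Return {key: (value, meets_threshold)} for each buffer limit.

    min_bytes is an illustrative threshold, not an official cloudflared value.
    """
    result = {}
    for key in KEYS:
        value = read_sysctl(key, root)
        result[key] = (value, value >= min_bytes)
    return result
```

Calling `check_udp_buffers()` on the host and again inside the container or pod would show whether both environments see the same limits.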

cloudflared acts as a TCP proxy, and requests suddenly hang or time out.

I don't know how to reproduce it, as it happens randomly.

If I hit the tunnel directly (without the cloudflared client), the response hangs. The expected response is a blank 200 OK:

root@jkt02-ctn:~# curl https://<tunneldomain> -v
*   Trying 2606:4700::6810:c723:443...
* TCP_NODELAY set
* Connected to <tunneldomain> (2606:4700::6810:c723) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  start date: Sep  8 08:32:36 2023 GMT
*  expire date: Dec  7 08:32:35 2023 GMT
*  issuer: C=US; O=Google Trust Services LLC; CN=GTS CA 1P5
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x558354553300)
> GET / HTTP/2
> Host: <tunneldomain>
> user-agent: curl/7.68.0
> accept: */*
> 
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
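Since the hang is intermittent, a small probe that records when requests time out may help correlate hangs with the tunnel logs. The following is a rough sketch, not a confirmed reproduction. It uses Python's `urllib`, which speaks HTTP/1.1 rather than the HTTP/2 shown in the curl trace, so it assumes the hang is visible over HTTP/1.1 as well. The function names `probe` and `watch` and the 10-second timeout are hypothetical choices.

```python
import socket
import time
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> tuple:
    """Issue one GET and classify it as ('ok', status), ('hang', None) or ('error', detail)."""
    try:
        with opener(url, timeout=timeout) as resp:
            return ("ok", resp.status)
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return ("error", exc.code)
    except (socket.timeout, TimeoutError):
        # No response within the timeout: this is the "hang" case.
        return ("hang", None)
    except urllib.error.URLError as exc:
        # A timeout while connecting or sending is wrapped in URLError.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return ("hang", None)
        return ("error", str(exc.reason))


def watch(url: str, interval: float = 30.0, rounds: int = 10, timeout: float = 10.0,
          opener=urllib.request.urlopen, sleep=time.sleep) -> list:
    """Probe repeatedly and return [(utc_timestamp, outcome)] for later comparison with logs."""
    results = []
    for _ in range(rounds):
        stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
        results.append((stamp, probe(url, timeout, opener)))
        sleep(interval)
    return results
```

For example, `watch("https://<tunneldomain>/", interval=30, rounds=120)` would probe for about an hour. The timestamps use the same UTC format as the cloudflared log lines, which makes the two easy to line up.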

No errors on the server side. Today is the 14th; the most recent log entries are from the 11th and 12th:

2023-09-11T23:24:42Z INF Unregistered tunnel connection connIndex=0 event=0 ip=2606:4700:a0::100
2023-09-11T23:24:42Z WRN Failed to serve quic connection error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=0 event=0 ip=2606:4700:a0::100
2023-09-11T23:24:42Z WRN Serve tunnel error error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=0 event=0 ip=2606:4700:a0::100
2023-09-11T23:24:42Z INF Retrying connection in up to 1s connIndex=0 event=0 ip=2606:4700:a0::100
2023-09-11T23:24:44Z WRN Connection terminated error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=0
2023-09-11T23:25:00Z INF Registered tunnel connection connIndex=0 connection=68299612-c33d-48a8-83a4-6336b7291034 event=0 ip=2606:4700:a0::100 location=fra11 protocol=quic
2023-09-12T02:25:46Z INF Unregistered tunnel connection connIndex=0 event=0 ip=2606:4700:a0::100
2023-09-12T02:25:46Z WRN Failed to serve quic connection error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=0 event=0 ip=2606:4700:a0::100
2023-09-12T02:25:46Z WRN Serve tunnel error error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=0 event=0 ip=2606:4700:a0::100
2023-09-12T02:25:46Z INF Retrying connection in up to 1s connIndex=0 event=0 ip=2606:4700:a0::100
2023-09-12T02:25:48Z WRN Connection terminated error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=0
2023-09-12T02:26:04Z INF Registered tunnel connection connIndex=0 connection=ae2d8e6f-8006-4f0f-bd44-d27dba090560 event=0 ip=2606:4700:a0::100 location=fra11 protocol=quic
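To see how long the tunnel connection was down in each case, the log can be reduced to Unregistered/Registered pairs. This is a small sketch that assumes the log line layout shown above. The `LOG` sample is abbreviated from those lines (trailing fields dropped), and the function name `reconnect_gaps` is hypothetical.

```python
import re
from datetime import datetime

# Abbreviated from the log above: only the register/unregister events, trailing fields dropped.
LOG = """\
2023-09-11T23:24:42Z INF Unregistered tunnel connection connIndex=0
2023-09-11T23:25:00Z INF Registered tunnel connection connIndex=0
2023-09-12T02:25:46Z INF Unregistered tunnel connection connIndex=0
2023-09-12T02:26:04Z INF Registered tunnel connection connIndex=0
"""

LINE = re.compile(
    r"^(\S+Z) \w+ (Unregistered|Registered) tunnel connection\b.*?\bconnIndex=(\d+)"
)


def reconnect_gaps(text: str) -> list:
    """Return [(conn_index, down_at, up_at, seconds_down)] for each Unregistered -> Registered pair."""
    down = {}  # connIndex -> time the connection was unregistered
    gaps = []
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue
        stamp = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%SZ")
        event, idx = m.group(2), int(m.group(3))
        if event == "Unregistered":
            down[idx] = stamp
        elif idx in down:
            start = down.pop(idx)
            gaps.append((idx, start, stamp, (stamp - start).total_seconds()))
    return gaps
```

On the sample, `reconnect_gaps(LOG)` reports two gaps of 18 seconds each on connIndex 0, about three hours apart. Both end well before the 14th, which matches the observation that nothing is logged around the time of the hang.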

Sep 14 '23 08:09 morpig

Expected curl response (running cloudflared tcp service):

> 
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
< HTTP/2 200 
< date: Thu, 14 Sep 2023 08:57:08 GMT
< content-length: 0
< cf-cache-status: DYNAMIC
< server: cloudflare
< cf-ray: 80676bae9c16357f-CGK
< 
* Connection #0 to host <tunneldomain> left intact

Sep 14 '23 08:09 morpig

Same here. Just randomly hangs. The same version.

Oct 06 '23 17:10 containerman17

Same 😩

Oct 10 '23 00:10 Xmonpl

This happens for me with a default installation of k3s and microk8s. On the same machine and the same network, but in Docker, it works flawlessly. Ubuntu Server 22.04, fully updated, latest version of the cloudflare/cloudflared container. All settings are the same between Docker and k8s.

Oct 10 '23 00:10 containerman17

+1 to this. I proxy a basic HTTP file server and I'm experiencing drops where the upload speed slowly falls from 10-20 MB/s to 0 MB/s; this doesn't seem to happen on download. If I wait 5-10 minutes, the file transfer resumes and eventually reaches 40-60 MB/s, whereas before the drop I usually only get 10-20 MB/s. The drop happens using QUIC, the buffer has been increased to 2.5 MB, and the logs show nothing. I don't know if this issue is exactly related, but I thought I would post it here before opening a separate bug report. The website stays online; I can refresh it while the drop occurs.

Oct 10 '23 23:10 sideloads

> This happens for me with a default installation of k3s and microk8s. On the same machine and the same network, but in Docker, it works flawlessly. Ubuntu Server 22.04, fully updated, latest version of the cloudflare/cloudflared container. All settings are the same between Docker and k8s.

I can confirm that I face the same issue on k8s with the latest version of cloudflare/cloudflared.

Oct 25 '23 10:10 afalfallaj

Same issue here with the latest version, 2024.1.2.

Jan 16 '24 11:01 Nenodema