users reporting tunnel failures

Open kjpgit opened this issue 3 years ago • 34 comments

Users are seeing the "cloudflare gateway error" page intermittently today when using a tunnel to a web site on AWS.

I then allowed UDP outbound from the EC2 instance, but I'm still seeing a lot of errors in the logs:

Apr 26 11:06:50 jenkins-sys.myco.com cloudflared[978]: If you are using private routing to this Tunnel, then UDP (and Private DNS Resolution) will not work unless your cloudflared can connect with Cloudflare Network with quic. connIndex=1
Apr 26 11:06:50 jenkins-sys.myco.com cloudflared[978]: 2022-04-26T11:06:50Z INF Switching to fallback protocol http2 connIndex=1
Apr 26 11:06:51 jenkins-sys.myco.com cloudflared[978]: 2022-04-26T11:06:51Z INF Connection d2838bf9-b03d-4fcd-af18-1b62d2372abe registered connIndex=1 location=LAX

(restarting because I allowed outbound UDP; a sketch of that security-group change is below the logs)

Apr 26 13:19:06 jenkins-sys.myco.com systemd[1]: Stopping cloudflared...
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[978]: 2022-04-26T13:19:06Z INF Initiating graceful shutdown due to signal terminated ...
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[978]: 2022-04-26T13:19:06Z INF Unregistered tunnel connection connIndex=0
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[978]: 2022-04-26T13:19:06Z INF Unregistered tunnel connection connIndex=2
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[978]: 2022-04-26T13:19:06Z INF Unregistered tunnel connection connIndex=1
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[978]: 2022-04-26T13:19:06Z INF Unregistered tunnel connection connIndex=3
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[978]: 2022-04-26T13:19:06Z INF Tunnel server stopped
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[978]: 2022-04-26T13:19:06Z INF Metrics server stopped
Apr 26 13:19:06 jenkins-sys.myco.com systemd[1]: Stopped cloudflared.
Apr 26 13:19:06 jenkins-sys.myco.com systemd[1]: Starting cloudflared...
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:06Z INF Starting tunnel tunnelID=dd3d43a3-2923-40b9-ab88-9d538025aad9
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:06Z INF Cannot determine default configuration path. No file [config.yml config.yaml] in [~/.cloudflared ~/.cloudflare-warp ~/cloudflare-warp /etc/cloudflared /usr/local/etc/cloudflared]
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:06Z INF Version 2022.4.1
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:06Z INF GOOS: linux, GOVersion: go1.17.5, GoArch: amd64
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:06Z INF Settings: map[no-autoupdate:true token:*****]
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:06Z INF Generated Connector ID: 04a10e73-ad90-498e-8d97-dfc9ea609874
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:06Z INF Will be fetching remotely managed configuration from Cloudflare API. Defaulting to protocol: quic
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:06Z INF cloudflared will not automatically update if installed by a package manager.
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:06Z INF Initial protocol quic
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:06Z INF Starting metrics server on 127.0.0.1:33629/metrics
Apr 26 13:19:06 jenkins-sys.myco.com cloudflared[29131]: 2022/04/26 13:19:06 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
Apr 26 13:19:07 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:07Z INF Connection 9726d9a3-17ee-422e-b583-fd0464996e80 registered connIndex=0 location=PDX
Apr 26 13:19:07 jenkins-sys.myco.com systemd[1]: Started cloudflared.
Apr 26 13:19:07 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:07Z INF Updated to new configuration config="{"ingress":[{"hostname":"jenkins-sys.xxx.com", "originRequest":{"httpHostHeader":"", "noTLSVerify":true}, "service":"http://localhost:8080"}, {"service":"http_status:404"}], "warp-routing":{"enabled":false}}" version=2
Apr 26 13:19:07 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:07Z INF Connection 53ee58f4-6af9-4960-925d-b6524d37c5af registered connIndex=1 location=DEN
Apr 26 13:19:08 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:08Z WRN Failed to serve quic connection error="already connected to this server, trying another address" connIndex=2
Apr 26 13:19:08 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:08Z WRN Unable to establish connection. error="already connected to this server, trying another address" connIndex=2
Apr 26 13:19:09 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:09Z INF Connection 2a27ea7d-dbb8-4a06-855f-fda17fbf1520 registered connIndex=3 location=LAX
Apr 26 13:19:10 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:10Z WRN Connection terminated error="already connected to this server, trying another address" connIndex=2
Apr 26 13:19:19 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:19:19Z INF Connection 4a5c2467-d904-48f5-95cf-6cb7132b6191 registered connIndex=2 location=PDX
Apr 26 13:36:43 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:36:43Z ERR error="Unable to reach the origin service. The service may be down or it may not be responding to traffic from cloudflared: EOF" cfRay=701fb5816b27a3b5-MRS ingressRule=0 originService=http://localhost:8080
Apr 26 13:36:43 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:36:43Z ERR Failed to handle QUIC stream error="Unable to reach the origin service. The service may be down or it may not be responding to traffic from cloudflared: EOF" connIndex=2
Apr 26 13:57:21 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:57:21Z INF Unregistered tunnel connection connIndex=1
Apr 26 13:57:21 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:57:21Z WRN Failed to serve quic connection error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=1
Apr 26 13:57:21 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:57:21Z WRN Serve tunnel error error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=1
Apr 26 13:57:21 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:57:21Z INF Retrying connection in up to 1s seconds connIndex=1
Apr 26 13:57:23 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:57:23Z WRN If this log occurs persistently, and cloudflared is unable to connect to Cloudflare Network with quic protocol, then most likely your machine/network is getting its egress UDP to port 7844 (or others) blocked or dropped. Make sure to allow egress connectivity as per https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/configuration/ports-and-ips/
Apr 26 13:57:23 jenkins-sys.myco.com cloudflared[29131]: If you are using private routing to this Tunnel, then UDP (and Private DNS Resolution) will not work unless your cloudflared can connect with Cloudflare Network with quic. connIndex=1
Apr 26 13:57:23 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:57:23Z INF Switching to fallback protocol http2 connIndex=1
Apr 26 13:57:23 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T13:57:23Z INF Connection 5c313303-b31f-4cbe-96d7-3c111c8ac630 registered connIndex=1 location=DEN
Apr 26 14:21:59 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T14:21:59Z ERR error="Unable to reach the origin service. The service may be down or it may not be responding to traffic from cloudflared: EOF" cfRay=701ff7ce7840a3af-MRS ingressRule=0 originService=http://localhost:8080
Apr 26 14:21:59 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T14:21:59Z ERR Failed to handle QUIC stream error="Unable to reach the origin service. The service may be down or it may not be responding to traffic from cloudflared: EOF" connIndex=2
Apr 26 15:11:17 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:11:17Z INF Lost connection with the edge connIndex=1
Apr 26 15:11:17 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:11:17Z WRN Serve tunnel error error="connection with edge closed" connIndex=1
Apr 26 15:11:17 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:11:17Z INF Retrying connection in up to 1s seconds connIndex=1
Apr 26 15:11:17 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:11:17Z INF Unregistered tunnel connection connIndex=1
Apr 26 15:11:18 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:11:18Z INF Changing protocol to quic connIndex=1
Apr 26 15:11:18 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:11:18Z INF Connection 56c8588c-f500-488c-a68a-7963d0afcb64 registered connIndex=1 location=DEN
Apr 26 15:43:19 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:43:19Z INF Unregistered tunnel connection connIndex=1
Apr 26 15:43:19 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:43:19Z WRN Failed to serve quic connection error="timeout: no recent network activity" connIndex=1
Apr 26 15:43:19 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:43:19Z WRN Serve tunnel error error="timeout: no recent network activity" connIndex=1
Apr 26 15:43:19 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:43:19Z INF Retrying connection in up to 1s seconds connIndex=1
Apr 26 15:43:20 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:43:20Z WRN If this log occurs persistently, and cloudflared is unable to connect to Cloudflare Network with quic protocol, then most likely your machine/network is getting its egress UDP to port 7844 (or others) blocked or dropped. Make sure to allow egress connectivity as per https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/configuration/ports-and-ips/
Apr 26 15:43:20 jenkins-sys.myco.com cloudflared[29131]: If you are using private routing to this Tunnel, then UDP (and Private DNS Resolution) will not work unless your cloudflared can connect with Cloudflare Network with quic. connIndex=1
Apr 26 15:43:20 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:43:20Z INF Switching to fallback protocol http2 connIndex=1
Apr 26 15:43:20 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T15:43:20Z INF Connection ddfe37d1-bc11-4ccb-aef3-8e9444d3e010 registered connIndex=1 location=LAX
Apr 26 16:26:18 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:26:18Z INF Lost connection with the edge connIndex=1
Apr 26 16:26:18 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:26:18Z WRN Serve tunnel error error="connection with edge closed" connIndex=1
Apr 26 16:26:18 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:26:18Z INF Retrying connection in up to 1s seconds connIndex=1
Apr 26 16:26:18 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:26:18Z INF Unregistered tunnel connection connIndex=1
Apr 26 16:26:20 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:26:20Z INF Changing protocol to quic connIndex=1
Apr 26 16:26:21 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:26:21Z INF Connection 9c341b63-0be9-472b-8391-fa2f0db0b457 registered connIndex=1 location=DEN
Apr 26 16:27:51 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:27:51Z INF Unregistered tunnel connection connIndex=1
Apr 26 16:27:51 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:27:51Z WRN Failed to serve quic connection error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=1
Apr 26 16:27:51 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:27:51Z WRN Serve tunnel error error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=1
Apr 26 16:27:51 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:27:51Z INF Retrying connection in up to 1s seconds connIndex=1
Apr 26 16:27:51 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:27:51Z WRN If this log occurs persistently, and cloudflared is unable to connect to Cloudflare Network with quic protocol, then most likely your machine/network is getting its egress UDP to port 7844 (or others) blocked or dropped. Make sure to allow egress connectivity as per https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/configuration/ports-and-ips/
Apr 26 16:27:51 jenkins-sys.myco.com cloudflared[29131]: If you are using private routing to this Tunnel, then UDP (and Private DNS Resolution) will not work unless your cloudflared can connect with Cloudflare Network with quic. connIndex=1
Apr 26 16:27:51 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:27:51Z INF Switching to fallback protocol http2 connIndex=1
Apr 26 16:27:51 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:27:51Z INF Connection 2f9a037f-a5d7-4894-9eb5-3a3684c7531a registered connIndex=1 location=DEN
Apr 26 16:28:31 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:28:31Z INF Lost connection with the edge connIndex=1
Apr 26 16:28:31 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:28:31Z WRN Serve tunnel error error="connection with edge closed" connIndex=1
Apr 26 16:28:31 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:28:31Z INF Retrying connection in up to 1s seconds connIndex=1
Apr 26 16:28:31 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:28:31Z INF Unregistered tunnel connection connIndex=1
Apr 26 16:28:33 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:28:33Z INF Changing protocol to quic connIndex=1
Apr 26 16:28:33 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:28:33Z INF Connection 429c997a-e7b9-4959-8369-d08abc8e790d registered connIndex=1 location=LAX
Apr 26 16:31:17 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:31:17Z INF Unregistered tunnel connection connIndex=3
Apr 26 16:31:17 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:31:17Z WRN Failed to serve quic connection error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=3
Apr 26 16:31:17 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:31:17Z WRN Serve tunnel error error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=3
Apr 26 16:31:17 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:31:17Z INF Retrying connection in up to 1s seconds connIndex=3
Apr 26 16:31:19 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:31:19Z WRN If this log occurs persistently, and cloudflared is unable to connect to Cloudflare Network with quic protocol, then most likely your machine/network is getting its egress UDP to port 7844 (or others) blocked or dropped. Make sure to allow egress connectivity as per https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/configuration/ports-and-ips/
Apr 26 16:31:19 jenkins-sys.myco.com cloudflared[29131]: If you are using private routing to this Tunnel, then UDP (and Private DNS Resolution) will not work unless your cloudflared can connect with Cloudflare Network with quic. connIndex=3
Apr 26 16:31:19 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:31:19Z INF Switching to fallback protocol http2 connIndex=3
Apr 26 16:31:19 jenkins-sys.myco.com cloudflared[29131]: 2022-04-26T16:31:19Z INF Connection 2853a71e-8367-4a55-b679-5196e412d4b9 registered connIndex=3 location=DEN
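
In case it helps anyone else reproduce the "allowed outbound UDP" step, it looks roughly like this with the AWS CLI. A sketch only: the security group ID is a placeholder, and UDP 7844 is the port Cloudflare's ports-and-ips page lists for the tunnel edge.

# Sketch: open UDP 7844 egress on the instance's security group.
# sg-0123456789abcdef0 is a placeholder; scope the CIDR down to
# Cloudflare's published ranges if you prefer.
aws ec2 authorize-security-group-egress \
    --group-id sg-0123456789abcdef0 \
    --ip-permissions 'IpProtocol=udp,FromPort=7844,ToPort=7844,IpRanges=[{CidrIp=0.0.0.0/0}]'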

kjpgit avatar Apr 26 '22 16:04 kjpgit

I'm not sure why the "can't reach origin service" error is shown; it's going to http://localhost, and that Java service has been up for 10+ days.

kjpgit avatar Apr 26 '22 16:04 kjpgit

I'm seeing a similar issue. The tunnel is really unstable, and internal users are constantly disconnected, which affects their day-to-day work.

Here is a debug log extract from version 2022.4.1:

{"level":"debug","time":"2022-04-26T15:42:15Z","message":"CF-RAY: 70206d637a0d7fa4-ORD Request content length 0"}
{"level":"debug","time":"2022-04-26T15:57:50Z","message":"rpcconnect: rx (abort = (reason = \"rpc: shutdown\", type = failed, obsoleteIsCallersFault = false, obsoleteDurability = 0))"}
{"level":"info","time":"2022-04-26T15:57:50Z","message":"abort: rpc: aborted by remote: rpc: shutdown"}
{"level":"debug","time":"2022-04-26T15:57:50Z","message":"rpcconnect: rx error: context canceled"}
{"level":"info","connIndex":1,"time":"2022-04-26T15:57:50Z","message":"Unregistered tunnel connection"}
{"level":"warn","connIndex":1,"error":"failed to accept QUIC stream: Application error 0x0","time":"2022-04-26T15:57:50Z","message":"Failed to serve quic connection"}
{"level":"warn","connIndex":1,"error":"failed to accept QUIC stream: Application error 0x0","time":"2022-04-26T15:57:50Z","message":"Serve tunnel error"}
{"level":"info","connIndex":1,"time":"2022-04-26T15:57:50Z","message":"Retrying connection in up to 1s seconds"}
{"level":"debug","time":"2022-04-26T15:57:50Z","message":"rpcconnect: tx (bootstrap = (questionId = 0, deprecatedObjectId = <opaque pointer>))"}
{"level":"debug","time":"2022-04-26T15:57:50Z","message":"rpcconnect: tx (call = (questionId = 1, target = (promisedAnswer = (questionId = 0, transform = [])), interfaceId = 17804583019846587543, methodId = 0, allowThirdPartyTailCall = false, params = (content = <opaque pointer>, capTable = []), sendResultsTo = (caller = void)))"}
{"level":"debug","time":"2022-04-26T15:57:50Z","message":"rpcconnect: rx (return = (answerId = 0, releaseParamCaps = false, results = (content = <opaque pointer>, capTable = [(senderHosted = 0)])))"}
{"level":"debug","time":"2022-04-26T15:57:50Z","message":"rpcconnect: tx (finish = (questionId = 0, releaseResultCaps = false))"}
{"level":"debug","time":"2022-04-26T15:57:51Z","message":"rpcconnect: rx (return = (answerId = 1, releaseParamCaps = false, results = (content = <opaque pointer>, capTable = [])))"}
{"level":"debug","time":"2022-04-26T15:57:51Z","message":"rpcconnect: tx (finish = (questionId = 1, releaseResultCaps = false))"}
{"level":"info","connIndex":1,"location":"YYZ","time":"2022-04-26T15:57:51Z","message":"Connection 3d26566f-b9ea-421e-9ade-3ae6cb8327d4 registered"}
{"level":"debug","time":"2022-04-26T15:57:58Z","message":"rpcconnect: rx (abort = (reason = \"rpc: shutdown\", type = failed, obsoleteIsCallersFault = false, obsoleteDurability = 0))"}
{"level":"info","time":"2022-04-26T15:57:58Z","message":"abort: rpc: aborted by remote: rpc: shutdown"}
{"level":"debug","time":"2022-04-26T15:57:58Z","message":"rpcconnect: rx error: context canceled"}
{"level":"info","connIndex":3,"time":"2022-04-26T15:57:58Z","message":"Unregistered tunnel connection"}
{"level":"warn","connIndex":3,"error":"failed to accept QUIC stream: Application error 0x0","time":"2022-04-26T15:57:58Z","message":"Failed to serve quic connection"}
{"level":"warn","connIndex":3,"error":"failed to accept QUIC stream: Application error 0x0","time":"2022-04-26T15:57:58Z","message":"Serve tunnel error"}
{"level":"info","connIndex":3,"time":"2022-04-26T15:57:58Z","message":"Retrying connection in up to 1s seconds"}
{"level":"debug","time":"2022-04-26T15:57:59Z","message":"rpcconnect: tx (bootstrap = (questionId = 0, deprecatedObjectId = <opaque pointer>))"}
{"level":"debug","time":"2022-04-26T15:57:59Z","message":"rpcconnect: tx (call = (questionId = 1, target = (promisedAnswer = (questionId = 0, transform = [])), interfaceId = 17804583019846587543, methodId = 0, allowThirdPartyTailCall = false, params = (content = <opaque pointer>, capTable = []), sendResultsTo = (caller = void)))"}
{"level":"debug","time":"2022-04-26T15:57:59Z","message":"rpcconnect: rx (return = (answerId = 0, releaseParamCaps = false, results = (content = <opaque pointer>, capTable = [(senderHosted = 0)])))"}
{"level":"debug","time":"2022-04-26T15:57:59Z","message":"rpcconnect: tx (finish = (questionId = 0, releaseResultCaps = false))"}
{"level":"debug","time":"2022-04-26T15:57:59Z","message":"rpcconnect: rx (return = (answerId = 1, releaseParamCaps = false, results = (content = <opaque pointer>, capTable = [])))"}
{"level":"debug","time":"2022-04-26T15:57:59Z","message":"rpcconnect: tx (finish = (questionId = 1, releaseResultCaps = false))"}
{"level":"info","connIndex":3,"location":"YYZ","time":"2022-04-26T15:57:59Z","message":"Connection 3e3bead0-cc4a-45dc-bc91-527190efee66 registered"}
{"level":"debug","time":"2022-04-26T16:00:29Z","message":"rpcconnect: rx error: Application error 0x0"}
{"level":"info","connIndex":3,"time":"2022-04-26T16:00:29Z","message":"Unregistered tunnel connection"}
{"level":"warn","connIndex":3,"error":"failed to accept QUIC stream: Application error 0x0","time":"2022-04-26T16:00:29Z","message":"Failed to serve quic connection"}
{"level":"warn","connIndex":3,"error":"failed to accept QUIC stream: Application error 0x0","time":"2022-04-26T16:00:29Z","message":"Serve tunnel error"}
{"level":"info","connIndex":3,"time":"2022-04-26T16:00:29Z","message":"Retrying connection in up to 1s seconds"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: rx (abort = (reason = \"rpc: shutdown\", type = failed, obsoleteIsCallersFault = false, obsoleteDurability = 0))"}
{"level":"info","time":"2022-04-26T16:00:31Z","message":"abort: rpc: aborted by remote: rpc: shutdown"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: rx error: context canceled"}
{"level":"info","connIndex":1,"time":"2022-04-26T16:00:31Z","message":"Unregistered tunnel connection"}
{"level":"warn","connIndex":1,"error":"failed to accept QUIC stream: Application error 0x0","time":"2022-04-26T16:00:31Z","message":"Failed to serve quic connection"}
{"level":"warn","connIndex":1,"error":"failed to accept QUIC stream: Application error 0x0","time":"2022-04-26T16:00:31Z","message":"Serve tunnel error"}
{"level":"info","connIndex":1,"time":"2022-04-26T16:00:31Z","message":"Retrying connection in up to 1s seconds"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: tx (bootstrap = (questionId = 0, deprecatedObjectId = <opaque pointer>))"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: tx (call = (questionId = 1, target = (promisedAnswer = (questionId = 0, transform = [])), interfaceId = 17804583019846587543, methodId = 0, allowThirdPartyTailCall = false, params = (content = <opaque pointer>, capTable = []), sendResultsTo = (caller = void)))"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: rx (return = (answerId = 0, releaseParamCaps = false, results = (content = <opaque pointer>, capTable = [(senderHosted = 0)])))"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: tx (finish = (questionId = 0, releaseResultCaps = false))"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: tx (bootstrap = (questionId = 0, deprecatedObjectId = <opaque pointer>))"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: tx (call = (questionId = 1, target = (promisedAnswer = (questionId = 0, transform = [])), interfaceId = 17804583019846587543, methodId = 0, allowThirdPartyTailCall = false, params = (content = <opaque pointer>, capTable = []), sendResultsTo = (caller = void)))"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: rx (return = (answerId = 0, releaseParamCaps = false, results = (content = <opaque pointer>, capTable = [(senderHosted = 0)])))"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: tx (finish = (questionId = 0, releaseResultCaps = false))"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: rx (return = (answerId = 1, releaseParamCaps = false, results = (content = <opaque pointer>, capTable = [])))"}
{"level":"info","connIndex":1,"location":"YYZ","time":"2022-04-26T16:00:31Z","message":"Connection d662278d-40b3-42b8-a3dc-53d6c2442f8d registered"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: tx (finish = (questionId = 1, releaseResultCaps = false))"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: rx (return = (answerId = 1, releaseParamCaps = false, results = (content = <opaque pointer>, capTable = [])))"}
{"level":"info","connIndex":3,"location":"YYZ","time":"2022-04-26T16:00:31Z","message":"Connection 02db7890-08c1-416b-b39a-3e154ade48a1 registered"}
{"level":"debug","time":"2022-04-26T16:00:31Z","message":"rpcconnect: tx (finish = (questionId = 1, releaseResultCaps = false))"}

The tunnel process PID also changed between the moment it started and the moment users got disconnected. It feels like the cloudflared process restarted, which caused the connections to drop.
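
One way to confirm whether the process really was restarted (a sketch, assuming a systemd-managed unit named cloudflared like the one in the logs above; adjust the unit name if yours differs):

# The Main PID and "Active: active (running) since ..." line expose restarts:
systemctl status cloudflared
# Or scan today's journal for unit start/stop events:
journalctl -u cloudflared --since today | grep -E 'Started|Stopped|Stopping'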

mbelang avatar Apr 26 '22 17:04 mbelang

We're seeing the same issue:

2022-04-28T10:51:46Z INF Starting tunnel tunnelID=f8cafe4b-b9c1-4e79-8bb7-6055cc35fd1c
2022-04-28T10:51:46Z INF Cannot determine default configuration path. No file [config.yml config.yaml] in [~/.cloudflared ~/.cloudflare-warp ~/cloudflare-warp /etc/cloudflared /usr/local/etc/cloudflared]
2022-04-28T10:51:46Z INF Version 2022.4.1
2022-04-28T10:51:46Z INF GOOS: linux, GOVersion: go1.17.1, GoArch: amd64
2022-04-28T10:51:46Z INF Settings: map[f:true force:true]
2022-04-28T10:51:46Z INF Environmental variables map[REDACTED]
2022-04-28T10:51:46Z INF Generated Connector ID: da28823b-f157-4f52-b05e-e7c1b2941dc2
2022-04-28T10:51:46Z INF Initial protocol quic
2022-04-28T10:51:46Z INF Starting metrics server on [::]:3333/metrics
2022-04-28T10:51:47Z INF Connection 732d6e35-b2bf-4aa1-a7a5-b1132ac7e643 registered connIndex=0 location=ORD
2022-04-28T10:51:47Z INF Connection cd583c29-2642-4186-92e5-204ff9b06609 registered connIndex=1 location=DEN
2022-04-28T10:51:48Z INF Connection 889b2b5a-a75c-47f6-ad8e-5b0e974e7741 registered connIndex=2 location=ORD
2022-04-28T10:51:49Z INF Connection 2141a1a8-39e4-4d1c-ae03-3253472f05ce registered connIndex=3 location=DEN
2022-04-28T10:52:04Z INF Unregistered tunnel connection connIndex=3
2022-04-28T10:52:04Z WRN Failed to serve quic connection error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=3
2022-04-28T10:52:04Z WRN Serve tunnel error error="failed to accept QUIC stream: timeout: no recent network activity" connIndex=3
2022-04-28T10:52:04Z INF Retrying connection in up to 1s seconds connIndex=3
2022-04-28T10:52:04Z WRN If this log occurs persistently, and cloudflared is unable to connect to Cloudflare Network with `quic` protocol, then most likely your machine/network is getting its egress UDP to port 7844 (or others) blocked or dropped. Make sure to allow egress connectivity as per https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/configuration/ports-and-ips/
2022-04-28T10:52:04Z WRN If you are using private routing to this Tunnel, then UDP (and Private DNS Resolution) will not work unless your cloudflared can connect with Cloudflare Network with `quic`. connIndex=3
2022-04-28T10:52:04Z INF Switching to fallback protocol http2 connIndex=3
2022-04-28T10:52:05Z INF Connection 9edb66cd-072c-44c3-9016-2def19510c7e registered connIndex=3 location=DEN

Interestingly, it's only happening on one of the four connections; the other connections are working fine.

HofmannZ avatar Apr 28 '22 11:04 HofmannZ

I had a very similar issue; see #617. The solution was to update cloudflared to 2022.4.1, which fixed the problem.

If you don't care about QUIC, stop here. If you insist on using the QUIC protocol rather than http2, then check your UDP Tx/Rx communications on port 7844. If the UDP Tx/Rx communications on port 7844 are OK, then it is a Cloudflare tunnel edge server issue just like mine, and you have to wait for the Cloudflare team to fix it. If UDP Tx/Rx on port 7844 is blocked, then it is a firewall or ISP issue; just give up and use http2 instead.
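
There's no built-in one-shot test that I know of, but a rough manual check of UDP egress on port 7844 can be done with tcpdump plus netcat. A sketch: the region1/region2 hostnames are the ones from Cloudflare's ports-and-ips docs, and nc cannot truly validate UDP delivery (no handshake), so watch for reply packets.

# Watch for any UDP 7844 traffic in one terminal:
sudo tcpdump -ni any 'udp and port 7844'
# In another terminal, fire probe packets at the tunnel edge:
nc -z -u -v region1.argotunnel.com 7844
nc -z -u -v region2.argotunnel.com 7844
# Outbound packets with no inbound replies suggests UDP 7844 is being
# blocked or dropped between this host and Cloudflare.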

darth-pika-hu avatar Apr 28 '22 20:04 darth-pika-hu

It would sure be nice if there were a "one-click" option in cloudflared to run this UDP test. The logs I see just look like gibberish; I have no idea whether quic is even working or not. I'm still seeing many "warnings" with 2022.4.1.

kjpgit avatar Apr 28 '22 21:04 kjpgit

> It would sure be nice if there were a "one-click" option in cloudflared to run this UDP test. The logs I see just look like gibberish; I have no idea whether quic is even working or not. I'm still seeing many "warnings" with 2022.4.1.

I don't work for Cloudflare. As long as you upgrade to 2022.4.1, the Cloudflare tunnel should work.

darth-pika-hu avatar Apr 28 '22 21:04 darth-pika-hu

darth, my initial bug report was already against 2022.4.1, if you look at the log... thanks for the link to the other issue, though.

kjpgit avatar Apr 28 '22 21:04 kjpgit

We're also running version 2022.4.1 as shown in the logs above 👆

HofmannZ avatar Apr 29 '22 09:04 HofmannZ

Also having this issue with 2022.4.1.

Same errors as others reported (Kubernetes 1.22.x on OVH):

2022-05-01T03:08:46Z INF Will be fetching remotely managed configuration from Cloudflare API. Defaulting to protocol: quic
2022-05-01T03:08:46Z INF Initial protocol quic
2022-05-01T03:08:46Z INF Starting metrics server on 127.0.0.1:33739/metrics
2022/05/01 03:08:46 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
2022-05-01T03:08:47Z INF Connection dc515f8e-cccc-4509-8a52-572af71d24ea registered connIndex=0 location=AMS
2022-05-01T03:08:47Z INF Updated to new configuration config="{\"ingress\":[{\"hostname\":\"xxxxxx\",\"originRequest\":{},\"service\":\"http://zzzzz.svc.cluster.local:8081\"},{\"service\":\"http_status:404\"}],\"warp-routing\":{\"enabled\":false}}" version=5
2022-05-01T03:08:47Z INF Connection 23eb86f6-223e-4bc7-ba9b-93c1ad8f134a registered connIndex=1 location=CDG
2022-05-01T03:08:48Z INF Connection 32eb96c5-2f82-405a-9add-4c45f2879759 registered connIndex=2 location=AMS
2022-05-01T03:08:49Z INF Connection f62ac67d-0df4-4128-94da-a016f09dea7b registered connIndex=3 location=CDG
2022-05-01T03:09:04Z INF Initiating graceful shutdown due to signal terminated ...

I guess it would be preferable to fall back gracefully from quic to http2 instead of terminating the pod, but maybe there is a good reason for it to behave that way.
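
On the monitoring side, cloudflared's metrics server also serves a /ready endpoint on the same address as /metrics (the "Starting metrics server on ..." log line above), which container probes can use so a pod that has lost all its connections is marked unready or restarted. A sketch, assuming the port from those logs:

# /ready answers 200 while at least one tunnel connection is registered,
# and non-200 when none are, so a probe can catch a dead tunnel:
curl -sv http://127.0.0.1:33739/ready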

joaocc avatar May 01 '22 03:05 joaocc

Similar issues here. OVH and similar providers don't have very reliable UDP transmission, so even trying to use QUIC may inherently cause issues like this until they upgrade their filtering hardware to have stateful QUIC support, and I'm not sure anything could be done to fix this in cloudflared other than unreliable heuristics.

Forcing the tunnel to use the old HTTP/2 transport does seem to work fine for these providers.
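
For reference, pinning the transport is a one-flag change. A sketch: the tunnel name is a placeholder, and config-file setups can use a protocol: http2 line instead.

# Skip quic entirely and run the tunnel over HTTP/2:
cloudflared tunnel run --protocol http2 my-tunnel
# config.yml equivalent:
#   protocol: http2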

blattersturm avatar May 02 '22 10:05 blattersturm

Hi. I also tried this on AWS with the same settings: I got the same error message (buffer size), but containerd didn't shut down. Any idea what might be causing this difference in behaviour? Thx

joaocc avatar May 02 '22 12:05 joaocc

Also having this and related issues, which started suddenly after rebooting my server (it automatically starts tunnels on boot). Connections drop for a few seconds and pick back up.

2022-05-02T19:40:34Z WRN Serve tunnel error error="connection with edge closed" connIndex=3
2022-05-02T19:40:34Z INF Retrying connection in up to 1s seconds connIndex=3
2022-05-02T19:40:34Z INF Changing protocol to quic connIndex=3
2022-05-02T19:40:35Z INF Connection b7e577ab-4a46-4003-a083-4de1cd730758 registered connIndex=3 location=IAD
2022-05-02T19:41:14Z INF Lost connection with the edge connIndex=1
2022-05-02T19:41:14Z WRN Serve tunnel error error="connection with edge closed" connIndex=1
2022-05-02T19:41:14Z INF Retrying connection in up to 1s seconds connIndex=1
2022-05-02T19:41:14Z INF Unregistered tunnel connection connIndex=1
2022-05-02T19:41:15Z INF Changing protocol to quic connIndex=1
2022-05-02T19:41:16Z INF Connection 65eb6307-114b-478a-ba5c-193607fb9068 registered connIndex=1 location=IAD

This issue persists after upgrading to 2022.4.1.

sulliops avatar May 02 '22 19:05 sulliops

Just FYI, 2022.5.0 is out

darth-pika-hu avatar May 03 '22 21:05 darth-pika-hu

The issue persists after upgrading to 2022.5.0. It doesn't seem to have anything to do with the actual client itself, but rather with the connection to the edge servers.

sulliops avatar May 03 '22 22:05 sulliops

I have a ticket open with Enterprise Support, and here is some interesting information from it:

> We will at times see disconnects when updates are happening at the edge; however, we do them in rolling updates, and the 4x High Availability connections are there, in part, because of this practice.

> If we're focusing on SSH here, and the disconnects seem to only impact ssh connections, we may be running into the 270 second timeout that is enforced by Gateway. More information about that can be found [Here - Connections are timing out after 270 seconds](https://developers.cloudflare.com/cloudflare-one/faq/teams-troubleshooting/#connections-are-timing-out-after-270-seconds)

> We must send keepalives from the client so that we can keep Gateway happy. This can be accomplished with the ServerAliveInterval option in ssh (to be performed by the client machines).

> ssh -o ServerAliveInterval=60 [email protected]
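
To make that stick for every session, something like this in ~/.ssh/config should do it (a sketch; the Host * pattern is deliberately broad, so scope it down to the hosts behind the tunnel):

# Keepalives so Gateway's 270-second idle timeout is never reached:
cat >> ~/.ssh/config <<'EOF'
# Probe the server after 60s of inactivity; give up after 3 missed replies.
Host *
    ServerAliveInterval 60
    ServerAliveCountMax 3
EOF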

I'm going to keep this thread up to date with any new information that I will get from them or from my tests.

mbelang avatar May 04 '22 12:05 mbelang

A few notes for everyone who's written in so far:

  • failed to sufficiently increase receive buffer size is not a problem per se; it may impact performance, so ideally you'd change the operating system / environment settings where cloudflared runs to allow for a bigger buffer (see the sketch after this list), but that is just an optimization, not a functional concern
  • all logs shared above show some of the (4) cloudflared outbound (quic) connections disconnecting and reconnecting; if you look carefully, all of those appear with logging level warn or WRN, which again is not a problem per se; as long as at least 1 of the 4 connections is active, the Tunnel should be reachable by users; if no connections were active, those same logs would appear with error or ERR logging level
  • if your concern is only that those 4 connections are reconnecting, and you are seeing that in the logs, then that is not a problem --- reconnects are normal, for many reasons, and that is why we have 4 outbound connections (so that the likelihood of all 4 being down at the same time is very low)
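
For the buffer-size point in the first bullet, the quic-go wiki linked from that log message suggests raising the kernel's receive-buffer ceiling on Linux; a sketch with the value from that wiki:

# Let quic-go allocate the ~2 MiB UDP receive buffer it asks for:
sudo sysctl -w net.core.rmem_max=2500000
# Persist across reboots (the file name here is just a convention):
echo 'net.core.rmem_max=2500000' | sudo tee /etc/sysctl.d/98-quic-rmem.conf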

If this still does not soothe your concerns, then we'd need to understand better why not in each case:

  1. which tunnel ID is having problems
  2. how exactly you perceive those problems

nmldiegues avatar May 09 '22 10:05 nmldiegues

Hi. Thanks for the clarification. In our case, it was the fact that the tunnel was shutting down that was the initial concern. We can test with the others. Thx

joaocc avatar May 09 '22 12:05 joaocc

@nmldiegues Our problem is the fact that the SSH connection drops out of the blue, which prevents our tech force from being efficient. They have to reconnect to the tunnel while they are working with it, and sometimes we have to completely reboot the VM because we cannot even connect to the tunnel anymore.

Update from my support ticket:

> What I had discovered is that when the connection is established, you're directly connected to a metal on the edge side, and will be connected over one of the specific 4x connections (which equate to 4x metals).

> At this time, if either of the 2 metals restarts (the one your client is connected to, or the one on the route taken by the 4x connections), your connection will be disconnected. In this scenario, if you attempt to reconnect while the 1x metal is restarting, you'll just take the other metal route at that location.

> Additionally, if there are any changes to Anycast IP addresses and your client side is redirected to a new metal, the TCP connection will also drop.

> I hope this helps clarify your scenario here.

If the tunnel is meant to disconnect like this, it is not ideal at all.

mbelang avatar May 09 '22 12:05 mbelang

@joaocc do you have autoupdate enabled? The autoupdate feature doesn't work in container environments because it creates a new process.
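
For container deployments, the usual pattern is to disable it explicitly — the same no-autoupdate:true visible in the Settings log line earlier in this thread (the tunnel token is a placeholder):

# Run a remotely-managed tunnel with autoupdate off:
cloudflared tunnel --no-autoupdate run --token <TUNNEL-TOKEN>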

chungthuang avatar May 09 '22 15:05 chungthuang

Autoupdate was disabled. Anyway, I retried enabling the auto protocol (as per your note above), and the service-shutdown issues seem to have stopped happening since upgrading to 2022.5.0.

joaocc avatar May 10 '22 07:05 joaocc

We are periodically (approx. once per day) seeing huge amounts of HTTP status code 0 being returned to Cloudflare's edge servers (according to the Cloudflare log files). Often this is combined with thousands of messages like this (but not always):

2022-06-02T15:52:01Z ERR error="Unable to reach the origin service. The service may be down or it may not be responding to traffic from cloudflared: context canceled" cfRay=71515a8e79447373-CPH ingressRule=0 originService=http://xxxx.yyyy.svc.cluster.local:80
2022-06-02T15:52:02Z ERR error="Unable to reach the origin service. The service may be down or it may not be responding to traffic from cloudflared: context canceled" cfRay=71515a9a1a25abde-CPH ingressRule=0 originService=http://xxxx.yyyy.svc.cluster.local:80
2022-06-02T15:53:11Z ERR error="Unable to reach the origin service. The service may be down or it may not be responding to traffic from cloudflared: context canceled" cfRay=71515c47ce71d427-CDG ingressRule=0 originService=http://xxxx.yyyy.svc.cluster.local:80
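
When this happens, it's worth ruling out the in-cluster path from cloudflared to the origin itself. A sketch: the pod name is a placeholder, and it assumes the image (or an attached debug container) has curl.

# Exercise the same in-cluster path cloudflared uses to reach the origin:
kubectl exec -it <cloudflared-pod> -- \
    curl -sv --max-time 10 http://xxxx.yyyy.svc.cluster.local:80/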

schack avatar Jun 02 '22 19:06 schack

We can confirm we have the exact same issue. Every time we update to the latest cloudflared, we still have the same issue as we speak.

Our hypothesis is that when there is a peering issue in a location, traffic is not being re-routed over the tunnel.

baptistejamin avatar Jul 20 '22 10:07 baptistejamin

Still encountering this issue after updating to 2022.8.2. The connection drops every five minutes or so, then takes about five minutes to come back online.

sulliops avatar Aug 18 '22 16:08 sulliops

I am also experiencing the same issue with 2022.8.2: it connects for 5 minutes, goes down for a while, connects for a short period of time... repeat. The tunnel is unusable at this point.

Marcbacca avatar Aug 26 '22 14:08 Marcbacca

Same error here. Any updates?

dannykorpan avatar Sep 09 '22 07:09 dannykorpan

Same error at tag 2022.11.1

Kation avatar Dec 03 '22 13:12 Kation

Same problem here with a Hetzner Root Server.

dannykorpan avatar Dec 03 '22 14:12 dannykorpan

I was thrilled to discover this fantastic feature, but unfortunately, I'm disappointed to have invested a lot of time learning it without realizing how unreliable it is. Today is February 28, 2023, and the problem remains unresolved without any additional updates. I'm hoping that the community has found a solution to this issue.

Eric-TPS avatar Feb 28 '23 15:02 Eric-TPS

The tunnels have been running rock solid for me for the past 6-8 months or so, my issues were outside of Cloudflare's control.

schack avatar Feb 28 '23 15:02 schack

> The tunnels have been running rock solid for me for the past 6-8 months or so, my issues were outside of Cloudflare's control.

Interesting, do you happen to be running them in Kubernetes?

Eric-TPS avatar Feb 28 '23 15:02 Eric-TPS