HTTP/3 handshake timeout over IPv4 but not IPv6
1. Environment
1a. Operating system and version
FreeBSD 14.0-RELEASE-p4 amd64
1b. Caddy version
e1b9a9d7b08f6f0c21feb8edf122585891aa7099
v2.7.6 h1:w0NymbG2m9PcvKWsrXO6EEkY9Ru4FJK8uQbYcev1p3A=
v2.6.0 h1:lHDynvM+sTOi9Aq4Y15b4FtkqzPB36WbUrZvVdwzTCA=
2. Description
2a. What happens
Using v2.7.6 or e1b9a9d I can't connect with curl --http3-only --ipv4 (ERR_HANDSHAKE_TIMEOUT), whereas --ipv6 works as expected. HTTP/3 Check similarly works over IPv6, but not IPv4.
When I try v2.6.0, IPv4 starts working for both. Switching back, the results are as they were before.
Running tcpdump on the server, with no firewalls active, I see QUIC Initial packets but nothing more. There are entries in the console output indicating that they do reach Caddy.
2b. Why it's a bug
- There is no sign of Caddy sending anything in response.
- An older version doesn't exhibit this issue.
2c. Log output
In case of failure, repetitions (each with a unique id) of:
2024/01/30 05:10:15.515 DEBUG events event {"name": "tls_get_certificate", "id": "bdc8f332-bd5b-4618-b020-1b4ef0329ee2", "origin": "tls", "data": {"client_hello":{"CipherSuites":[4865,4866,4867,4868,255],"ServerName":"example.com","SupportedCurves":[23,29,24,25],"SupportedPoints":"AAEC","SignatureSchemes":[1027,1283,1539,2055,2056,2057,2058,2059,2052,2053,2054,1025,1281,1537],"SupportedProtos":["h3","h3-29"],"SupportedVersions":[772],"RemoteAddr":{"IP":"<client_ipv4>","Port":57654,"Zone":""},"LocalAddr":{"IP":"<server_ipv4>","Port":443,"Zone":""}}}}
2024/01/30 05:10:15.515 DEBUG tls.handshake choosing certificate {"identifier": "example.com", "num_choices": 1}
2024/01/30 05:10:15.515 DEBUG tls.handshake default certificate selection results {"identifier": "example.com", "subjects": ["example.com"], "managed": true, "issuer_key": "acme-v02.api.letsencrypt.org-directory", "hash": "6a9f361921bc7399b6d327dd0377dab3ba549c30c057384bcbcf4403ad326ef2"}
2024/01/30 05:10:15.515 DEBUG tls.handshake matched certificate in cache {"remote_ip": "<client_ipv4>", "remote_port": "57654", "subjects": ["example.com"], "managed": true, "expiration": "2024/04/28 02:54:34.000", "hash": "6a9f361921bc7399b6d327dd0377dab3ba549c30c057384bcbcf4403ad326ef2"}
In case of success:
2024/01/30 05:10:50.107 DEBUG events event {"name": "tls_get_certificate", "id": "e7b9a95f-b3ab-4c2b-82b3-72dbb1aca88e", "origin": "tls", "data": {"client_hello":{"CipherSuites":[4865,4866,4867,4868,255],"ServerName":"example.com","SupportedCurves":[23,29,24,25],"SupportedPoints":"AAEC","SignatureSchemes":[1027,1283,1539,2055,2056,2057,2058,2059,2052,2053,2054,1025,1281,1537],"SupportedProtos":["h3","h3-29"],"SupportedVersions":[772],"RemoteAddr":{"IP":"<client_ipv6>","Port":52258,"Zone":""},"LocalAddr":{"IP":"<server_ipv6>","Port":443,"Zone":""}}}}
2024/01/30 05:10:50.107 DEBUG tls.handshake choosing certificate {"identifier": "example.com", "num_choices": 1}
2024/01/30 05:10:50.107 DEBUG tls.handshake default certificate selection results {"identifier": "example.com", "subjects": ["example.com"], "managed": true, "issuer_key": "acme-v02.api.letsencrypt.org-directory", "hash": "6a9f361921bc7399b6d327dd0377dab3ba549c30c057384bcbcf4403ad326ef2"}
2024/01/30 05:10:50.107 DEBUG tls.handshake matched certificate in cache {"remote_ip": "<client_ipv6>", "remote_port": "52258", "subjects": ["example.com"], "managed": true, "expiration": "2024/04/28 02:54:34.000", "hash": "6a9f361921bc7399b6d327dd0377dab3ba549c30c057384bcbcf4403ad326ef2"}
2024/01/30 05:10:50.125 DEBUG http.handlers.reverse_proxy selected upstream {"dial": "localhost:8889", "total_upstreams": 1}
2024/01/30 05:10:50.128 DEBUG http.handlers.reverse_proxy upstream roundtrip {"upstream": "localhost:8889", "duration": 0.002274807, "request": {"remote_ip": "<client_ipv6>", "remote_port": "52258", "client_ip": "<client_ipv6>", "proto": "HTTP/3.0", "method": "HEAD", "host": "example.com", "uri": "/", "headers": {"User-Agent": ["curl/8.6.0-DEV"], "Accept": ["*/*"], "X-Forwarded-For": ["<client_ipv6>"], "X-Forwarded-Proto": ["https"], "X-Forwarded-Host": ["example.com"]}, "tls": {"resumed": false, "version": 772, "cipher_suite": 4865, "proto": "h3", "server_name": "example.com"}}, "headers": {"Server": ["Backend"], "Content-Type": ["text/html; charset=UTF-8"], "Date": ["Mon, 30 Jan 2024 05:10:50 GMT"], "Content-Length": ["87"]}, "status": 405}
Update: I bisected it down to commit 710824c3ce9f8084517e8ab099d57f9060f62061.
/cc @WeidiDeng
Thanks for narrowing that down!
Well, thanks for the easy build process. 👌
I was able to replicate this on a clean Parallels VM (now aarch64). Here are two interesting finds:
- It does not seem to be a conflict between IPv4 and IPv6; with IPv6 completely disabled, it still doesn't work.
- When explicitly binding to at least the used IPv4 address, and optionally more, it does work!
Hence my workaround in production is adding default_bind 127.0.0.1 [::1] <server_ipv4> <server_ipv6>.
Also confirmed on FreeBSD 13.2, so we can't blame it on 14.0 (which is fairly new).
Actually, that patch is no longer present in the latest version (UDP sockets are reused using SO_REUSEADDR on Unix).
That patch does enable more aggressive optimization in quic-go. I suspect the bug is there.
Can you try using quic-go directly? Try passing a *net.UDPConn directly, and then wrapping it as a generic net.PacketConn that implements no additional interfaces. I don't own a FreeBSD machine, so I can't debug it further.
Since I have zero experience with either Go or (implementing) QUIC, building a custom server isn't a trivial task. Perhaps their example requires minimal changes to test something.
On the other hand, if you need access to a FreeBSD box, that would be less of a problem to provide – it's just a question of where to send the credentials...? It doesn't need a dedicated box however; it's easy to run in a VM.
The problem is, I don't have access to VMs right now. You can send temporary credentials to my email, or run ttyd with -t enableTrzsz=true as a limited-privilege user so I can upload test quic-go files, if you don't mind.
OK, I sent you an email to work out a testing environment.
You two are awesome -- thank you for looking into this :pray: :blush:
I am now experiencing an unexpected consequence of my workaround. Connections to port 80 are refused rather than answered with HTTP 308. The same happens for both IPv4 and IPv6. Is this a misunderstanding of default_bind on my part, or should this be considered a separate issue?
Just adding that I'm experiencing this issue as well.
- OS: Ubuntu 22.04.3 LTS, kernel 5.4.0
- CPU: AMD EPYC 7302P
- Hypervisor: OpenVZ
- Caddy version: v2.7.6 h1:w0NymbG2m9PcvKWsrXO6EEkY9Ru4FJK8uQbYcev1p3A=
Fixed by https://github.com/caddyserver/caddy/pull/6176.