envoy
envoy copied to clipboard
Envoy does not adhere to HTTP/2 RFC 7540
Title: Envoy does not adhere to HTTP/2 RFC 7540
Description:
RFC 7540 Section 9.1.1 and 9.1.2 specifies when a request coming in through a re-used HTTP/2 connection is accidentally sent to a non-origin but authoritative server that a 421 response should be returned. This can happen if two servers one with a wildcard certificate (e.g., a.example.com) and another server (b.example.com) with a non-wildcard on the same IP address using SNI responds to requests those meant for server b.example.com will accidentally be forwarded down the re-used HTTP/2 connection for a.example.com. In this situation a.example.com should send back a 421 to indicate the request was destined for b.example.com. This forces browsers to re-establish a new connection, re-negotiate the SNI, and thus the backing server and subsequently route to the correct origin.
[optional Relevant Links:]
- https://tools.ietf.org/html/rfc7540#section-9.1.1 - section describing connection re-use
- https://tools.ietf.org/html/rfc7540#section-9.1.2 - section describing misdirected response
- https://bugs.chromium.org/p/chromium/issues/detail?id=954160#c5 - The bug was originally filed against Chromium however they indicated Istio was the issue.
- https://github.com/istio/istio/issues/13589 - The bug was then filed against istio who indicated envoy was the issue.
@alyssawilk @PiotrSikora thoughts on this? I haven't read the relevant RFCs in detail to fully understand what is needed.
Intersection of H2 specs and TLS handshake? I eagerly anticipate Piotr sorting this out :-P
The gist is that browsers coalesce HTTP/2 connections pretty aggressively: when a browser opens connection to www.example.com and is presented with a certificate for *.example.com during TLS handshake, then it will re-use this connection for all requests to *.example.com, as long as the hostname resolves to the same IP (and some browsers don't even care about that).
As long as all *.example.com hostnames are served by the same listener/filter chain, this shouldn't be an issue in Envoy, since routing is happening on a per-request, and not per-connection basis (please correct me if I'm wrong).
However, if www.example.com (with *.example.com certificate) is served by one listener/filter chain, and app.example.com is served by another listener/filter chain, then we have an issue, since connections are latched to a single listener/filter chain for the lifetime of the connection (again, please correct me if I'm wrong), and if the connection to www.example.com is established first, then requests to app.example.com will be coalesced on the same connection, using configuration for www.example.com, and forwarded to the wrong backend.
One solution would be to send 421 Misdirected Request response to requests for hostnames that are not configured on a given listener/filter chain (but this wouldn't work if *.example.com is configured), or send 421 Misdirected Request response to requests for hostnames that are configured on other listeners/filter chains (but this requires a global list of all configured hostnames).
Another solution would be using HTTP/2 ORIGIN frame (RFC8336) to advertise allowed hostnames on a given listener/filter chain (but this requires a global list as well, and this extension is supported only by a few clients).
Is is possible to reprioritize this issue? We have a use case where we have thousands of services behind hundreds of FQDN's that are served by a set of identical envoys, (all using a wildcard TLS cert). We exhibit this exact issue when HTTP/2 is enabled but not when enforcing usage of HTTP/1.1.
Here's the CVE for this vulnerability https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11767
CC @envoyproxy/security-team
One solution would be to send
421 Misdirected Requestresponse to requests for hostnames that are not configured on a given listener/filter chain (but this wouldn't work if*.example.comis configured), or send421 Misdirected Requestresponse to requests for hostnames that are configured on other listeners/filter chains (but this requires a global list of all configured hostnames).
I can think of 3 alternatives:
-
What if you could specify the HTTP response code for an RBAC filter DENY? Then the management server that configured the HCM can add an RBAC policy for the server names that it allows on that HCM and generate 421 on DENY.
-
The management server could program the SNI server name check in the Lua filter, generating a 421 when it didn't match.
-
Add a dedicated filter that can be configured with the acceptable SNI server names.
What is the plan for fixing https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11767 in envoy?
AFAIK there is no plan currently. Someone needs to own this issue and drive a resolution if they are passionate about fixing it.
I hacked up a 421 response when the virtual host lookup fails and this works for a simple case that I tried. If this is a reasonable approach I'd need a bit of help to polish it up and figure out how to write tests for it.
https://gist.github.com/jpeach/e01f5f752eed5ffd09ea1f18634d1fc5
I think I managed to find a workaround:
On Envoy instance I added an "envoy.lua" HTTP filter, that checks if the response code is a 404 (the same code, that is being generated for non-existent route) AND checks if the "x-envoy-upstream-service-time" header is NOT present.
The Lua code:
function envoy_on_response(response_handle)
if response_handle:headers():get(":status") == "404" and response_handle:headers():get("x-envoy-upstream-service-time") == nil then
response_handle:headers():replace(":status", "421")
end
end
Example configuration on Envoy (fetched by LDS):
"http_filters": [
{
"name": "envoy.lua",
"typed_config": {
"@type": "type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua",
"inline_code": `
function envoy_on_response(response_handle)
if response_handle:headers():get(":status") == "404" and response_handle:headers():get("x-envoy-upstream-service-time") == nil then
response_handle:headers():replace(":status", "421")
end
end`
}
},
{
"name": "envoy.filters.http.router"
}
]
I think I managed to find a workaround:
On Envoy instance I added an "envoy.lua" HTTP filter, that checks if the response code is a 404 (the same code, that is being generated for non-existent route) AND checks if the "x-envoy-upstream-service-time" header is NOT present.
Nice! That's a bit cleaner than my equivalent 👍
@jpeach @lunighty Is there a plan to have a fix in Envoy code or do you think the current approach with EnvoyFilter is sufficient?
May be a dumb question - if we have a scenario like
However, if www.example.com (with *.example.com certificate) is served by one listener/filter chain, and app.example.com is served by another listener/filter chain, then we have an issue, since connections are latched to a single listener/filter chain for the lifetime of the connection (again, please correct me if I'm wrong), and if the connection to www.example.com is established first, then requests to app.example.com will be coalesced on the same connection, using configuration for www.example.com, and forwarded to the wrong backend.
(from Piotr above)
How can we distinguish from a browser coalescing the request from someone legitimately sending a request with SNI=ww.example.com HOST=app.example.com? It seems like from Envoy's perspective, these are identical.
Also, RFC 7540 is about HTTP/2. The above example can be done with HTTP 1 - do we expect a 421 still?
I am exploring the relationship among SNI, SAN in cert and Host in Http. I will share a document after
We now use this Lua Snippet:
function envoy_on_request(request_handle)
local streamInfo = request_handle:streamInfo()
if streamInfo:requestedServerName() ~= "" then
if (string.sub(streamInfo:requestedServerName(), 0, 2) = "*." and not string.find(request_handle:headers():get(":authority"), string.sub(streamInfo:requestedServerName(), 1))) then
request_handle:respond({[":status"] == "421"}, "Misdirected Request")
end
if (string.sub(streamInfo:requestedServerName(), 0, 2) ~= "*." and streamInfo:requestedServerName() ~= request_handle:headers():get(":authority")) then
request_handle:respond({[":status"] = "421"}, "Misdirected Request")
end
end
end
EDIT: Fixed for HTTP requests where requestedServerName is empty
EDIT 2: Fixed for Wildcard SNI described by @lambdai
@maennchen Awesome! Any idea on how to implement it with Istio?
We now use this Lua Snippet:
function envoy_on_request(request_handle) if streamInfo:requestedServerName() != request_handle:headers():get(":authority") then request_handle:respond({[":status"] = "421"}, "Misdirected Request") end end
Does it work in Safari for you? (we found out that Safari completely ignores 421 code and does not retry that request)
@dekim Sure:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: misdirected-request
namespace: istio-system
spec:
workloadSelector:
labels:
app: istio-ingressgateway
configPatches:
- applyTo: HTTP_FILTER
match:
context: GATEWAY
listener:
filterChain:
filter:
name: envoy.filters.network.http_connection_manager
subFilter:
name: envoy.filters.http.router
patch:
operation: INSERT_BEFORE
value:
name: envoy.lua
typed_config:
"@type": "type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua"
inlineCode: |
function envoy_on_request(request_handle)
local streamInfo = request_handle:streamInfo()
if streamInfo:requestedServerName() ~= "" then
if (string.sub(streamInfo:requestedServerName(), 0, 2) == "*." and not string.find(request_handle:headers():get(":authority"), string.sub(streamInfo:requestedServerName(), 1))) then
request_handle:respond({[":status"] = "421"}, "Misdirected Request")
end
if (string.sub(streamInfo:requestedServerName(), 0, 2) ~= "*." and streamInfo:requestedServerName() ~= request_handle:headers():get(":authority")) then
request_handle:respond({[":status"] = "421"}, "Misdirected Request")
end
end
end
@feature-id I haven't tested Safari.
EDIT: Fixed for HTTP requests where requestedServerName is empty
EDIT 2: Fixed for Wildcard SNI described by @lambdai
@maennchen Awesome! Let's see if can somebody test it on Safari.
That would be awesome!
Last time I tested Safari (2020) it didn't support retrying on 421 responses. There have been new releases since then, but the bug I filed wasn't resolved so I doubt it is fixed (or ever will be now).
IMHO, the best approach is to avoid wildcard certificates and issue a new dedicated certificate to every domain.
IMHO, the best approach is to avoid wildcard certificates and issue a new dedicated certificate to every domain.
Unfortunately, this isn't an option if you have over 9000 domains. :)
Currently, Envoy has two arguable behaviors
- HCM doesn't verify requested server name(SNI) with :authority. Theoretically, the first http stream in within the connection can raise the issue.
Should we provide an option in HCM to return 421 on the above conflict?
- As a follow up, since the per http request :authority and request server name may be more concrete. e.g. requested "*.foo.com", :authority "bar.foo.com". Meanwhile, the listener provides two filter chain matches
- requested server name = "bar.foo.com"
- requested server name = "*.foo.com".
If the client's request is {SNI="*.foo.com", :authority = "bar.foo.com"}, should we provide an another option to reject the http request because "bar.foo.com" has a better SNI match filter chain?
I'm not sure about the second option, but the first one would let us in Contour remove a Lua workaround we have that does the same thing. You can see the whole thing at https://github.com/projectcontour/contour/blob/40d5c259697ca92693071bc6977a1c8da08555b0/internal/envoy/v3/listener.go#L565-L603
@maennchen
Thank you for sharing. Unfortunately, some Safari doesn't seem to support 421 response code.
| Browser | OS | Version | 421 Response code supported | Protocol |
|---|---|---|---|---|
| Safari | macOS Monterey | 15.1 | YES | HTTP/2 (h2) |
| Safari | macOS Big Sur | 14.1 | NO | HTTP/2 (h2) |
| Safari | macOS Catalina | 13.1 | NO | HTTP/2 (h2) |
| Safari | macOS Mojave | 12.1 | NO | HTTP/2 (h2) |
- https://bugs.webkit.org/show_bug.cgi?id=192926
Unfortunately, some Safari doesn't seem to support 421 response code.
They fixed it in Monterey!? That's great news :partying_face: :champagne:
@maennchen Your lua code is incorrect. lua's indices start at 1 instead of 0. So the code should be as follows:
function envoy_on_request(request_handle)
local streamInfo = request_handle:streamInfo()
if streamInfo:requestedServerName() ~= "" then
if (string.sub(streamInfo:requestedServerName(), 1, 2) = "*." and not string.find(request_handle:headers():get(":authority"), string.sub(streamInfo:requestedServerName(), 2))) then
request_handle:respond({[":status"] == "421"}, "Misdirected Request")
end
if (string.sub(streamInfo:requestedServerName(), 1, 2) ~= "*." and streamInfo:requestedServerName() ~= request_handle:headers():get(":authority")) then
request_handle:respond({[":status"] = "421"}, "Misdirected Request")
end
end
end
(random passer by) @htuch This is an example of a very valuable and simple use case that needs Lua.
@kyessenov ack. Presumably Wasm could also work here at the expense of a heavier weight everything.
(random passer by) @htuch This is an example of a very valuable and simple use case that needs Lua.
This is such an integral part of HTTP/2, that it should be fixed in Envoy core, instead of requiring users to implement workarounds in Lua/Wasm.