envoy icon indicating copy to clipboard operation
envoy copied to clipboard

Envoy does not adhere to HTTP/2 RFC 7540

Open trevorlinton opened this issue 6 years ago • 36 comments
trafficstars

Title: Envoy does not adhere to HTTP/2 RFC 7540

Description:

RFC 7540 Section 9.1.1 and 9.1.2 specifies when a request coming in through a re-used HTTP/2 connection is accidentally sent to a non-origin but authoritative server that a 421 response should be returned. This can happen if two servers one with a wildcard certificate (e.g., a.example.com) and another server (b.example.com) with a non-wildcard on the same IP address using SNI responds to requests those meant for server b.example.com will accidentally be forwarded down the re-used HTTP/2 connection for a.example.com. In this situation a.example.com should send back a 421 to indicate the request was destined for b.example.com. This forces browsers to re-establish a new connection, re-negotiate the SNI, and thus the backing server and subsequently route to the correct origin.

[optional Relevant Links:]

  • https://tools.ietf.org/html/rfc7540#section-9.1.1 - section describing connection re-use
  • https://tools.ietf.org/html/rfc7540#section-9.1.2 - section describing misdirected response
  • https://bugs.chromium.org/p/chromium/issues/detail?id=954160#c5 - The bug was originally filed against Chromium however they indicated Istio was the issue.
  • https://github.com/istio/istio/issues/13589 - The bug was then filed against istio who indicated envoy was the issue.

trevorlinton avatar May 01 '19 14:05 trevorlinton

@alyssawilk @PiotrSikora thoughts on this? I haven't read the relevant RFCs in detail to fully understand what is needed.

mattklein123 avatar May 02 '19 18:05 mattklein123

Intersection of H2 specs and TLS handshake? I eagerly anticipate Piotr sorting this out :-P

alyssawilk avatar May 02 '19 19:05 alyssawilk

The gist is that browsers coalesce HTTP/2 connections pretty aggressively: when a browser opens connection to www.example.com and is presented with a certificate for *.example.com during TLS handshake, then it will re-use this connection for all requests to *.example.com, as long as the hostname resolves to the same IP (and some browsers don't even care about that).

As long as all *.example.com hostnames are served by the same listener/filter chain, this shouldn't be an issue in Envoy, since routing is happening on a per-request, and not per-connection basis (please correct me if I'm wrong).

However, if www.example.com (with *.example.com certificate) is served by one listener/filter chain, and app.example.com is served by another listener/filter chain, then we have an issue, since connections are latched to a single listener/filter chain for the lifetime of the connection (again, please correct me if I'm wrong), and if the connection to www.example.com is established first, then requests to app.example.com will be coalesced on the same connection, using configuration for www.example.com, and forwarded to the wrong backend.

One solution would be to send 421 Misdirected Request response to requests for hostnames that are not configured on a given listener/filter chain (but this wouldn't work if *.example.com is configured), or send 421 Misdirected Request response to requests for hostnames that are configured on other listeners/filter chains (but this requires a global list of all configured hostnames).

Another solution would be using HTTP/2 ORIGIN frame (RFC8336) to advertise allowed hostnames on a given listener/filter chain (but this requires a global list as well, and this extension is supported only by a few clients).

PiotrSikora avatar May 02 '19 20:05 PiotrSikora

Is is possible to reprioritize this issue? We have a use case where we have thousands of services behind hundreds of FQDN's that are served by a set of identical envoys, (all using a wildcard TLS cert). We exhibit this exact issue when HTTP/2 is enabled but not when enforcing usage of HTTP/1.1.

jcrowthe avatar Aug 20 '19 21:08 jcrowthe

Here's the CVE for this vulnerability https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11767

haf avatar Apr 15 '20 07:04 haf

CC @envoyproxy/security-team

htuch avatar Apr 15 '20 14:04 htuch

One solution would be to send 421 Misdirected Request response to requests for hostnames that are not configured on a given listener/filter chain (but this wouldn't work if *.example.com is configured), or send 421 Misdirected Request response to requests for hostnames that are configured on other listeners/filter chains (but this requires a global list of all configured hostnames).

I can think of 3 alternatives:

  1. What if you could specify the HTTP response code for an RBAC filter DENY? Then the management server that configured the HCM can add an RBAC policy for the server names that it allows on that HCM and generate 421 on DENY.

  2. The management server could program the SNI server name check in the Lua filter, generating a 421 when it didn't match.

  3. Add a dedicated filter that can be configured with the acceptable SNI server names.

jpeach avatar Apr 28 '20 07:04 jpeach

What is the plan for fixing https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11767 in envoy?

HarinadhD avatar Jun 18 '20 03:06 HarinadhD

AFAIK there is no plan currently. Someone needs to own this issue and drive a resolution if they are passionate about fixing it.

mattklein123 avatar Jun 18 '20 15:06 mattklein123

I hacked up a 421 response when the virtual host lookup fails and this works for a simple case that I tried. If this is a reasonable approach I'd need a bit of help to polish it up and figure out how to write tests for it.

https://gist.github.com/jpeach/e01f5f752eed5ffd09ea1f18634d1fc5

jpeach avatar Jul 17 '20 01:07 jpeach

I think I managed to find a workaround:

On Envoy instance I added an "envoy.lua" HTTP filter, that checks if the response code is a 404 (the same code, that is being generated for non-existent route) AND checks if the "x-envoy-upstream-service-time" header is NOT present.

The Lua code:

function envoy_on_response(response_handle)
    if response_handle:headers():get(":status") == "404" and response_handle:headers():get("x-envoy-upstream-service-time") == nil then
        response_handle:headers():replace(":status", "421")
    end
end

Example configuration on Envoy (fetched by LDS):

"http_filters": [
    {
        "name": "envoy.lua",
        "typed_config": {
            "@type": "type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua",
            "inline_code": `
function envoy_on_response(response_handle)
    if response_handle:headers():get(":status") == "404" and response_handle:headers():get("x-envoy-upstream-service-time") == nil then
        response_handle:headers():replace(":status", "421")
    end
end`
        }
    },
    {
        "name": "envoy.filters.http.router"
    }
]

lunighty avatar Sep 07 '20 04:09 lunighty

I think I managed to find a workaround:

On Envoy instance I added an "envoy.lua" HTTP filter, that checks if the response code is a 404 (the same code, that is being generated for non-existent route) AND checks if the "x-envoy-upstream-service-time" header is NOT present.

Nice! That's a bit cleaner than my equivalent 👍

jpeach avatar Sep 07 '20 06:09 jpeach

@jpeach @lunighty Is there a plan to have a fix in Envoy code or do you think the current approach with EnvoyFilter is sufficient?

GoelDeepak avatar Dec 08 '20 18:12 GoelDeepak

May be a dumb question - if we have a scenario like

However, if www.example.com (with *.example.com certificate) is served by one listener/filter chain, and app.example.com is served by another listener/filter chain, then we have an issue, since connections are latched to a single listener/filter chain for the lifetime of the connection (again, please correct me if I'm wrong), and if the connection to www.example.com is established first, then requests to app.example.com will be coalesced on the same connection, using configuration for www.example.com, and forwarded to the wrong backend.

(from Piotr above)

How can we distinguish from a browser coalescing the request from someone legitimately sending a request with SNI=ww.example.com HOST=app.example.com? It seems like from Envoy's perspective, these are identical.

Also, RFC 7540 is about HTTP/2. The above example can be done with HTTP 1 - do we expect a 421 still?

howardjohn avatar Feb 10 '21 01:02 howardjohn

I am exploring the relationship among SNI, SAN in cert and Host in Http. I will share a document after

lambdai avatar Feb 11 '21 01:02 lambdai

We now use this Lua Snippet:

function envoy_on_request(request_handle)
  local streamInfo = request_handle:streamInfo()
  if streamInfo:requestedServerName() ~= "" then
    if (string.sub(streamInfo:requestedServerName(), 0, 2) = "*." and not string.find(request_handle:headers():get(":authority"), string.sub(streamInfo:requestedServerName(), 1))) then
      request_handle:respond({[":status"] == "421"}, "Misdirected Request")
    end
    if (string.sub(streamInfo:requestedServerName(), 0, 2) ~= "*." and streamInfo:requestedServerName() ~= request_handle:headers():get(":authority")) then
      request_handle:respond({[":status"] = "421"}, "Misdirected Request")
    end
  end
end

EDIT: Fixed for HTTP requests where requestedServerName is empty
EDIT 2: Fixed for Wildcard SNI described by @lambdai

maennchen avatar May 04 '21 13:05 maennchen

@maennchen Awesome! Any idea on how to implement it with Istio?

mike-dube avatar May 04 '21 13:05 mike-dube

We now use this Lua Snippet:

function envoy_on_request(request_handle)
  if streamInfo:requestedServerName() != request_handle:headers():get(":authority") then
    request_handle:respond({[":status"] = "421"}, "Misdirected Request")
  end
end

Does it work in Safari for you? (we found out that Safari completely ignores 421 code and does not retry that request)

feature-id avatar May 04 '21 13:05 feature-id

@dekim Sure:

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: misdirected-request
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      app: istio-ingressgateway
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        context: GATEWAY
        listener:
          filterChain:
            filter:
              name: envoy.filters.network.http_connection_manager
              subFilter:
                name: envoy.filters.http.router
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.lua
          typed_config:
              "@type": "type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua"
              inlineCode: |
                function envoy_on_request(request_handle)
                  local streamInfo = request_handle:streamInfo()
                  if streamInfo:requestedServerName() ~= "" then
                    if (string.sub(streamInfo:requestedServerName(), 0, 2) == "*." and not string.find(request_handle:headers():get(":authority"), string.sub(streamInfo:requestedServerName(), 1))) then
                      request_handle:respond({[":status"] = "421"}, "Misdirected Request")
                    end
                    if (string.sub(streamInfo:requestedServerName(), 0, 2) ~= "*." and streamInfo:requestedServerName() ~= request_handle:headers():get(":authority")) then
                      request_handle:respond({[":status"] = "421"}, "Misdirected Request")
                    end
                  end
                end

@feature-id I haven't tested Safari.

EDIT: Fixed for HTTP requests where requestedServerName is empty
EDIT 2: Fixed for Wildcard SNI described by @lambdai

maennchen avatar May 04 '21 14:05 maennchen

@maennchen Awesome! Let's see if can somebody test it on Safari.

That would be awesome!

mike-dube avatar May 04 '21 15:05 mike-dube

Last time I tested Safari (2020) it didn't support retrying on 421 responses. There have been new releases since then, but the bug I filed wasn't resolved so I doubt it is fixed (or ever will be now).

IMHO, the best approach is to avoid wildcard certificates and issue a new dedicated certificate to every domain.

jpeach avatar May 04 '21 22:05 jpeach

IMHO, the best approach is to avoid wildcard certificates and issue a new dedicated certificate to every domain.

Unfortunately, this isn't an option if you have over 9000 domains. :)

feature-id avatar May 11 '21 09:05 feature-id

Currently, Envoy has two arguable behaviors

  1. HCM doesn't verify requested server name(SNI) with :authority. Theoretically, the first http stream in within the connection can raise the issue.

Should we provide an option in HCM to return 421 on the above conflict?

  1. As a follow up, since the per http request :authority and request server name may be more concrete. e.g. requested "*.foo.com", :authority "bar.foo.com". Meanwhile, the listener provides two filter chain matches
  • requested server name = "bar.foo.com"
  • requested server name = "*.foo.com".

If the client's request is {SNI="*.foo.com", :authority = "bar.foo.com"}, should we provide an another option to reject the http request because "bar.foo.com" has a better SNI match filter chain?

lambdai avatar May 25 '21 05:05 lambdai

I'm not sure about the second option, but the first one would let us in Contour remove a Lua workaround we have that does the same thing. You can see the whole thing at https://github.com/projectcontour/contour/blob/40d5c259697ca92693071bc6977a1c8da08555b0/internal/envoy/v3/listener.go#L565-L603

youngnick avatar Jun 17 '21 04:06 youngnick

@maennchen

Thank you for sharing. Unfortunately, some Safari doesn't seem to support 421 response code.

Browser OS Version 421 Response code supported Protocol
Safari macOS Monterey 15.1 YES HTTP/2 (h2)
Safari macOS Big Sur 14.1 NO HTTP/2 (h2)
Safari macOS Catalina 13.1 NO HTTP/2 (h2)
Safari macOS Mojave 12.1 NO HTTP/2 (h2)
  • https://bugs.webkit.org/show_bug.cgi?id=192926

upgle avatar Jan 12 '22 07:01 upgle

Unfortunately, some Safari doesn't seem to support 421 response code.

They fixed it in Monterey!? That's great news :partying_face: :champagne:

jpeach avatar Jan 12 '22 23:01 jpeach

@maennchen Your lua code is incorrect. lua's indices start at 1 instead of 0. So the code should be as follows:

function envoy_on_request(request_handle)
  local streamInfo = request_handle:streamInfo()
  if streamInfo:requestedServerName() ~= "" then
    if (string.sub(streamInfo:requestedServerName(), 1, 2) = "*." and not string.find(request_handle:headers():get(":authority"), string.sub(streamInfo:requestedServerName(), 2))) then
      request_handle:respond({[":status"] == "421"}, "Misdirected Request")
    end
    if (string.sub(streamInfo:requestedServerName(), 1, 2) ~= "*." and streamInfo:requestedServerName() ~= request_handle:headers():get(":authority")) then
      request_handle:respond({[":status"] = "421"}, "Misdirected Request")
    end
  end
end

johnlanni avatar Feb 27 '22 08:02 johnlanni

(random passer by) @htuch This is an example of a very valuable and simple use case that needs Lua.

kyessenov avatar Jan 19 '24 04:01 kyessenov

@kyessenov ack. Presumably Wasm could also work here at the expense of a heavier weight everything.

htuch avatar Jan 19 '24 05:01 htuch

(random passer by) @htuch This is an example of a very valuable and simple use case that needs Lua.

This is such an integral part of HTTP/2, that it should be fixed in Envoy core, instead of requiring users to implement workarounds in Lua/Wasm.

PiotrSikora avatar Jan 23 '24 19:01 PiotrSikora