
[HTTP1.1] `Host` header sent twice if also specified in request headers

Open cocoahero opened this issue 1 year ago • 8 comments

While attempting to use async-http (0.71.0) against a reverse proxy where we need to specify a Host header, the library sends the header twice: once with the URL's hostname and once with the value specified in the request headers. For most web servers, this results in a 400 Bad Request.

require "json"
require "uri"
require "async/http/internet/instance"

url = URI.parse("https://my.reverse.proxy.com/path/to/resource")

headers = {
  "Host" => "api.example.com"
}

body = JSON.dump({ foo: "bar" })

Sync do
  Async::HTTP::Internet.post(url, headers, body) do |resp|
    puts resp.inspect
  end
end

It looks like this is coming from protocol-http1, where the host header is always written regardless of whether it also exists in the request headers. https://github.com/socketry/protocol-http1/blob/main/lib/protocol/http1/connection.rb#L133-L138
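To illustrate how the duplicate arises, here is a minimal sketch of a writer that always emits a host line and then appends every user-supplied header. This is a simplified model under my own assumptions, not the actual protocol-http1 code, and `write_request` here is a hypothetical stand-in.

```ruby
require "stringio"

# Simplified model (an assumption, not the real protocol-http1 code) of a
# writer that always emits a host line and then every user header.
def write_request(stream, authority, method, path, headers)
  stream.write("#{method} #{path} HTTP/1.1\r\n")
  stream.write("host: #{authority}\r\n")
  headers.each { |name, value| stream.write("#{name.downcase}: #{value}\r\n") }
  stream.write("\r\n")
end

stream = StringIO.new
write_request(stream, "my.reverse.proxy.com", "POST", "/path/to/resource", {"Host" => "api.example.com"})

# Both the authority and the user-supplied value end up on the wire.
puts stream.string.lines.grep(/\Ahost:/i)
```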

The reverse proxy is one use case, but the same applies to any situation where the TCP/TLS socket connection should be made to a different authority than the one the HTTP Host header dictates. For example, this should also be completely valid:

url = URI.parse("http://localhost:9292/path/to/resource")

headers = {
  "Host" => "example.multi.tenant.app.com" # used to lookup SaaS tenant, etc
}

body = JSON.dump({ foo: "bar" })

Sync do
  Async::HTTP::Internet.post(url, headers, body) do |resp|
    puts resp.inspect
  end
end

cocoahero avatar Aug 28 '24 22:08 cocoahero

What you are trying to do is possible, but it's probably not achievable with Async::HTTP::Internet which is a canonical/opinionated interface for accessing internet resources following standard conventions.

More specifically, the host header is not part of HTTP/2+, and therefore the library has a specific field for dealing with host identification, referred to as authority. In HTTP/1, authority maps directly to the host header, and in HTTP/2+ it maps to the :authority pseudo header.

Note that this can be different from the address you connect to (in your case the reverse proxy), and even different from the hostname used for TLS negotiation (can be used to select certificates for validating the connection is secure).
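As a rough illustration of that mapping, the sketch below shows where a single authority value would end up for each protocol family. `host_fields` is a hypothetical helper written for this example and is not part of async-http.

```ruby
# Hypothetical helper, not part of async-http: shows where the authority
# ends up on the wire for each protocol family.
def host_fields(version, authority)
  if version.start_with?("HTTP/1")
    # HTTP/1.x carries the authority in the Host header.
    {"host" => authority}
  else
    # HTTP/2 and later use the :authority pseudo-header instead.
    {":authority" => authority}
  end
end

# The authority is independent of the address the socket connects to.
p host_fields("HTTP/1.1", "api.example.com")
p host_fields("HTTP/2", "api.example.com")
```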

According to your initial problem, something like this will work for you:

server_endpoint = ::IO::Endpoint.tcp("my.reverse.proxy.com", 443)
host_endpoint = ::Async::HTTP::Endpoint.parse("https://api.example.com/path/to/resource", server_endpoint)
client = ::Async::HTTP::Client.new(host_endpoint)

client.post(...)

This will configure the authority and TLS as if it was talking to api.example.com.

ioquatix avatar Aug 28 '24 23:08 ioquatix

Correct me if I am wrong, but creating a client in this way will prevent connection reuse/persistent connections since we will need a Client instance per combination of TCP endpoint and host/authority. In reality, we want to reuse the connection to the TCP endpoint, but just change the Host header per request.

cocoahero avatar Aug 29 '24 14:08 cocoahero

It depends on your TLS setup. If you expect SNI to work (it operates at the connection level, usually for virtual hosts), you will need the setup I suggested. In that case you will want to cache the client instances, and you are correct that there will be at least one connection per SNI hostname. If you don't need SNI and only need to change the authority, you can do this instead:

def get(client, url, headers = Protocol::HTTP::Headers.new)
  client.call(
    # Arguments: scheme, authority, method, path, version, headers, body.
    Protocol::HTTP::Request.new(nil, url.hostname, "GET", url.path, nil, headers, nil)
  )
end

The second argument to Request.new is the authority (host) to use.

If this was more commonly requested, we could consider adding a keyword argument to the method helpers, e.g. internet.get(..., authority: ...) - however I don't know if your issue encompasses SNI or not - can you give me more details about that part of your setup?
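For the SNI case mentioned above, caching one client per hostname could be sketched roughly as follows. `ClientCache` is a hypothetical class written for this example, not part of async-http, and the block passed to it stands in for whatever builds the real client.

```ruby
# Hypothetical cache, not part of async-http: builds one client per SNI
# hostname on first use and returns the same instance afterwards.
class ClientCache
  def initialize(&factory)
    @factory = factory
    @clients = {}
  end

  # Returns the cached client for the hostname, building it if needed.
  def fetch(hostname)
    @clients[hostname] ||= @factory.call(hostname)
  end

  def size
    @clients.size
  end
end

# The block stands in for constructing a real client for the hostname.
cache = ClientCache.new { |hostname| "client for #{hostname}" }
puts cache.fetch("api.example.com").equal?(cache.fetch("api.example.com"))
```

In real use the block would build a client for the given hostname, along the lines of the endpoint setup shown earlier in this thread. The sketch makes no attempt at thread safety or at closing clients.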

ioquatix avatar Aug 30 '24 01:08 ioquatix

We are essentially trying to do the equivalent of the following Go code with async-http, where we want to keep a pool of persistent connections to the reverse proxy.

The reverse proxy only has a TLS certificate for my.tld.com but has millions of virtual hosts.

We use Cloudflare SaaS where they connect to the reverse proxy/loadbalancer with TLS/SNI set to my.tld.com and the Host/:authority header set to the tenant's domain.

https://developers.cloudflare.com/cloudflare-for-platforms/cloudflare-for-saas/reference/connection-details/

package main

import (
	  "crypto/tls"
	  "fmt"
	  "io"
	  "net"
	  "net/http"
	  "net/http/httptrace"
	  "time"
)

func main() {
	  // Create a custom dialer
	  dialer := &net.Dialer{
		  Timeout:   30 * time.Second,
		  KeepAlive: 30 * time.Second,
	  }

	  // Create a custom TLS config
	  tlsConfig := &tls.Config{
		  ServerName: "nuc.adi.run",
	  }

	  // Create a custom transport
	  transport := &http.Transport{
		  DialContext:         dialer.DialContext,
		  TLSClientConfig:     tlsConfig,
		  ForceAttemptHTTP2:   true,
		  MaxIdleConns:        100,
		  IdleConnTimeout:     90 * time.Second,
		  TLSHandshakeTimeout: 10 * time.Second,
	  }

	  // for trace events
	  clientTrace := &httptrace.ClientTrace{
		  GetConn: func(hostPort string) { fmt.Println("starting to create conn ", hostPort) },
		  GotConn: func(info httptrace.GotConnInfo) { fmt.Printf("connection established %+v\n", info) },
		  TLSHandshakeDone: func(info tls.ConnectionState, err error) {
			  fmt.Printf("TLS handshake done %+v\n", info)
		  },
		  WroteHeaderField: func(key string, value []string) {
			  if key == ":authority" {
				  fmt.Println("Wrote header field ", key, value)
			  }
		  },
		  ConnectStart: func(network, addr string) { fmt.Println("starting tcp connection", network, addr) },
		  ConnectDone:  func(network, addr string, err error) { fmt.Println("tcp connection created", network, addr, err) },
	  }

	  // Create a client with the custom transport
	  client := &http.Client{
		  Transport: transport,
	  }

	  // Create a new request
	  req, err := http.NewRequest("GET", "https://nuc.adi.run", nil)
	  if err != nil {
		  fmt.Println("Error creating request:", err)
		  return
	  }
	  doRequest(client, clientTrace, req, "code.aditya.me")

	  fmt.Println()

	  req, err = http.NewRequest("GET", "https://nuc.adi.run", nil)
	  if err != nil {
		  fmt.Println("Error creating request:", err)
		  return
	  }
	  doRequest(client, clientTrace, req, "jellyfin.aditya.me")
}

func doRequest(client *http.Client, clientTrace *httptrace.ClientTrace, req *http.Request, overrideHostName string) {
	  clientTraceCtx := httptrace.WithClientTrace(req.Context(), clientTrace)
	  req = req.WithContext(clientTraceCtx)

	  // override Host header
	  req.Host = overrideHostName

	  resp, err := client.Do(req)
	  if err != nil {
		  fmt.Println("Error performing request:", err)
		  return
	  }
	  defer resp.Body.Close()

	  io.Copy(io.Discard, resp.Body)
	  fmt.Println("Response status:", resp.Status)
}

go run main.go 
starting to create conn  nuc.adi.run:443
starting tcp connection tcp 10.0.0.2:443
tcp connection created tcp 10.0.0.2:443 <nil>
TLS handshake done {Version:772 HandshakeComplete:true DidResume:false CipherSuite:4865 NegotiatedProtocol:h2 NegotiatedProtocolIsMutual:true ServerName:nuc.adi.run PeerCertificates:[0x140001d8008 0x140001d8588] VerifiedChains:[[0x140001d8b08 0x140001d9088 0x140001d9608]] SignedCertificateTimestamps:[] OCSPResponse:[] TLSUnique:[] ECHAccepted:false ekm:0x104d76c80 testingOnlyDidHRR:false testingOnlyCurveID:29}
connection established {Conn:0x14000192008 Reused:false WasIdle:false IdleTime:0s}
Wrote header field  :authority [code.aditya.me]
starting to create conn  nuc.adi.run:443
connection established {Conn:0x14000192008 Reused:true WasIdle:true IdleTime:53.75µs}
Wrote header field  :authority [code.aditya.me]
Response status: 200 OK

starting to create conn  nuc.adi.run:443
connection established {Conn:0x14000192008 Reused:true WasIdle:true IdleTime:82.5µs}
Wrote header field  :authority [jellyfin.aditya.me]
starting to create conn  nuc.adi.run:443
connection established {Conn:0x14000192008 Reused:true WasIdle:true IdleTime:30.375µs}
Wrote header field  :authority [jellyfin.aditya.me]
Response status: 200 OK

epk avatar Aug 31 '24 09:08 epk

Okay, I understand, SNI is not an issue for you.

Similar requests have come up in the past, so I've introduced support for keyword arguments to Async::HTTP::Internet. In your case, you can specify authority: to set the host header.

See https://socketry.github.io/async-http/releases/index.html#async::http::internet-accepts-keyword-arguments for more details.

ioquatix avatar Aug 31 '24 23:08 ioquatix

Thanks @ioquatix, this unblocks us for now. As an aside, do you think it would also be beneficial to add a sanity check or other defensive measure to ensure the host header is not sent twice? When we received the 400s, the reason was not immediately obvious, and we were only able to figure it out because we had access to the nginx logs.
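As a rough idea of what such a check could look like, here is a minimal sketch. `check_host_header!` is a hypothetical helper, not an existing async-http or protocol-http1 API, and raising is only one possible policy (overriding the authority or dropping the header would be alternatives).

```ruby
# Hypothetical guard, not an async-http API: raise early when user headers
# contain a Host entry, since the authority already supplies that line.
def check_host_header!(authority, headers)
  conflict = headers.find { |name, _| name.to_s.casecmp?("host") }
  return headers unless conflict

  raise ArgumentError,
    "Host header #{conflict[1].inspect} conflicts with authority #{authority.inspect}; pass the authority instead"
end

begin
  check_host_header!("my.reverse.proxy.com", {"Host" => "api.example.com"})
rescue ArgumentError => error
  puts error.message
end
```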

cocoahero avatar Sep 04 '24 20:09 cocoahero

Unfortunately, it appears that this change does not play well with WebMock. They use the request.authority when building the stub signature, which results in looking for the wrong host name. https://github.com/bblimke/webmock/blob/master/lib/webmock/http_lib_adapters/async_http_client_adapter.rb#L97-L106

cocoahero avatar Sep 18 '24 15:09 cocoahero

Thanks, I'll take a look.

ioquatix avatar Sep 19 '24 22:09 ioquatix