Expose headers from an HTTP tunnel proxy in the response

Open ta3pks opened this issue 4 years ago • 8 comments

Add the ability to access headers returned by proxy layers involved in an HTTP request. When an HTTP response is returned through an HTTP tunnel proxy using the CONNECT method, the proxy server returns an initial response confirming the proxy connection, after which all further socket data is proxied to the upstream server. Typically an initial response looks like this:

HTTP/1.1 200 OK

but it is also legal to include response headers in this response, like this:

HTTP/1.1 200 OK
X-Proxy-Name: Mycorp Proxy 2

There is no specific protocol use for these headers for a tunnel proxy, but it can be useful to expose these in the HTTP client regardless. Note that these headers should not be mixed with origin headers as that could cause conflicts or security problems if proxy headers are treated like they came from the origin, so they must be exposed using a separate API.
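To make the shape of that initial response concrete, here is a minimal sketch of parsing a CONNECT reply into a status code and a separate list of proxy headers. It uses only the standard library; `ConnectResponse` and `parse_connect_response` are hypothetical names invented for illustration and are not part of Isahc or curl.

```rust
/// Parsed head of a proxy's reply to a CONNECT request (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct ConnectResponse {
    pub status: u16,
    pub headers: Vec<(String, String)>,
}

/// Parses a status line plus optional header lines, stopping at the blank
/// line that ends the head. Returns `None` if the status line is malformed.
pub fn parse_connect_response(raw: &str) -> Option<ConnectResponse> {
    let mut lines = raw.split("\r\n");

    // Status line, e.g. "HTTP/1.1 200 OK".
    let status_line = lines.next()?;
    let mut parts = status_line.splitn(3, ' ');
    let version = parts.next()?;
    if !version.starts_with("HTTP/") {
        return None;
    }
    let status = parts.next()?.parse::<u16>().ok()?;

    // Header lines, kept apart from any origin headers that follow later.
    let mut headers = Vec::new();
    for line in lines {
        if line.is_empty() {
            break; // end of the proxy's response head
        }
        if let Some((name, value)) = line.split_once(':') {
            headers.push((name.trim().to_string(), value.trim().to_string()));
        }
    }

    Some(ConnectResponse { status, headers })
}
```

Feeding it the second example above would yield status 200 and a single `X-Proxy-Name` header; everything after the blank line belongs to the tunneled connection and is deliberately not touched.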

Proposed API

pub trait ResponseExt {
    /// If this response was returned through an HTTP proxy, returns any
    /// headers the proxy may have set when the proxy connection started.
    fn proxy_headers(&self) -> Option<&HeaderMap<HeaderValue>>;
}

// Now we'll use the API
let response = isahc::get("https://example.org")?;

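// Headers from server
for (name, value) in response.headers().iter() {
    // `HeaderValue` does not implement `Display`, so print it with `{:?}`.
    println!("origin header: {} = {:?}", name, value);
}

// Headers from proxy, if the response came through one
if let Some(proxy_headers) = response.proxy_headers() {
    for (name, value) in proxy_headers.iter() {
        println!("proxy header: {} = {:?}", name, value);
    }
}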

Implementation notes

Extra response state in Isahc that is outside of the basic HTTP abstraction is stored using extensions, a type-map of arbitrary data that can be stored in a Request or Response. To ensure that these extensions don't become part of our public API, Isahc uses types that are private to the crate so that users can't access these extensions without going through our public API (since the type is unnameable outside Isahc and thus can't be used as a map key).

To persist proxy headers somewhere, we can use our existing private Proxy<T> decorator type and use Proxy<HeaderMap<HeaderValue>> as our extension.

Curl already knows about these headers in a roundabout way, as they are already given to us in our header callback. We just need to infer when the header is from a proxy or not. We can probably figure that out when curl gives us the response status line in the callback -- if we receive multiple response statuses, then the first one is likely from the proxy. (Potential gotcha: is the header from a proxy, or is it from an intermediate response when following redirects?)
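A rough sketch of that heuristic, assuming the header callback delivers one raw line at a time (status lines included): start a new block on every status line, then treat the first block as the proxy's if more than one block was seen. `HeaderBlocks` and `Block` are hypothetical names, not Isahc internals, and the sketch inherits the gotcha above: a redirect or a 1xx intermediate response also produces an extra status line and would be misclassified.

```rust
/// One response head as seen by the header callback (hypothetical type).
#[derive(Debug, Default, PartialEq)]
pub struct Block {
    pub status_line: String,
    pub headers: Vec<(String, String)>,
}

/// Collects header lines and groups them by the status line preceding them.
#[derive(Debug, Default)]
pub struct HeaderBlocks {
    blocks: Vec<Block>,
}

impl HeaderBlocks {
    /// Feed one raw header line, including its trailing CRLF if present.
    pub fn push_line(&mut self, line: &str) {
        let line = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
        if line.starts_with("HTTP/") {
            // A status line opens a new block.
            self.blocks.push(Block {
                status_line: line.to_string(),
                headers: Vec::new(),
            });
        } else if let Some((name, value)) = line.split_once(':') {
            if let Some(block) = self.blocks.last_mut() {
                block
                    .headers
                    .push((name.trim().to_string(), value.trim().to_string()));
            }
        }
        // Blank separator lines carry no information here and are skipped.
    }

    /// Heuristic only: with more than one status line, assume the first
    /// block came from the proxy. Redirects would break this assumption.
    pub fn proxy_headers(&self) -> Option<&[(String, String)]> {
        if self.blocks.len() > 1 {
            self.blocks.first().map(|b| b.headers.as_slice())
        } else {
            None
        }
    }

    /// The last block is taken to be the origin's response.
    pub fn origin_headers(&self) -> Option<&[(String, String)]> {
        self.blocks.last().map(|b| b.headers.as_slice())
    }
}
```

A real implementation would need an extra signal, such as whether a proxy is configured at all or how many redirects were followed, before trusting the first block.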

Implementation of ResponseExt::proxy_headers is simple and just requires fetching our Proxy<HeaderMap<HeaderValue>> extension from the response extensions, if it exists, and returning a reference to the inner header map.
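The storage and lookup described in these notes might look roughly like the following. This is a standard-library-only sketch under stated assumptions: `Extensions` is a toy type-map standing in for the real response extensions, `Headers` stands in for `HeaderMap<HeaderValue>`, and `Response`, `set_proxy_headers`, and this `Proxy<T>` are simplified stand-ins rather than Isahc's actual definitions.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Toy stand-in for `HeaderMap<HeaderValue>` (assumption for this sketch).
pub type Headers = Vec<(String, String)>;

/// Minimal type-map keyed by the stored value's type.
#[derive(Default)]
pub struct Extensions {
    map: HashMap<TypeId, Box<dyn Any>>,
}

impl Extensions {
    pub fn insert<T: Any>(&mut self, value: T) {
        self.map.insert(TypeId::of::<T>(), Box::new(value));
    }

    pub fn get<T: Any>(&self) -> Option<&T> {
        self.map
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

/// Not `pub`: code outside the crate cannot name this type, so it cannot
/// use it as a key even though `Extensions::get` itself is public.
struct Proxy<T>(T);

/// Simplified response carrying only extensions.
#[derive(Default)]
pub struct Response {
    pub extensions: Extensions,
}

impl Response {
    /// Crate-internal: record headers captured from the proxy.
    pub(crate) fn set_proxy_headers(&mut self, headers: Headers) {
        self.extensions.insert(Proxy(headers));
    }

    /// Public accessor: look up the private extension, if present, and
    /// return a reference to the inner header collection.
    pub fn proxy_headers(&self) -> Option<&Headers> {
        self.extensions
            .get::<Proxy<Headers>>()
            .map(|proxy| &proxy.0)
    }
}
```

Because the key type is private, the only way for a user to reach the stored headers is through `proxy_headers`, which keeps the extension itself out of the public API.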

Original issue description

My proxy company adds an extra layer of headers on top of the response. Is there any way to access those?

[Screenshot] In PHP I can grab the headers without doing anything extra, but in isahc those headers simply do not exist.

ta3pks avatar Jan 02 '20 04:01 ta3pks

Request headers or response headers? I'm not sure what you mean. If an outgoing proxy is adding request headers, it is impossible for you to see those, because the request has already been sent. If an incoming proxy is adding response headers then those should show up in response.headers() just fine, since it should be indistinguishable from headers from the actual server.

I'm not sure how PHP is relevant here, unless you mean that a PHP server that is receiving a request through the proxy can see the headers. Of course it can, it is receiving the request. Isahc is sending the request before the proxy is involved.

Perhaps I misunderstand what you are asking.

sagebind avatar Jan 02 '20 16:01 sagebind

sorry I meant response headers @sagebind

ta3pks avatar Jan 02 '20 17:01 ta3pks

There are 2 layers of response headers, as you can see in the photo. In isahc, req.headers() only returns the latter; in PHP, however, I can access both responses' headers.

ta3pks avatar Jan 02 '20 17:01 ta3pks

I'm not sure what you mean by "2 layers". Is the screenshot the raw response from the server? That would be a malformed HTTP/1.1 response which we make no guarantees about supporting.

Does it work with curl?

sagebind avatar Jan 02 '20 17:01 sagebind

Yes, the screenshot is from curl, and again, with PHP curl I can access those headers. I mean the first partial headers, which I guess are not valid HTTP, but PHP curl still supports those. But apparently they are thrown away by isahc.

ta3pks avatar Jan 02 '20 18:01 ta3pks

Edit: I see what's going on here; it looks like an HTTP CONNECT tunnel proxy is being used, and you are interested in retrieving the headers returned by the tunnel server.

> But apparently they are thrown away by isahc.

They aren't "thrown away" in that sense, they are just ignored. Isahc's goal is to return an http::Response that faithfully represents the response returned from the server. Merging the tunnel proxy's headers and the server's response headers into a single map would be misleading and could potentially cause issues if there are clashes in header names.

I could see adding a new API to access these, something like:

let response = isahc::get("https://example.org")?;

// Headers from server
for (name, value) in response.headers().iter() {
    // `HeaderValue` does not implement `Display`, so print it with `{:?}`.
    println!("origin: {} = {:?}", name, value);
}

// Headers from proxy
for (name, value) in response.proxy_headers().iter() {
    println!("proxy: {} = {:?}", name, value);
}

sagebind avatar Jan 04 '20 19:01 sagebind

That design looks really nice

ta3pks avatar Jan 05 '20 16:01 ta3pks

Updated issue description with the proposed API and implementation notes on how this could be added.

sagebind avatar Oct 02 '20 17:10 sagebind