axum icon indicating copy to clipboard operation
axum copied to clipboard

Spoofable extractors are used with the knowledge of the risks

Open yanns opened this issue 1 year ago • 22 comments

The ticket follows the discussion in https://github.com/tokio-rs/axum/pull/2507#issuecomment-2423925900

Some extractors, like Host or Scheme, can use the values of some HTTP headers that could be spoofed by malicious users.

We should find a way to make users aware of the risks of using those extractors.

Some ideas:

  • using unsafe. This is not the idea of unsafe and we would be mis-using it. I think that this can be discarded.
  • encapsulating the value in a new struct like SpoofableValue so that users have to call some function to get the value. The name and the documentation of the function should make the user aware of the risk. Example:
async fn handler(Host(host): Host) -> String {
  val value = host.spoofable_value();
  value
}

yanns avatar Oct 19 '24 20:10 yanns

Perhaps something along the lines of:

/// Wrap spoofable extractor
pub struct Spoofable<E>(pub E);

/// Allow `Spoofable` to be used with spoofable extractors in handlers
impl <S, E> FromRequestParts<S> for Spoofable<E> where E: FromSpoofableRequestParts<S> {

}

/// axum private trait
trait FromSpoofableRequestParts<S>: Sized {
    type Rejection: IntoResponse;

    async fn from_request_parts(
        parts: &mut Parts, 
        state: &S
    ) -> impl Future<Output = Result<Self, Self::Rejection>> + Send;
}

/// Mark `Host` as a spoofable extractor
impl <S> FromSpoofableRequestParts<S> for Host { ... } 

/// Use spoofable extractor
async fn handler(Spoofable(Host(host)): Spoofable<Host>) -> String {
    println("{host}");
}

bengsparks avatar Oct 19 '24 22:10 bengsparks

I've made one PoC so that we can better imagine how the API would be: https://github.com/tokio-rs/axum/pull/3000

@bengsparks could you also make one for the approach you're suggesting. It seems very interesting!

yanns avatar Oct 20 '24 10:10 yanns

Is it possible to add on either of the extractors something like Host::unspoofable_value(&self) -> Option<String>?

I don't think host can be extracted from anything that cannot be spoofed and scheme could theoretically be extracted from connect info, but the way it is implemented now, it prefers the scheme the client used originally if the server is behind a proxy, i.e. it tries to extract from the proxy headers first which might be what the user is interested in.

If we can only return values extracted from spoofable sources, I feel like the destructuring is the nicer syntax from the current two options, but that's just my opinion. Getting rid of the Spoofable wrapper first also allows users to pass around Host in type-safe manner and we can implement Deref and Into for convenience. If we go with the first option, users would either have to call spoofable_value at every usage site or they would have to pass around a String. Implementing Into or Deref would completely circumvent forcing users to be explicit about acknowledging the spoofable scenario so that could never be added.

For completeness, would you be opposed to just having spoofable-extractors feature which would gate Host and Scheme in their current implementation? It would reduce the noise in handler signatures and users still have to opt-in, although just once for all of them and not explicitly for each use. I guess the question is if it's explicit enough.

mladedav avatar Oct 21 '24 08:10 mladedav

How about Host<WithProxyHeaders> and Host<WithoutProxyHeaders> as an alternative? I find "spoofable" sounds a bit awkward, and while the proxy thing may not sound as dangerous, it would still get people thinking.

jplatte avatar Oct 21 '24 08:10 jplatte

I personal like having to change the usage site. I guess it would be very easy to have a function taking a Scheme and forgetting about the risks of using it. Being force to call spoofable_value makes sure that the person taking care of this particular implementation will be reminded of the consequences.

yanns avatar Oct 21 '24 08:10 yanns

How about Host<WithProxyHeaders> and Host<WithoutProxyHeaders>

I would see that as another dimension because both the proxy headers and the host header can be spoofed.

mladedav avatar Oct 21 '24 09:10 mladedav

Okay so I don't hate any of the options presented so far. They all seem a bit weird but that's almost the point, so not too surprising. @yanns, @mladedav if you can agree on a best solution, feel free to go ahead and merge the corresponding PR and close this issue.

jplatte avatar Oct 21 '24 18:10 jplatte

@yanns I personally see it as the extractor doing the "unsafe" operation. You ask for the Host and acknowledge that you know what you're doing and then you can pass that value around and use it however you want. Plus the potential for implementing some traits I mentioned before.

This should also be easier to migrate to (which might be bad since people won't have to think about every usage site of Host).

But if you really think it would be better to have users call the spoofable_value method every time, I'll yield (unless @bengsparks wants to argue for the Spoofable<T> wrapper more).

mladedav avatar Oct 23 '24 13:10 mladedav

I personally like my solution for the fact that migration is simple and that the use-site is clearly visible.

I'd argue that outside of creating different extractors for reading different headers, there is no fool-proof way to achieve the desired goal. If a user doesn't want to read docs / ensure safety / take other precautions, then there is nothing to be done.

With my way, there is a handler-specific marker that calls for further inspection during code review instead of potentially being buried inside of the handler at any given position.

Suggestions and adjustments to the naming of Spoofable and other nomenclature in my PR are welcome :)

bengsparks avatar Oct 23 '24 14:10 bengsparks

My only fear is about:

  • one developer does a change to use a spoofable value.
  • later another developer is using this value without knowing that it can be spoofed.

But I don't have strong opinion here. I guess I'm very (too) sensitive to developers not being careful about security...

yanns avatar Oct 29 '24 10:10 yanns

I have the feeling that the community consensus is tending towards Spoofable<T> wrapper. To make progress, let's go for this. It's always a step in more security awareness.

yanns avatar Nov 07 '24 03:11 yanns

My other comment is that the current Host extractor is misleading. Personally, I'd have used it by assuming it's only using the Host header. By making it Spoofable, it's clearer that the value van be read from different headers.

yanns avatar Nov 07 '24 04:11 yanns

One thing to note is that while this does raise awareness, there is no way for the user to act on that knowledge. The user can choose to not use the extractor at all, but a far more valuable ability would be to selectively choose only the non-spoofable part, which is not possible, as far as I can tell.

Most notably, I don't see any way that the extractors could be used to implement things like the common TRUSTED_PROXIES setting, where the X-Forwarded-* headers are considered safe selectively based on the source IP address.

sclu1034 avatar Nov 07 '24 08:11 sclu1034

@sclu1034 I agree with you. What kind of approach would you see here? Should we change those extractors to:

  • have extractors only about non-spoofable values
  • have separated extractors also considering the spoofable part
  • add extractor where we can configure the trusted_proxies ?

yanns avatar Nov 13 '24 14:11 yanns

Ideally, it would be a solution where the logic whether a connection can be trusted has to be implemented only once, rather than again for every extractor or handler.

So I think some kind of middleware that can dynamically strip headers would be best. At that point, the extractors could stay as they are, and simply act as if that header was never sent. And a centralized solution like that would also ensure that manual access to these headers can be trusted likewise.

sclu1034 avatar Nov 13 '24 15:11 sclu1034

+1 to the middleware suggestion by @sclu1034; the best solution would be such headers simply never arrive. I was about to cite this comment in support, but then I realised that they're also the author thereof 😄

bengsparks avatar Nov 14 '24 11:11 bengsparks

FWIW after reading up on the discussion here, I'm not a huge fan of the spoofable proposal 🫣

if the goal is to make users aware of the spoofing risks of the Host and Scheme extractors, how about making these extractors configurable and defaulting to only reading from "safe" sources? that way, the user has to explicitly configure the extractor to read from a spoofable source, making them aware of the risk.

if we introduce FromSpoofableRequestParts, do we then also introduce OptionalFromSpoofableRequestParts? it feels like this might be getting out of hand soon 😅

Turbo87 avatar Dec 26 '24 09:12 Turbo87

Well, I don't think there is any non-spoofable source for host.

But just implementing impl FromRequestParts for Spoofable<Scheme> and any other types excplicitly wouldn't add more traits. That would mean users cannot implement the same for their own types and we couldn't in axum-extra, which might be desirable for some libraries but I don't think that's too bad.

mladedav avatar Jan 06 '25 16:01 mladedav

I don't think there is any non-spoofable source for host.

What's there to "spoof" anyways? Host is not usually used by reverse proxies, so no third party involved. And curl -H 'Host: foo.com' bar.com, curl -H 'Host: foo.com' foo.com or curl -H 'Host: foo.com' 1.2.3.4 all look the same to the server:

GET / HTTP/1.1
Host: foo.com

sclu1034 avatar Jan 13 '25 09:01 sclu1034

Very late to (this) party but introducing the opposite of "spoofable" would allow the possibility of differentiating between a bunch of options of why a value is "safe" (also avoids the convention clash with "unsafe"):

  • The value has been set by the framework and is not influenced by user input (that is the scope of this discussion)
  • The value is a member of a constant set
  • The value has cryptographic integrity protections - here additional variants could be possible such as
    • ... and not influenced by user input (such as a signed cookie)
    • ... and protected against replay attacks (such as a JWT with iat and exp)

Just my 5 cents.

yawn avatar Mar 25 '25 16:03 yawn

What's there to "spoof" anyways? Host is not usually used by reverse proxies, so no third party involved. And curl -H 'Host: foo.com' bar.com, curl -H 'Host: foo.com' foo.com or curl -H 'Host: foo.com' 1.2.3.4 all look the same to the server:

When a reverse proxy or web server is hosting multiple domains (a common setup called "virtual hosting"), it uses the Host header to determine which upstream (backend) service or site should handle the request.

yshing avatar Jun 15 '25 08:06 yshing

When a reverse proxy or web server is hosting multiple domains (a common setup called "virtual hosting"), it uses the Host header to determine which upstream (backend) service or site should handle the request.

And how would that be used to spoof anything?

X-Forwarded-For is "spoofable" insofar that there are three systems in the path: client, proxy & server, where the server doesn't have direct contact to the client, and needs to trust the reverse proxy to be given correct information about the client's identity. If the client is allowed to set X-Forwarded-For to arbitrary values, and the server trusts it as if came from the reverse proxy, then the client can make the server believe they are someone else. That's the spoofing.

But when the client sets Host: foo.com then it simply gets the content for the domain foo.com. If Host: bar.com, it gets the content for bar.com. And that's strictly equivalent to typing foo.com or bar.com in your address bar. The Host header cannot be used to identify a client in the first place, so there is nothing to spoof by it.

sclu1034 avatar Jun 16 '25 08:06 sclu1034

But when the client sets Host: foo.com then it simply gets the content for the domain foo.com. If Host: bar.com, it gets the content for bar.com. And that's strictly equivalent to typing foo.com or bar.com in your address bar. The Host header cannot be used to identify a client in the first place, so there is nothing to spoof by it.

@sclu1034 See the current implementation of Host extractor here:

https://github.com/tokio-rs/axum/blob/ff031867df7126abe288f13a62c51849c9e544af/axum-extra/src/extract/host.rs#L17-L20

yshing avatar Jul 29 '25 17:07 yshing