axum icon indicating copy to clipboard operation
axum copied to clipboard

Extracting `MatchedPath` in middleware will not always give the final matched path

Open dcormier opened this issue 3 years ago • 14 comments

Bug Report

Version

axum v0.6.0-rc.2

Platform

Darwin 21.1.0 Darwin Kernel Version 21.1.0: Wed Oct 13 17:33:23 PDT 2021; root:xnu-8019.41.5~1/RELEASE_X86_64 x86_64

Description

When extracting MatchedPath in middleware, it may not have the correct value if the request end up being handled by a nested Router.

A simplified example:

use std::{
    convert::Infallible,
    fmt::Debug,
    mem,
    task::{Context, Poll},
};

use axum::{
    body::Body, extract::MatchedPath, http::Request, response::Response, routing::get, Router,
};
use futures::future::BoxFuture;
use tower::{Layer, Service, ServiceExt};

#[derive(Debug, Clone)]
pub struct LogService<S>
where
    S: Debug + Clone,
{
    inner: S,
}

impl<S, B> Service<Request<B>> for LogService<S>
where
    S: Service<Request<B>, Response = Response, Error = Infallible>
        + Debug
        + Clone
        + Send
        + 'static,
    S::Future: Send,
    B: Send + 'static,
{
    type Response = Response;
    type Error = Infallible;
    type Future = BoxFuture<'static, Result<Self::Response, Self::Error>>;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        self.inner.poll_ready(cx)
    }

    fn call(&mut self, req: Request<B>) -> Self::Future {
        let clone = self.inner.clone();
        let mut inner = mem::replace(&mut self.inner, clone);

        Box::pin(async move {
            let matched_path: MatchedPath = req
                .extensions()
                .get()
                .cloned()
                .expect("Request is missing MatchedPath");

            let resp = inner.call(req).await;

            println!(
                "Request to {} complete: {}",
                matched_path.as_str(),
                resp.as_ref().unwrap().status()
            );

            resp
        })
    }
}

pub struct LogLayer;

impl<S> Layer<S> for LogLayer
where
    S: Debug + Clone,
{
    type Service = LogService<S>;

    fn layer(&self, inner: S) -> Self::Service {
        LogService { inner }
    }
}

#[tokio::test]
async fn bug() {
    let mut router = Router::new()
        .route(
            "/root",
            get(|matched_path: MatchedPath| async move {
                assert_eq!("/root", matched_path.as_str());
            }),
        )
        .merge(Router::new().route(
            "/merged",
            get(|matched_path: MatchedPath| async move {
                assert_eq!("/merged", matched_path.as_str());
            }),
        ))
        .nest(
            "/nested",
            Router::new()
                .route(
                    "/",
                    get(|matched_path: MatchedPath| async move {
                        assert_eq!("/nested/", matched_path.as_str());
                    }),
                )
                .route(
                    "/deeper",
                    get(|matched_path: MatchedPath| async move {
                        assert_eq!("/nested/deeper", matched_path.as_str());
                    }),
                ),
        )
        .layer(LogLayer);

    async fn call(router: &mut Router, uri: &str) {
        let resp = router
            .ready()
            .await
            .unwrap()
            .call(Request::builder().uri(uri).body(Body::empty()).unwrap())
            .await
            .unwrap();
        assert!(resp.status().is_success());
    }

    call(&mut router, "/root").await;
    call(&mut router, "/merged").await;
    call(&mut router, "/nested").await;
    call(&mut router, "/nested/deeper").await;
}

I'm expecting the output to be:

Request to /root complete: 200 OK
Request to /merged complete: 200 OK
Request to /nested complete: 200 OK
Request to /nested/deeper complete: 200 OK

But what I'm getting is:

Request to /root complete: 200 OK
Request to /merged complete: 200 OK
Request to /nested complete: 200 OK
Request to /nested/*__private__axum_nest_tail_param complete: 200 OK

dcormier avatar Oct 02 '22 15:10 dcormier

This is a limitation with how nesting works. Since nested routers internally use wildcards you need to add the layer to the inner router as well.

I have spent quite a bit of time on this in the past and don't believe it's possible to fix.

davidpdrsn avatar Oct 02 '22 16:10 davidpdrsn

The root issue is that when your middleware is called all we know is that the prefix for some nested router matched. Since the middleware is on the outer most router.

We haven't actually called the inner router yet so don't know which route matches in the inner router.

davidpdrsn avatar Oct 02 '22 16:10 davidpdrsn

If MatchedPath was able to be mutated interiorly in the set_matched_path function in RouterService, it seems like that would work. I think MatchedPath::as_str() -> &str wouldn't work, though, and there would need to be a function to get a clone of an internally stored String from a Mutex. Thoughts?

dcormier avatar Oct 03 '22 16:10 dcormier

I think it's a bug that MatchedPath will ever get you something including __private__axum_nest_tail_param, so I've labeled this as such. I'm not really sure what the right behavior would be though...

jplatte avatar Oct 04 '22 18:10 jplatte

If MatchedPath was able to be mutated interiorly in the set_matched_path function in RouterService, it seems like that would work.

I don't think would work. The issue is that we don't know the full matched path when the middleware is called. We just know that it matched something that matched a certain prefix.

We have to actually call that inner service/router to figure out even if it matched something at all. At that point the middleware has already run, so we cannot update the matched path that the middleware got.

We can however update what the handler sees, which we do.

I think it's a bug that MatchedPath will ever get you something including __private__axum_nest_tail_param, so I've labeled this as such. I'm not really sure what the right behavior would be though...

Yeah I dunno what to do either. Not allowing MatchedPath to be extracted at all in middleware seems overly restrictive.

davidpdrsn avatar Oct 04 '22 18:10 davidpdrsn

Deprecate MatchedPath? x)

Or maybe make its extraction fallible and only succeed when match is a route, not a nested service / router?

jplatte avatar Oct 04 '22 18:10 jplatte

Or maybe make its extraction fallible and only succeed when match is a route, not a nested service / router?

We could do that yes.

davidpdrsn avatar Oct 04 '22 19:10 davidpdrsn

I don't think would work. The issue is that we don't know the full matched path when the middleware is called. We just know that it matched something that matched a certain prefix.

I understand that the final matched path isn't known when the middleware is called, but hear me out.

Right now the MatchedPath in the request extensions is replaced as it goes through the router(s). What I'm proposing is that when it the router pulls an existing MatchedPath off the request, it modifies it (interior mutability; struct MatchedPath(Arc<Mutex<String>>)?), without putting a new instance in the request extensions. This way it'll get updated for any middleware that had grabbed it earlier.

I think there would still be a problem of middleware that attempts to get the string version of the path from it before calling the next service wouldn't have the final matched path. But it also seems that MatchedPath can know this (I'm basing that on the existence of that *__private__axum_nest_tail_param string) and there could be a method to indicate the match hasn't completed, or the method to get the string could return Err, or something.

It may not help everyone (those who want the full MatchedPath before calling the next service), but it will help some. It would work in the scenario where I ran into it.

Another idea (possibly in addition to the above), is to put MatchedPath on the response extensions. If it's not a 404, the Router would always have the complete matched route, right?


Maybe the version of MatchedPath with knowledge about whether it's complete is something like:

struct MatchedPath(Arc<Mutex<InnerMatchedPath>>);

struct InnerMatchedPath {
    path: String,
    final: bool,
}

dcormier avatar Oct 07 '22 22:10 dcormier

Ah I see what you mean but tbh I think that'd be very confusing. Then the value of the matched path depends on whether you access the inner str before or after calling the next middleware.

davidpdrsn avatar Oct 07 '22 23:10 davidpdrsn

Yes, I agree. It would be confusing. It seems there could be ways to make it more clear what's going on (based on MatchedPath having knowledge of whether the match has been completed or not), but it would still at least be frustrating.

What about putting the MatchedPath in the extensions on the response? That way it would be accessible to middleware on the way out? That would help some. It would work in my case.

dcormier avatar Oct 07 '22 23:10 dcormier

Yes we could do that. Just needs to be clearly documented because we don't use response extensions anywhere else for user facing things.

But I still think we should do this

Or maybe make its extraction fallible and only succeed when match is a route, not a nested service / router?

davidpdrsn avatar Oct 07 '22 23:10 davidpdrsn

As a user, I would be satisfied with that behavior. Certainly an improvement over how it is now.

dcormier avatar Oct 08 '22 02:10 dcormier

We can do both / all, right? Only put the request extension in for route, not nest, modify the impl FromRequest to reject if the extension is missing instead of panicking.

Possibly add the extension for nested routers somehow, I guess we need something like this or even more complex anyways if we want to do the response extension.

jplatte avatar Oct 08 '22 02:10 jplatte

Yes I think we should do both.

davidpdrsn avatar Oct 08 '22 07:10 davidpdrsn

I wonder, is there any way to extract it in the middleware in the latest version? For example, I want to limit the URL method of each user, what should I do? (According to the document, it is processed in the handler, but this seems to require writing a lot of repetitive code

tingfeng-key avatar Nov 21 '22 05:11 tingfeng-key

There isn't and I don't think that use case can ever work. The solution we are thinking about here is to make the full path available in the response extensions but that doesn't work for auth.

davidpdrsn avatar Nov 21 '22 07:11 davidpdrsn

Is there any way to fix it currently?

tingfeng-key avatar Nov 21 '22 08:11 tingfeng-key

Nope, and like I wrote above I don't think there is a way.

davidpdrsn avatar Nov 21 '22 08:11 davidpdrsn

Well, it is possible for users to not use nest, instead adding the prefix to every route on the sub-router, and using Router::merge to combine it with the top-level router.

jplatte avatar Nov 21 '22 09:11 jplatte

Yes there are work arounds. You can probably also add configuration to your middleware or only add it in some places, if you wanna authorized based on the matched path.

davidpdrsn avatar Nov 21 '22 09:11 davidpdrsn

Yeah, just wanted to make sure that is clear (that there are workarounds). Also we fixed this issue, right? Or am I misremembering? 😄

jplatte avatar Nov 21 '22 09:11 jplatte

We half fixed it. I think we could still make the full matched path available in response extensions so people can use it for logging (suggested here https://github.com/tokio-rs/axum/issues/1441#issuecomment-1272158039)

davidpdrsn avatar Nov 21 '22 09:11 davidpdrsn

Ah, yes.

jplatte avatar Nov 21 '22 09:11 jplatte

extracting MatchedPath in middleware will not get the final matched path for nested routes.

but a workaround is fallback to req.uri().path() (which I currently used in my axum-otel-metrics middleware):

since axum 0.6.0 (or maybe earlier, nested route will not get a /nested/*__private__axum_nest_tail_param anymore), this will make the whole tests pass:

    fn call(&mut self, req: Request<B>) -> Self::Future {
        let clone = self.inner.clone();
        let mut inner = mem::replace(&mut self.inner, clone);

        Box::pin(async move {
            let matched_path = req
                .extensions()
                .get::<MatchedPath>()
                .cloned().map_or_else(||{
                println!("using req uri path: {}", req.uri().path());
                req.uri().path().to_owned()
            }, |matched_path| {
                matched_path.as_str().to_string()
            });

            let resp = inner.call(req).await;

            println!(
                "Request to {} complete: {}",
                matched_path,
                resp.as_ref().unwrap().status()
            );

            resp
        })
    }

ttys3 avatar Nov 26 '22 15:11 ttys3

but a workaround is fallback to req.uri().path()

That's problematic if you have path parameters in your route. MatchedPath would give you a value like /group/:id/members while req.uri().path() would give you something like /group/abcd1234/members.

dcormier avatar Nov 30 '22 15:11 dcormier

but a workaround is fallback to req.uri().path()

That's problematic if you have path parameters in your route. MatchedPath would give you a value like /group/:id/members while req.uri().path() would give you something like /group/abcd1234/members.

yes. it can not get you the route name.

ttys3 avatar Nov 30 '22 16:11 ttys3