axum
axum copied to clipboard
Extracting `MatchedPath` in middleware will not always give the final matched path
Bug Report
Version
axum v0.6.0-rc.2
Platform
Darwin 21.1.0 Darwin Kernel Version 21.1.0: Wed Oct 13 17:33:23 PDT 2021; root:xnu-8019.41.5~1/RELEASE_X86_64 x86_64
Description
When extracting MatchedPath in middleware, it may not have the correct value if the request end up being handled by a nested Router.
A simplified example:
use std::{
convert::Infallible,
fmt::Debug,
mem,
task::{Context, Poll},
};
use axum::{
body::Body, extract::MatchedPath, http::Request, response::Response, routing::get, Router,
};
use futures::future::BoxFuture;
use tower::{Layer, Service, ServiceExt};
#[derive(Debug, Clone)]
pub struct LogService<S>
where
S: Debug + Clone,
{
inner: S,
}
impl<S, B> Service<Request<B>> for LogService<S>
where
S: Service<Request<B>, Response = Response, Error = Infallible>
+ Debug
+ Clone
+ Send
+ 'static,
S::Future: Send,
B: Send + 'static,
{
type Response = Response;
type Error = Infallible;
type Future = BoxFuture<'static, Result<Self::Response, Self::Error>>;
fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
self.inner.poll_ready(cx)
}
fn call(&mut self, req: Request<B>) -> Self::Future {
let clone = self.inner.clone();
let mut inner = mem::replace(&mut self.inner, clone);
Box::pin(async move {
let matched_path: MatchedPath = req
.extensions()
.get()
.cloned()
.expect("Request is missing MatchedPath");
let resp = inner.call(req).await;
println!(
"Request to {} complete: {}",
matched_path.as_str(),
resp.as_ref().unwrap().status()
);
resp
})
}
}
pub struct LogLayer;
impl<S> Layer<S> for LogLayer
where
S: Debug + Clone,
{
type Service = LogService<S>;
fn layer(&self, inner: S) -> Self::Service {
LogService { inner }
}
}
#[tokio::test]
async fn bug() {
let mut router = Router::new()
.route(
"/root",
get(|matched_path: MatchedPath| async move {
assert_eq!("/root", matched_path.as_str());
}),
)
.merge(Router::new().route(
"/merged",
get(|matched_path: MatchedPath| async move {
assert_eq!("/merged", matched_path.as_str());
}),
))
.nest(
"/nested",
Router::new()
.route(
"/",
get(|matched_path: MatchedPath| async move {
assert_eq!("/nested/", matched_path.as_str());
}),
)
.route(
"/deeper",
get(|matched_path: MatchedPath| async move {
assert_eq!("/nested/deeper", matched_path.as_str());
}),
),
)
.layer(LogLayer);
async fn call(router: &mut Router, uri: &str) {
let resp = router
.ready()
.await
.unwrap()
.call(Request::builder().uri(uri).body(Body::empty()).unwrap())
.await
.unwrap();
assert!(resp.status().is_success());
}
call(&mut router, "/root").await;
call(&mut router, "/merged").await;
call(&mut router, "/nested").await;
call(&mut router, "/nested/deeper").await;
}
I'm expecting the output to be:
Request to /root complete: 200 OK
Request to /merged complete: 200 OK
Request to /nested complete: 200 OK
Request to /nested/deeper complete: 200 OK
But what I'm getting is:
Request to /root complete: 200 OK
Request to /merged complete: 200 OK
Request to /nested complete: 200 OK
Request to /nested/*__private__axum_nest_tail_param complete: 200 OK
This is a limitation with how nesting works. Since nested routers internally use wildcards you need to add the layer to the inner router as well.
I have spent quite a bit of time on this in the past and don't believe it's possible to fix.
The root issue is that when your middleware is called all we know is that the prefix for some nested router matched. Since the middleware is on the outer most router.
We haven't actually called the inner router yet so don't know which route matches in the inner router.
If MatchedPath was able to be mutated interiorly in the set_matched_path function in RouterService, it seems like that would work. I think MatchedPath::as_str() -> &str wouldn't work, though, and there would need to be a function to get a clone of an internally stored String from a Mutex. Thoughts?
I think it's a bug that MatchedPath will ever get you something including __private__axum_nest_tail_param, so I've labeled this as such. I'm not really sure what the right behavior would be though...
If MatchedPath was able to be mutated interiorly in the set_matched_path function in RouterService, it seems like that would work.
I don't think would work. The issue is that we don't know the full matched path when the middleware is called. We just know that it matched something that matched a certain prefix.
We have to actually call that inner service/router to figure out even if it matched something at all. At that point the middleware has already run, so we cannot update the matched path that the middleware got.
We can however update what the handler sees, which we do.
I think it's a bug that MatchedPath will ever get you something including __private__axum_nest_tail_param, so I've labeled this as such. I'm not really sure what the right behavior would be though...
Yeah I dunno what to do either. Not allowing MatchedPath to be extracted at all in middleware seems overly restrictive.
Deprecate MatchedPath? x)
Or maybe make its extraction fallible and only succeed when match is a route, not a nested service / router?
Or maybe make its extraction fallible and only succeed when match is a route, not a nested service / router?
We could do that yes.
I don't think would work. The issue is that we don't know the full matched path when the middleware is called. We just know that it matched something that matched a certain prefix.
I understand that the final matched path isn't known when the middleware is called, but hear me out.
Right now the MatchedPath in the request extensions is replaced as it goes through the router(s). What I'm proposing is that when it the router pulls an existing MatchedPath off the request, it modifies it (interior mutability; struct MatchedPath(Arc<Mutex<String>>)?), without putting a new instance in the request extensions. This way it'll get updated for any middleware that had grabbed it earlier.
I think there would still be a problem of middleware that attempts to get the string version of the path from it before calling the next service wouldn't have the final matched path. But it also seems that MatchedPath can know this (I'm basing that on the existence of that *__private__axum_nest_tail_param string) and there could be a method to indicate the match hasn't completed, or the method to get the string could return Err, or something.
It may not help everyone (those who want the full MatchedPath before calling the next service), but it will help some. It would work in the scenario where I ran into it.
Another idea (possibly in addition to the above), is to put MatchedPath on the response extensions. If it's not a 404, the Router would always have the complete matched route, right?
Maybe the version of MatchedPath with knowledge about whether it's complete is something like:
struct MatchedPath(Arc<Mutex<InnerMatchedPath>>);
struct InnerMatchedPath {
path: String,
final: bool,
}
Ah I see what you mean but tbh I think that'd be very confusing. Then the value of the matched path depends on whether you access the inner str before or after calling the next middleware.
Yes, I agree. It would be confusing. It seems there could be ways to make it more clear what's going on (based on MatchedPath having knowledge of whether the match has been completed or not), but it would still at least be frustrating.
What about putting the MatchedPath in the extensions on the response? That way it would be accessible to middleware on the way out? That would help some. It would work in my case.
Yes we could do that. Just needs to be clearly documented because we don't use response extensions anywhere else for user facing things.
But I still think we should do this
Or maybe make its extraction fallible and only succeed when match is a route, not a nested service / router?
As a user, I would be satisfied with that behavior. Certainly an improvement over how it is now.
We can do both / all, right? Only put the request extension in for route, not nest, modify the impl FromRequest to reject if the extension is missing instead of panicking.
Possibly add the extension for nested routers somehow, I guess we need something like this or even more complex anyways if we want to do the response extension.
Yes I think we should do both.
I wonder, is there any way to extract it in the middleware in the latest version? For example, I want to limit the URL method of each user, what should I do? (According to the document, it is processed in the handler, but this seems to require writing a lot of repetitive code
There isn't and I don't think that use case can ever work. The solution we are thinking about here is to make the full path available in the response extensions but that doesn't work for auth.
Is there any way to fix it currently?
Nope, and like I wrote above I don't think there is a way.
Well, it is possible for users to not use nest, instead adding the prefix to every route on the sub-router, and using Router::merge to combine it with the top-level router.
Yes there are work arounds. You can probably also add configuration to your middleware or only add it in some places, if you wanna authorized based on the matched path.
Yeah, just wanted to make sure that is clear (that there are workarounds). Also we fixed this issue, right? Or am I misremembering? 😄
We half fixed it. I think we could still make the full matched path available in response extensions so people can use it for logging (suggested here https://github.com/tokio-rs/axum/issues/1441#issuecomment-1272158039)
Ah, yes.
extracting MatchedPath in middleware will not get the final matched path for nested routes.
but a workaround is fallback to req.uri().path() (which I currently used in my axum-otel-metrics middleware):
since axum 0.6.0 (or maybe earlier, nested route will not get a /nested/*__private__axum_nest_tail_param anymore), this will make the whole tests pass:
fn call(&mut self, req: Request<B>) -> Self::Future {
let clone = self.inner.clone();
let mut inner = mem::replace(&mut self.inner, clone);
Box::pin(async move {
let matched_path = req
.extensions()
.get::<MatchedPath>()
.cloned().map_or_else(||{
println!("using req uri path: {}", req.uri().path());
req.uri().path().to_owned()
}, |matched_path| {
matched_path.as_str().to_string()
});
let resp = inner.call(req).await;
println!(
"Request to {} complete: {}",
matched_path,
resp.as_ref().unwrap().status()
);
resp
})
}
but a workaround is fallback to
req.uri().path()
That's problematic if you have path parameters in your route. MatchedPath would give you a value like /group/:id/members while req.uri().path() would give you something like /group/abcd1234/members.
but a workaround is fallback to
req.uri().path()That's problematic if you have path parameters in your route.
MatchedPathwould give you a value like/group/:id/memberswhilereq.uri().path()would give you something like/group/abcd1234/members.
yes. it can not get you the route name.