tower Design for the final form of the `Service` trait

I think now that return type notation (RTN), which allows using async functions in public traits, is making decent progress (see https://github.com/rust-lang/rust/pull/138424 and https://github.com/rust-lang/rust/issues/109417) we should wait for that to be ready and then move to the ultimate and final form of the Service trait in one swoop. We should agree on what it would look like here so that we are ready for when the time comes.

There are two main contenders at the moment: Permits vs Tokens. Both can be blanket implemented for a simpler async call(request: Request) -> Result<Response, Error> trait.

use std::{convert::Infallible, marker::PhantomData};

mod concurrency_limit_token;
mod concurrency_limit_permit;

// a generalization of service which allow back-propagation
pub trait TokenService<Request> {
    type Response;
    type Error;
    type Token;
    // separate error types allow more expressiveness
    type ReadyError;
    type UnreadyError;

    // arm the service
    async fn ready(&mut self) -> Result<Self::Token, Self::ReadyError>;

    // disarm the service
    async fn unready(&mut self, token: Self::Token) -> Result<(), (Self::Token, Self::UnreadyError)>;

    // can't call it call() again otherwise you'd have ambiguity with Service::call()
    async fn call_with_token(
        &mut self,
        request: Request,
        token: Self::Token,
    ) -> Result<Self::Response, Self::Error>;
}

pub trait PermitService<Request> {
    type Response;
    type Error;
    type PermitError;

    type Permit<'a>: Permit<Request, Response = Self::Response, Error = Self::Error>
    where
        Self: 'a;

    async fn ready(&self) -> Result<Self::Permit<'_>, Self::PermitError>;
}

pub trait Permit<Request> {
    type Response;
    type Error;

    async fn call(self, request: Request) -> Result<Self::Response, Self::Error>;
}

pub trait SimpleService<Request> {
    type Response;
    type Error;

    async fn call(&self, request: Request) -> Result<Self::Response, Self::Error>;
}

impl<Request, S> PermitService<Request> for S
where
    S: SimpleService<Request>,
{
    type Response = S::Response;
    type Error = S::Error;
    type PermitError = Infallible;

    type Permit<'a>
        = SimplePermit<'a, Self, Self::Response, Self::Error>
    where
        Self: 'a;

    async fn ready<'a>(&'a self) -> Result<Self::Permit<'a>, Self::PermitError> {
        Ok(SimplePermit {
            simple_service: self,
            response: PhantomData,
            error: PhantomData,
        })
    }
}

pub struct SimplePermit<'a, S, Response, Error> {
    simple_service: &'a S,
    response: PhantomData<Response>,
    error: PhantomData<Error>,
}
impl<'a, Request, S, Response, Error> Permit<Request> for SimplePermit<'a, S, Response, Error>
where
    S: SimpleService<Request, Response = Response, Error = Error>,
{
    type Response = Response;
    type Error = Error;

    async fn call(self, request: Request) -> Result<Self::Response, Self::Error> {
        self.simple_service.call(request).await
    }
}

pub struct SimpleToken;

impl<S, Request> TokenService<Request> for S
where
    S: SimpleService<Request>,
{
    type Response = S::Response;
    type Error = S::Error;
    type Token = SimpleToken;
    type ReadyError = Infallible;
    type UnreadyError = Infallible;

    async fn ready(&mut self) -> Result<Self::Token, Self::ReadyError> {
        Ok(SimpleToken)
    }

    async fn unready(&mut self, _token: Self::Token) -> Result<(), (Self::Token, Self::UnreadyError)> {
        Ok(())
    }

    async fn call_with_token(
        &mut self,
        request: Request,
        _token: Self::Token,
    ) -> Result<Self::Response, Self::Error> {
        self.call(request).await
    }
}

mod tests {
    use super::*;

    struct TestSimpleService;

    impl SimpleService<u8> for TestSimpleService {
        type Response = u8;
        type Error = Infallible;

        async fn call(&self, request: u8) -> Result<Self::Response, Self::Error> {
            Ok(request * 2)
        }
    }

    async fn tests() {
        let mut service = TestSimpleService;

        assert_eq!(
            <TestSimpleService as PermitService::<u8>>::ready(&service)
                .await
                .unwrap()
                .call(2)
                .await
                .unwrap(),
            4
        );

        let token = <TestSimpleService as TokenService<u8>>::ready(&mut service)
            .await
            .unwrap();
        assert_eq!(service.call_with_token(2, token).await.unwrap(), 4);
    }
}

Unresolved design questions:

[x] Is it worth having a different associated for each error type or is that unnecessary complexity. (Personally, I like it for the added expressiveness)
[x] &mut self vs &self in trait methods.
[ ] Do we need the unready() method for disarming or should we rely on the dropping of TokenService::Token for disarmament?

Related issues:

https://github.com/tower-rs/tower/issues/626
https://github.com/tower-rs/tower/issues/412
https://github.com/tower-rs/tower/issues/636
https://github.com/tower-rs/tower/issues/757

Jul 26 '25 17:07 ripytide

I think now that return type notation (RTN), which allows using async functions in public traits, is making decent progress (see rust-lang/rust#109417 and rust-lang/rust#109417)

You posted the same link twice.

Jul 26 '25 17:07 jplatte

You posted the same link twice.

Whoops, fixed.

Jul 26 '25 20:07 ripytide

A couple thoughts:

(1) I don't really understand the semantics of unready as described here. If it is successful it returns a Self::Token? Should it not drop the token if successful and (maybe) return it on failure. Moreover, I'm not quite sure I understand why disarming ought to be fallible in the first place (it seems much more natural to me for disarming to occur when a token is dropped (e.g. a token could contain a semaphore permit)).

(2) I think this version of a token is an improvement over poll_ready, however, it is worth pointing out that this doesn't actually prevent a token issued by one service being passed to a different service of the same type which seems clearly wrong. Putting implementors of Service in charge of detecting this at runtime feels like it defeats the purpose.

Personally, I'm very partial to the general approach laid out in #757, and after porting more of tower's functionality to an approach like that one, I'm a fan.

I largely kept their approach in-tact (taking &self, using a GAT for the Permit, having call solely take the permit by value, not take the service, etc) and built on the middleware they had already ported, however, I reverted their choice to not have these functions return a Result and I chose to split the trait in two like so:

pub trait Service<Request> {
    type Permit<'a>:  Permit<
       Request, 
       Response = Self::Response,
       Error = Self::Error>
    where
        Self: 'a;
    type Response;
    type Error;
    async fn ready(&self) -> Result<Self::Permit<'_>, Self::Error>;
}

pub trait Permit<Request> {
    type Response;
    type Error;
    async fn call(self, request: Request) -> Result<Self::Response, Self::Error>;
}

(I did make the choice here to enforce that ready and call return the same type of error. Several middleware call an inner service's ready inside their call and vice versa, so you either had several more services returning BoxError or defining their own enums. Also you probably could avoid having Response and Error on both Permit and Request, but the alternative has you occasionally writing stuff like. <<<P as Permit<Request>>::Service as Service<Request>>::Response which is horrible.)

On the one hand it is a little unfortunate that the crate's whole tagline is a async fn(request) -> Result<Response, Error> and the trait with the method with this signature isn't Service. On the other hand you maintain the static prevention of passing a service a token it didn't issue while preserving the nice service.ready().await?.call(request).await? syntax. Moreover, once you implement the Dyn versions of these traits as a workaround for async functions not being object safe, you have to split those into two traits anyway because traits with static methods aren't object safe.

Taking an immutable reference to self seems natural, and it also allows you to get rid of the InnerService: Clone bound on several middlewares. (And the handfulof middlewares that need mutability can make do with a Mutex). Permit being a GAT allows you to avoid some heap allocations that some of the current middleware implementations employ (e.g. ConcurrencyLimit) keeps an Arc<Semaphore> so it can store OwnedSemaphorePermits inside of itself to avoid being self-referential. Whereas in this version you can acquire a regular SemaphorePermit<'a> and store it in your ConcurrencyLimitPermit<'a, S>.

I will note that once or twice I have run into some really terse and arcane lifetime errors that seem to be compiler bugs e.g. https://github.com/rust-lang/rust/issues/145127 and https://github.com/rust-lang/rust/issues/100013. However, enabling -Zhigher-ranked-assumptions on nightly always fixed them so hopefully that will be integrated into stable soon? Edit: Also where Self: 'a infects every implementation of Service which is somewhat annoying.

Overall implementing implementing middleware with this interface was quite pleasant.

Aug 22 '25 00:08 cmlsharp

Excellent discussion points.

On point 1, this was an unintentional copy-paste error, indeed it makes no sense for unready() to return the token it should have been async fn unready(&mut self, token: Self::Token) -> Result<(), Self::UnreadyError>; or maybe async fn unready(&mut self, token: Self::Token) -> Result<(), (Self::Token, Self::UnreadyError)>;. The reason I made them results is for expressiveness again, you can always make a result never fail by using Infallible as the error type but you can't go do the reverse.

Point 2 is brilliant, I hadn't though of that, but yes I agree it defeats the purpose. I would despair except you've also given an alternative that seems to fix the issue!

I was unaware of the design in #757, I like the split trait + results version you've provided more though as it's more explicit. Though I would renege on sharing an error type, you can always use the same type by setting both associated types to the same type but enforcing that seems to limit expressiveness. I would suggest:

pub trait Service<Request> {
    type Response;
    type Error;
    type PermitError;

    type Permit<'a>: Permit<Request, Response = Self::Response, Error = Self::Error>
    where
        Self: 'a;

    async fn ready(&self) -> Result<Self::Permit<'_>, Self::PermitError>;
}

pub trait Permit<Request> {
    type Response;
    type Error;

    async fn call(self, request: Request) -> Result<Self::Response, Self::Error>;
}

I've updated my top comment to this design.

Aug 24 '25 13:08 ripytide

Thinking about this more, one unfortunate aspect of the GAT approach here is that to implement e.g. ReadyCache where you have to store a service and the permit that potentially borrows from it, you have to attach some lifetime to that permit. In standard self-referential struct fashion (I say standard, but this requires a lot of care to ensure there is no unsoundness in the interface), you can put the service on the heap and transmute the lifteime of the permit to 'static however, this imposes an undesirable S: 'static bound on the service (because of the where Self: 'a bound in the definition of Service::Permit<'a>). It makes sense why this bound is required with the code as written, but it only arises because the actual lifetime of Permit isn't nameable.

To avoid this, you could infect the definition of the ReadyCache struct with a generic lifetime 't of the permits it stores which is not ideal either. In principle, users of ReadyCache should not have to specify the lifetime of this borrow. Though you could perhaps smooth this over to some extent by having a generic 'lifetime extension' method which safely turns a ReadyCache<'t0, ...> into a ReadyCache<'t1,...> so long as S: 't1.

Some kind of 'unsafe lifetime is really what you want here a la https://github.com/rust-lang/rust/issues/130516 but this is doesn't appear to be anywhere near stabilization.

Aug 29 '25 17:08 cmlsharp

I suppose using GAT tokens means passing tokens to different threads will also be a pain. I wonder if maybe we should go back to the token design as although it required run-time checks to make sure the same service was used for each token, it was more explicit and easier to work with.

On a different topic, I came up with how to make a simple service blanket trait impl for the permit traits:

pub trait SimpleService<Request> {
    type Response;
    type Error;

    async fn call(&self, request: Request) -> Result<Self::Response, Self::Error>;
}

impl<Request, S> Service<Request> for S
where
    S: SimpleService<Request>,
{
    type Response = S::Response;
    type Error = S::Error;
    type PermitError = Infallible;

    type Permit<'a>
        = SimplePermit<'a, Self, Self::Response, Self::Error>
    where
        Self: 'a;

    async fn ready<'a>(&'a self) -> Result<Self::Permit<'a>, Self::PermitError> {
        Ok(SimplePermit {
            simple_service: self,
            response: PhantomData,
            error: PhantomData,
        })
    }
}

pub struct SimplePermit<'a, SS, Response, Error> {
    simple_service: &'a SS,
    response: PhantomData<Response>,
    error: PhantomData<Error>,
}
impl<'a, Request, SS, Response, Error> Permit<Request> for SimplePermit<'a, SS, Response, Error>
where
    SS: SimpleService<Request, Response = Response, Error = Error>,
{
    type Response = Response;
    type Error = Error;

    async fn call(self, request: Request) -> Result<Self::Response, Self::Error> {
        self.simple_service.call(request).await
    }
}

pub struct TestSimpleService;

impl SimpleService<u8> for TestSimpleService {
    type Response = u8;
    type Error = Infallible;

    async fn call(&self, request: u8) -> Result<Self::Response, Self::Error> {
        Ok(request * 2)
    }
}

async fn test() {
    assert_eq!(
        TestSimpleService
            .ready()
            .await
            .unwrap()
            .call(2)
            .await
            .unwrap(),
        4
    );
}

Aug 30 '25 14:08 ripytide

I suppose using GAT tokens means passing tokens to different threads will also be a pain.

Not necessarily. You can create a service that "extends" the lifetime of a permit issued by a service that looks like

 // Extends the lifetime of permits issued by S to 't
pub struct Extend<'t, S> {
    inner: Arc<S>,
    _marker: PhantomData<&'t ()>,
}

This service issues permits of the form:

pub struct ExtendPermit<'t, S, Request>
where S: Service<Request> + 't
{
    // declaration order is important here.
    // permit should be dropped before _svc which it borrows from
    // `inner` is constructed via mem::transmute(_svc.ready().await?)
    inner: S::Permit<'t>,
    // I think(?) we should be fine without a Pin<Arc<S>> 
    // so long as we never expose either member of this struct (even by reference)
    // We don't need S to be pinned forever, we just need to make sure its around until
    // this permit is consumed
    _svc: Arc<S>
}

(you could imagine there's a convenience combinator in the ServiceExt trait for constructing one).

So if you have a S: 'static service, then you can use Extend<'static, S> to produce 'static permits which could be shared across threads. One could imagine having some kind of sub trait (perhaps facilitated by evolving trait hierarchies) for "non-borrowing services" where for<'a> S::Permit<'a> = P for some P.

Here's a playground link to a full example. Some more careful thought would be required to ensure this is actually sound even for very weird instantiations of Permit though.

As above it is unfortunate that these structs even need a 't lifetime at all. If we had unsafe binders, I think you could do something like this:

pub struct Extend<S> {
    inner: Arc<S>,
}

pub struct ExtendPermit<S, Request>
where S: Service<Request>
{
    inner: Option<unsafe<'u> ManuallyDrop<S::Permit<'u>>>,
    _svc: Arc<S>
}

I wonder if maybe we should go back to the token design as although it required run-time checks to make sure the same service was used for each token

Just out of curiosity, what do you envision e.g. the ConcurrencyLimit service looking like with this interface? With the GAT approach, it's quite natural, the permit contains the semaphore permit which is consumed by call. Does your token contain anything or is it just a ZST?

Aug 30 '25 16:08 cmlsharp

I've updated the top comment again to show both contending designs.

Just out of curiosity, what do you envision e.g. the ConcurrencyLimit service looking like with this interface? With the GAT approach, it's quite natural, the permit contains the semaphore permit which is consumed by call. Does your token contain anything or is it just a ZST?

Good question. I'll have a go at an implementation. Here's what I've come up with:

use crate::TokenService;

/// Enforces a limit on the concurrent number of requests the underlying
/// service can handle.
#[derive(Debug)]
pub struct ConcurrencyLimit<S> {
    inner: S,
    sender: flume::Sender<()>,
    receiver: flume::Receiver<()>,
}

impl<S> ConcurrencyLimit<S> {
    pub fn new(inner: S, max: usize) -> ConcurrencyLimit<S> {
        // this could be unbounded or bounded with at least `max` capacity.
        // we choose bounded since it might allow allocation up front
        let (sender, receiver) = flume::bounded::<()>(max);

        for i in 0..max {
            sender.send(());
        }

        ConcurrencyLimit {
            inner,
            sender,
            receiver,
        }
    }
}

impl<S, Request> TokenService<Request> for ConcurrencyLimit<S>
where
    S: TokenService<Request>,
{
    type Response = S::Response;
    type Error = S::Error;
    type UnreadyError = S::UnreadyError;
    type ReadyError = S::ReadyError;
    type Token = S::Token;

    async fn ready(&mut self) -> Result<Self::Token, Self::ReadyError> {
        self.receiver.recv_async().await;

        match self.inner.ready().await {
            Ok(token) => Ok(token),
            Err(err) => {
                self.sender.send_async(()).await;
                Err(err)
            }
        }
    }

    async fn unready(
        &mut self,
        token: Self::Token,
    ) -> Result<(), (Self::Token, Self::UnreadyError)> {
        self.inner.unready(token).await?;

        self.sender.send_async(()).await;

        Ok(())
    }

    async fn call_with_token(
        &mut self,
        request: Request,
        token: Self::Token,
    ) -> Result<Self::Response, Self::Error> {
        let result = self.inner.call_with_token(request, token).await;

        self.sender.send_async(());

        result
    }
}

I like using channels more than mutexes or semaphores but I'd imagine there is an equivalent implementation possible using mutexes/semaphores.

Aug 31 '25 13:08 ripytide

Trying out the implementation of ConcurrencyLimit also made me realize that the blanket trait aproach of SimpleService doesn't work due to conflicting trait implementations that would require specialization. Instead we'd have to switch to a derive macro.

Aug 31 '25 14:08 ripytide