nativelink icon indicating copy to clipboard operation
nativelink copied to clipboard

Retry logic introduces too much cognitive complexity

Open aaronmondal opened this issue 2 years ago • 2 comments

Spin off from https://github.com/TraceMachina/turbo-cache/pull/324 and I've noticed this in other parts of the codebase as well. The current way to add retry capabilities is this:

self.retrier
    .retry(
        retry_config,
        unfold(thing, move |thing| async move {
            // Do something.
        }),
    )
    .await

That's three levels of nesting where the // Do something can get fairly complex as it basically contains the entire stream/state handling logic. Surely there is a more readable way to do this.

aaronmondal avatar Oct 22 '23 14:10 aaronmondal

Maybe we could hide this away in a macro, but due to rust's borrow checker I'm not sure we can.

.retry() takes in a stream, but you cannot pass ownership of an object into a stream, so we use unfold to take ownership of it and pass the object in and out of the functor.

allada avatar Oct 22 '23 14:10 allada

I came across this: https://docs.rs/tower/latest/tower/retry/trait.Policy.html

Given that event this example is even more complicated, I'm not sure how much more simpler we can make ours.

I want to also point out that the function signature looks more like (notice config is passed in at construction):

self.retrier
    .retry(
        unfold(thing, move |thing| async move {
            // Do something.
        }),
    )
    .await

Given this, we could change this to have the user instead pass a lambda that creates a future that is called (maybe) multiple times and returns Option<Result<T, E>>, but I don't see this any more valuable than what we have. The only difference being we use Stream vs a lamba. The advantage of using a stream is that you are guaranteed that the only one future will ever be in-flight at a given time, allowing the future to access items owned by the stream as mut, where doing it as a lambda that returns a future gives no such guarantee, forcing users to use mutexes or similar concurrency protections.

allada avatar Aug 04 '24 05:08 allada