fix(agentwire): fix infinite loop when no agents enabled

Open TheButlah opened this issue 6 months ago • 2 comments

Prior to this PR, code like the following would block inside an async context if no agents were enabled.

impl Plan {
    async fn run(&mut self, orb: &mut Broker) -> Result<()> {
        let broker_fut = orb.run(self);
        let tout_fut = tokio::time::sleep(Duration::from_secs(4));
        tokio::select! {
            _ = tout_fut => info!("done sleeping"),
            // blocks forever the first time it's polled, which prevents the timeout from ever triggering
            result = broker_fut => result?,
        }

        Ok(())
    }
}

This happened because, if no agents are enabled, the macro-generated poll() function for the future returned by Broker::run() would loop forever without returning. It now detects that no agents are enabled and returns Poll::Pending, which ensures poll() is always non-blocking.
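To illustrate the guard described above, here is a minimal sketch using only the standard library. It is not the macro-generated agentwire code: `BrokerRun`, its fields, `NoopWake`, and `poll_once` are all hypothetical names made up for this example, and the real broker tracks agents differently.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the future returned by Broker::run().
struct BrokerRun {
    /// Number of currently enabled agents (illustrative field).
    enabled_agents: usize,
    /// Units of work left before the broker finishes (illustrative field).
    remaining: usize,
}

impl Future for BrokerRun {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        // Guard: with no agents enabled there is nothing to drive, so hand
        // control back to the executor instead of spinning inside poll().
        if self.enabled_agents == 0 {
            return Poll::Pending;
        }
        // Bounded work: this loop always terminates.
        while self.remaining > 0 {
            self.remaining -= 1;
        }
        Poll::Ready(())
    }
}

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once and returns the result.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

Calling `poll_once` on a `BrokerRun` with `enabled_agents: 0` returns immediately with `Poll::Pending` rather than hanging. Because `poll()` returns, a sibling branch such as the timeout in the `select!` above gets its turn to be polled. Note that this sketch never registers the waker, so a future in this state is only polled again when something else wakes the task.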

I've fixed the offending code and added a test case for it.

NOTE: This PR is only a partial fix. There is still an unhandled edge case, where a user-defined poll_extra can cause the same infinite-blocking behavior:

impl Broker {
    fn poll_extra(
        &mut self,
        _plan: &mut dyn PlanT,
        _cx: &mut Context,
        _fence: std::time::Instant,
    ) -> Result<Option<Poll<()>>> {
        Ok(None)
    }
}

However, this is not a regression: it also happened prior to this PR. Fixing it would be desirable, but even the partial fix in this PR is useful, so I would like to land this PR first and deal with the additional edge case later.

TheButlah avatar Jun 26 '25 17:06 TheButlah

@valff I would like to merge this PR soon, or at least get it reviewed.

TheButlah avatar Aug 08 '25 16:08 TheButlah

@valff could this be the reason for the freezes of orb-core on diamond that we saw recently?

AlexKaravaev avatar Aug 27 '25 09:08 AlexKaravaev