Tool calls hang indefinitely when confirmations arrive out of order

Open · wpfleger96 opened this issue 1 month ago · 1 comment

Problem

Tool calls in goosed hang permanently when multiple concurrent requests receive their confirmations out of order. The only way to recover is a manual pod restart.

Reproduction

Trigger 3-5 or more rapid concurrent tool calls in the same session (e.g., rapid Slack mentions tagging a Goose-powered Slackbot). Because of network timing, confirmations may arrive in a different order than the requests were issued, which causes the hang.

Root Cause

Code location: crates/goose/src/agents/tool_execution.rs lines 81-126

let mut rx = self.confirmation_rx.lock().await;
while let Some((req_id, confirmation)) = rx.recv().await {
    if req_id == request.id {
        break; // Found matching confirmation
    }
    // Bug: Non-matching confirmation is silently discarded
}

When confirmations arrive out of order, non-matching confirmations are discarded instead of being queued. Tool requests waiting for those discarded confirmations hang forever.
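
One possible direction, sketched here rather than taken from the goose codebase, is to park non-matching confirmations in a map keyed by request ID so that a later waiter can still find them. The names `ConfirmationInbox`, `Confirmation`, `wait_for`, and `demo_out_of_order` are hypothetical, and the sketch uses the blocking `std::sync::mpsc` channel to stay dependency-free, whereas the real code is async.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver};

/// Hypothetical stand-in for the real confirmation payload.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Confirmation {
    Approved,
    Denied,
}

/// Wraps the receiver together with a buffer of confirmations that were
/// read off the channel on behalf of some other request.
struct ConfirmationInbox {
    rx: Receiver<(String, Confirmation)>,
    parked: HashMap<String, Confirmation>,
}

impl ConfirmationInbox {
    fn new(rx: Receiver<(String, Confirmation)>) -> Self {
        Self { rx, parked: HashMap::new() }
    }

    /// Blocks until the confirmation for `request_id` is available.
    /// Returns `None` only if every sender is dropped before it arrives.
    fn wait_for(&mut self, request_id: &str) -> Option<Confirmation> {
        // An earlier waiter may already have parked our confirmation.
        if let Some(found) = self.parked.remove(request_id) {
            return Some(found);
        }
        while let Ok((id, confirmation)) = self.rx.recv() {
            if id == request_id {
                return Some(confirmation);
            }
            // Park the non-matching confirmation instead of discarding it.
            self.parked.insert(id, confirmation);
        }
        None
    }
}

/// Replays the out-of-order scenario: confirmation #2 arrives before #1.
fn demo_out_of_order() -> (Option<Confirmation>, Option<Confirmation>) {
    let (tx, rx) = channel();
    let mut inbox = ConfirmationInbox::new(rx);
    tx.send(("req-2".to_string(), Confirmation::Denied)).unwrap();
    tx.send(("req-1".to_string(), Confirmation::Approved)).unwrap();
    let first = inbox.wait_for("req-1");
    let second = inbox.wait_for("req-2");
    (first, second)
}
```

In this sketch both waiters get their confirmation even though the messages arrive in reverse order. In the async code the same idea would presumably mean keeping the parked map behind the same lock as `confirmation_rx`, or replacing the shared channel with one channel per request ID.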

Race Condition

1. Request #1 locks confirmation channel, starts waiting
2. Request #2 queued for lock
3. Confirmation #2 arrives first (network timing)
4. Request #1 receives Confirmation #2, discards it (ID mismatch)
5. Request #1 gets its confirmation and completes
6. Request #2 acquires lock but its confirmation was already discarded
7. Request #2 hangs forever

wpfleger96 · Nov 03 '25 23:11

Good find. We should have a timeout on this either way; it looks like if the other side never replies, we're also dead.
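
A minimal sketch of what such a timeout could look like, assuming a fixed overall deadline per request. `wait_with_deadline`, `WaitError`, and `demo_timeout` are hypothetical names, the `bool` payload stands in for the real confirmation type, and the blocking `std::sync::mpsc::Receiver::recv_timeout` is used only to keep the example dependency-free; the async code would need an equivalent from its runtime.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Hypothetical error type for a bounded wait.
#[derive(Debug, PartialEq)]
enum WaitError {
    TimedOut,
    ChannelClosed,
}

/// Waits for the confirmation matching `request_id`, giving up once
/// `timeout` has elapsed in total. Confirmations for other requests are
/// parked in `parked` rather than dropped.
fn wait_with_deadline(
    rx: &Receiver<(String, bool)>,
    parked: &mut HashMap<String, bool>,
    request_id: &str,
    timeout: Duration,
) -> Result<bool, WaitError> {
    if let Some(approved) = parked.remove(request_id) {
        return Ok(approved);
    }
    let deadline = Instant::now() + timeout;
    loop {
        // Shrink the wait on each pass so that unrelated messages
        // cannot extend the overall deadline.
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok((id, approved)) if id == request_id => return Ok(approved),
            Ok((id, approved)) => {
                parked.insert(id, approved);
            }
            Err(RecvTimeoutError::Timeout) => return Err(WaitError::TimedOut),
            Err(RecvTimeoutError::Disconnected) => return Err(WaitError::ChannelClosed),
        }
    }
}

/// The other side never replies to "req-1"; only "req-2" is confirmed.
fn demo_timeout() -> (Result<bool, WaitError>, Option<bool>) {
    let (tx, rx) = channel();
    let mut parked = HashMap::new();
    tx.send(("req-2".to_string(), true)).unwrap();
    let result = wait_with_deadline(&rx, &mut parked, "req-1", Duration::from_millis(50));
    (result, parked.get("req-2").copied())
}
```

With a bound like this, a request whose confirmation never arrives returns an error the caller can report, instead of holding the lock forever.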

DOsinga · Nov 06 '25 19:11