agents Tool Call Results Lost During User Interruption Leading to Duplicate Executions

Tool Call Results Lost During User Interruption Leading to Duplicate Executions

Open jiahao6635 opened this issue 1 month ago • 4 comments

## 🐛 Bug Report: Tool Call Results Lost During User Interruption

### Summary
When a user interrupts the agent during or immediately after tool execution, the completed tool calls and their results are not saved to the chat history. This causes the LLM to be unaware that tools have already been executed in the next inference turn, leading to duplicate tool executions.

### Environment
- **livekit-agents version**: 1.1.6 (verified the issue still exists in 1.2.15)
- **Python version**: 3.9+
- **Use case**: Voice agent handling restaurant reservations

### Impact
- **Critical**: In production environments, this causes duplicate operations (e.g., creating duplicate orders)
- **Data integrity**: Multiple identical records created in database
- **User experience**: Users receive incorrect confirmations and duplicated resources

---

### 🔄 Steps to Reproduce

1. **Setup**: Voice agent with a tool that creates resources (e.g., `create_reservation`)
2. **User request**: User initiates a modification flow (e.g., "Change to 6 people")
3. **LLM executes tool**: Agent calls `create_reservation` and successfully creates Order #1
4. **User interrupts**: User speaks (VAD triggered) immediately after tool completion but before agent finishes speaking
5. **LLM continues**: Agent processes user's next input
6. **Result**: Agent calls `create_reservation` again, creating duplicate Order #2

### Expected Behavior
The completed tool call and its result should be saved to chat history even when interrupted, so the LLM knows not to re-execute the same tool.

### Actual Behavior
The tool executes successfully but its result is not recorded in the chat history when interruption occurs, causing the LLM to repeat the tool call.

---

### 📊 Real-World Example

#### Timeline from Production Logs

| Time | Event | Chat History State |
|------|-------|-------------------|
| 14:29:38.705 | User: "我要改成6个人" (Change to 6 people) | - |
| 14:29:50.289 | **LLM outputs tool call**: `create_reservation` | - |
| 14:29:50.583 | **✅ Order created successfully**: NT02011172025102200047 | ❌ Not recorded |
| 14:29:52.063 | **⚠️ User interrupts** (VAD triggered) | Only partial text saved |
| 14:29:53.736 | **LLM inference #2** starts | ❌ Unaware of Order #047 |
| 14:29:54.599 | LLM outputs tool call again: `create_reservation` | - |
| 14:29:59.909 | **💥 Duplicate order created**: NT02011172025102200048 | - |

#### Key Log Excerpts

livekit.agents|14:29:50,584|tool耗时:create_reservation elapsed: 292.86 ms livekit.agents|14:29:52,063|Speech handle 打断 livekit.agents|14:29:52,063|_pipeline_reply_task interrupted position2


**Result**: Two identical orders created with same parameters (6 people, 2025-10-22 15:30:00, same location and decorations)

---

### 🔍 Root Cause Analysis

#### Problem Location

**File**: `livekit-agents/livekit/agents/voice/agent_activity.py`
**Function**: `_pipeline_reply_task()`
**Lines**: ~1577-1593 (v1.1.6) and similar in v1.2.15

#### Current Code (Interrupted Path)

```python
if forwarded_text:
    msg = chat_ctx.add_message(
        role="assistant",
        content=forwarded_text,  # ❌ Only saves partial text
        id=llm_gen_data.id,
        interrupted=True,
        created_at=reply_started_at,
    )
    self._agent._chat_ctx.insert(msg)
    self._session._conversation_item_added(msg)

await utils.aio.cancel_and_wait(exe_task)
return  # ❌ Early return skips tool result saving

Issues:

❌ No tool_calls parameter in the message
❌ No tool result messages (role="tool") created
❌ Early return bypasses the tool saving logic at line ~1607

Why Tools Complete But Results Are Lost

File: livekit-agents/livekit/agents/voice/generation.py Function: _execute_tools_task()

except asyncio.CancelledError:
    # Wait for pending tools to complete
    await asyncio.gather(*tasks)  # ✅ Tools do finish
    # ❌ But results in tool_output.output are never accessed

Tools are protected by asyncio.shield() and complete successfully, but when exe_task is cancelled via cancel_and_wait(), the caller never retrieves tool_output.output.

Flow Diagram

┌──────────────────────────────────────────────┐
│  LLM generates response + calls tools        │
└──────────────┬───────────────────────────────┘
               │
        User interrupts?
        ┌──────┴──────┐
     Yes│             │No
        ↓             ↓
  ┌───────────┐  ┌──────────────┐
  │ INT path  │  │ Normal path  │
  │ • Save    │  │ • Save text  │
  │   text    │  │ • ✅ Save    │
  │ • ❌ Skip │  │   tool calls │
  │   tools   │  │ • ✅ Save    │
  │ • return  │  │   results    │
  └───────────┘  └──────────────┘

I'm willing to contribute this fix!

Oct 22 '25 11:10 jiahao6635

agents agents copied to clipboard

Tool Call Results Lost During User Interruption Leading to Duplicate Executions

Why Tools Complete But Results Are Lost

Flow Diagram

agents
agents copied to clipboard