OpenCode API Race Condition in Concurrent Session Deletion

Open ThomasSteinbach opened this issue 1 month ago • 1 comment

Description

Summary

I discovered a race condition in the OpenCode Server API where concurrent DELETE requests to /session/{id} return success status codes (200/204) but don't reliably persist all deletions. When deleting 5 sessions simultaneously, approximately 30% of attempts leave 1-2 sessions in the system despite all DELETE requests returning success.
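
The root cause inside OpenCode is not established in this report. As an illustration of the general class of bug only, the sketch below shows how an unsynchronized read-modify-write on a shared session index can report success for every delete and still lose deletions. `SessionIndex`, `delete_unsafe`, `delete_locked`, and `run` are hypothetical names, not OpenCode code, and the sketch exaggerates the effect: it loses four of five deletions on every run, whereas the real failure is intermittent.

```python
import asyncio


class SessionIndex:
    """Hypothetical in-memory stand-in for a server-side session store."""

    def __init__(self, ids):
        self.ids = list(ids)
        self.lock = asyncio.Lock()

    async def delete_unsafe(self, sid):
        snapshot = list(self.ids)  # read the current index
        await asyncio.sleep(0)     # yield; other deletes read the same snapshot
        snapshot.remove(sid)       # modify the private copy
        self.ids = snapshot        # write back, overwriting concurrent writes
        return 200                 # success is reported either way

    async def delete_locked(self, sid):
        # Serializing the read-modify-write prevents the lost update.
        async with self.lock:
            return await self.delete_unsafe(sid)


async def run(method_name):
    index = SessionIndex(["s1", "s2", "s3", "s4", "s5"])
    method = getattr(index, method_name)
    statuses = await asyncio.gather(*(method(sid) for sid in list(index.ids)))
    return statuses, index.ids


print(asyncio.run(run("delete_unsafe")))  # all 200, yet four sessions survive
print(asyncio.run(run("delete_locked")))  # all 200, nothing survives
```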

Expected vs Actual Behavior

Expected: DELETE requests returning 200/204 should completely remove sessions. Subsequent GET /session should not include deleted sessions.

Actual: Some DELETE requests return success but sessions remain in the system.
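
The expectation can be stated as a small check. This is a minimal sketch, assuming (as the reproduction script does) that GET /session returns a JSON array of objects with an "id" field. `leaked_sessions` is a hypothetical helper name and the IDs in the example are placeholders.

```python
from typing import Iterable


def leaked_sessions(deleted_ids: Iterable[str], listed_sessions: Iterable[dict]) -> list[str]:
    """Return IDs that were reported deleted but still appear in the listing."""
    listed_ids = {session["id"] for session in listed_sessions}
    return [sid for sid in deleted_ids if sid in listed_ids]


# Placeholder data: "ses_b" was deleted with a success status but is still listed.
listing = [{"id": "ses_b"}, {"id": "ses_x"}]
print(leaked_sessions(["ses_a", "ses_b", "ses_c"], listing))  # ['ses_b']
```

An empty result is the expected behavior; any non-empty result is the bug.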

Impact

  • Session leaks (ghost sessions consuming resources)
  • Client/server state inconsistency
  • Unreliable batch cleanup operations
  • Flaky tests
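
Until the server is fixed, a client could mitigate unreliable batch cleanup by re-checking the listing and retrying survivors. The sketch below is one possible client-side workaround, not a fix, and it would mask the bug in tests. `delete_until_gone` and `FlakyStore` are hypothetical names; `FlakyStore` is an in-memory stand-in so the example runs without a server.

```python
import asyncio
from typing import Awaitable, Callable


async def delete_until_gone(
    session_ids: list[str],
    delete_one: Callable[[str], Awaitable[None]],
    list_ids: Callable[[], Awaitable[set[str]]],
    max_rounds: int = 3,
) -> list[str]:
    """Delete sessions, re-check the listing, and retry any that survived.

    Returns the IDs still present after max_rounds (empty list on success).
    """
    pending = list(session_ids)
    for _ in range(max_rounds):
        if not pending:
            break
        await asyncio.gather(*(delete_one(sid) for sid in pending))
        remaining = await list_ids()
        pending = [sid for sid in pending if sid in remaining]
    return pending


class FlakyStore:
    """Hypothetical stand-in: silently ignores the first DELETE for one ID."""

    def __init__(self, ids: list[str], flaky_id: str) -> None:
        self.ids = set(ids)
        self.flaky_id = flaky_id
        self.ignored_once = False

    async def delete(self, sid: str) -> None:
        if sid == self.flaky_id and not self.ignored_once:
            self.ignored_once = True  # "succeeds" but keeps the session
            return
        self.ids.discard(sid)

    async def list_ids(self) -> set[str]:
        return set(self.ids)


store = FlakyStore(["a", "b", "c"], flaky_id="b")
leftover = asyncio.run(delete_until_gone(["a", "b", "c"], store.delete, store.list_ids))
print(leftover)  # [] (the second round removes "b")
```

Against a real server, `delete_one` and `list_ids` would wrap the same DELETE /session/{id} and GET /session calls used in the reproduction script.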

OpenCode version

1.0.150

Steps to reproduce

I've created a standalone script that demonstrates the issue:

# Run multiple times to see the ~30% failure rate
for i in {1..10}; do uv run reproduce_concurrent_delete_bug.py; done
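
To put a number on the failure rate rather than eyeballing the loop output, the runs can be tallied by exit code. This is a minimal sketch; `count_failures` is a hypothetical helper, and it assumes the script exits non-zero when the bug reproduces.

```python
import subprocess
import sys


def count_failures(cmd: list[str], runs: int = 10) -> int:
    """Run cmd `runs` times and count the runs that exit non-zero."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures += 1
    return failures


# Example (assumes `uv` and the script are available in the current directory):
# failed = count_failures(["uv", "run", "reproduce_concurrent_delete_bug.py"], runs=20)
# print(f"{failed}/20 runs failed")
```

The script also exits with code 1 when the server fails to start or an unexpected error occurs, so this count is an upper bound on actual reproductions.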

The script:

  1. Starts an OpenCode server
  2. Creates 5 test sessions
  3. Deletes all 5 sessions simultaneously using asyncio.gather()
  4. Verifies deletion by listing remaining sessions
  5. Fails when sessions still exist despite successful DELETE responses

Here is the script:

reproduce_concurrent_delete_bug.py
#!/usr/bin/env python3
"""
Reproduction script for OpenCode API concurrent deletion bug.

This script demonstrates a race condition in the OpenCode Server API where
concurrent DELETE requests return success but don't reliably persist all deletions.

Expected behavior: All 5 sessions should be deleted
Actual behavior: ~30% of runs leave 1-2 sessions remaining after "successful" deletion

Requirements:
    pip install httpx

Usage:
    # Run once
    python reproduce_concurrent_delete_bug.py

    # Run multiple times to see intermittent failures (~30% failure rate)
    for i in {1..10}; do python reproduce_concurrent_delete_bug.py; done
"""

import asyncio
import subprocess
import sys
import time
from pathlib import Path

import httpx

# Configuration
OPENCODE_PORT = 3100
OPENCODE_URL = f"http://127.0.0.1:{OPENCODE_PORT}"
NUM_SESSIONS = 5  # Number of sessions to create and delete concurrently


async def wait_for_server(url: str, timeout: int = 10) -> bool:
    """Wait for OpenCode server to be ready."""
    print(f"⏳ Waiting for OpenCode server at {url}...")
    start = time.time()
    async with httpx.AsyncClient() as client:
        while time.time() - start < timeout:
            try:
                response = await client.get(f"{url}/session")
                if response.status_code == 200:
                    print("✅ Server is ready")
                    return True
            except httpx.ConnectError:
                pass  # Server is not accepting connections yet
            # Pause before every retry, including after a non-200 response,
            # so the loop never spins without yielding
            await asyncio.sleep(0.1)
    return False


async def create_sessions(url: str, count: int) -> list[str]:
    """Create multiple sessions and return their IDs."""
    print(f"\n📝 Creating {count} sessions...")
    session_ids = []

    async with httpx.AsyncClient() as client:
        for i in range(count):
            response = await client.post(f"{url}/session", json={"title": f"test-session-{i}"})
            response.raise_for_status()
            data = response.json()
            session_ids.append(data["id"])
            print(f"   ✅ Created session {i + 1}/{count}: {data['id']}")

    return session_ids


async def delete_sessions_concurrent(url: str, session_ids: list[str]) -> None:
    """Delete all sessions concurrently (this triggers the bug)."""
    print(f"\n🗑️  Deleting {len(session_ids)} sessions CONCURRENTLY...")

    async with httpx.AsyncClient() as client:
        # Delete all sessions at the same time - this is where the bug occurs
        tasks = [client.delete(f"{url}/session/{sid}") for sid in session_ids]
        responses = await asyncio.gather(*tasks)

        # All requests return success status codes
        for i, response in enumerate(responses):
            status = "✅" if response.status_code in (200, 204) else "❌"
            print(f"   {status} Session {i + 1}: HTTP {response.status_code}")


async def verify_deletion(url: str, expected_deleted_ids: list[str]) -> bool:
    """Verify that all sessions were actually deleted."""
    print("\n🔍 Verifying deletion...")

    async with httpx.AsyncClient() as client:
        response = await client.get(f"{url}/session")
        response.raise_for_status()
        remaining_sessions = response.json()

    # Check which of our sessions still exist
    remaining_ids = {session["id"] for session in remaining_sessions}
    still_exist = [sid for sid in expected_deleted_ids if sid in remaining_ids]

    if still_exist:
        print(
            f"❌ BUG REPRODUCED: {len(still_exist)} sessions still exist after 'successful' deletion:"
        )
        for sid in still_exist:
            print(f"   - {sid}")
        return False
    else:
        print(f"✅ All {len(expected_deleted_ids)} sessions deleted successfully")
        return True


async def main() -> int:
    """Main test flow."""
    print("=" * 70)
    print("OpenCode API Concurrent Deletion Bug Reproduction")
    print("=" * 70)

    # Start OpenCode server
    print(f"\n🚀 Starting OpenCode server on port {OPENCODE_PORT}...")
    workspace = Path.cwd()
    process = subprocess.Popen(
        ["opencode", "server", "--port", str(OPENCODE_PORT)],
        cwd=workspace,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )

    try:
        # Wait for server to be ready
        if not await wait_for_server(OPENCODE_URL):
            print("❌ Server failed to start")
            return 1

        # Create sessions
        session_ids = await create_sessions(OPENCODE_URL, NUM_SESSIONS)

        # Delete all sessions concurrently (this is where the bug happens)
        await delete_sessions_concurrent(OPENCODE_URL, session_ids)

        # Small delay to let any async processing complete
        await asyncio.sleep(0.1)

        # Verify deletion
        success = await verify_deletion(OPENCODE_URL, session_ids)

        print("\n" + "=" * 70)
        if success:
            print("✅ TEST PASSED: No bug detected in this run")
            print("=" * 70)
            return 0
        else:
            print("❌ TEST FAILED: Bug reproduced - concurrent deletes lost data")
            print("=" * 70)
            return 1

    finally:
        # Clean up: stop server
        print("\n🛑 Stopping OpenCode server...")
        process.terminate()
        try:
            process.wait(timeout=5)
        except subprocess.TimeoutExpired:
            process.kill()


if __name__ == "__main__":
    try:
        exit_code = asyncio.run(main())
        sys.exit(exit_code)
    except KeyboardInterrupt:
        print("\n\n⚠️  Interrupted by user")
        sys.exit(130)
    except Exception as e:
        print(f"\n\n❌ Unexpected error: {e}")
        sys.exit(1)

pyproject.toml
[project]
name = "opencode-bug-reproduction"
version = "0.1.0"
description = "Reproduction script for OpenCode API concurrent deletion bug"
readme = "README.md"
requires-python = ">=3.9"
dependencies = [
    "httpx>=0.27.0",
]

Screenshot and/or share link

No response

Operating System

macOS (latest)

Terminal

wezterm (latest)

ThomasSteinbach, Dec 14 '25 09:12

This issue might be a duplicate of existing issues. Please check:

  • #4251: Concurrent sessions working on different repos interfere each other
  • #3856: Race condition when applying changes immediately after each other on same file
  • #2882: Parallel edit tool calls on same file will error out
  • #4094: Bug: Permissioned tools in batched calls run out of order
  • #4999: Log File Deleted When Multiple IDE Instances Are Attached to OpenCode Server
  • #5466: Session list jumps to active session when confirming delete on non-current session

These issues share similar patterns of concurrent operations causing data inconsistency or state management problems. Feel free to ignore if your specific case addresses a unique scenario.

github-actions[bot], Dec 14 '25 10:12