claude-code icon indicating copy to clipboard operation
claude-code copied to clipboard

[BUG] Claude CLI hangs at 97% CPU after MCP response processing (Bun/JSC GC contention)

Open numbata opened this issue 2 weeks ago • 1 comments

Preflight Checklist

  • [x] I have searched existing issues and this hasn't been reported yet
  • [x] This is a single bug report (please file separate reports for different bugs)
  • [x] I am using the latest version of Claude Code

What's Wrong?

Claude Code CLI becomes completely unresponsive with sustained 97-115% CPU usage after receiving certain MCP responses. The MCP server remains healthy (0% CPU), confirming the bug is in Claude CLI's response processing, likely due to Bun/JavaScriptCore GC lock contention.

What Should Happen?

Expected Actual
CLI processes response, returns control CLI freezes completely
Normal CPU usage 97-115% CPU sustained
Responsive to input Cannot type or press ESC

Error Messages/Logs

No error messages displayed. Process silently hangs with no output.

Process state during hang:
  PID: 95445
  State: R+ (Running, not blocked)
  CPU: 97-115% sustained
  Memory: 2.7GB

Key stack pattern (from 10s sample, 7050/7050 samples):
  Main thread stuck in:
    → os_unfair_lock_trylock/lock/unlock (lock contention)
    → pthread_getspecific (TLS access)

  7 JavaScriptCore "Heap Helper Threads" active (GC running)

MCP server (chrome-devtools) is healthy at 0% CPU, confirming the issue is in Claude CLI response processing, not the MCP server.

Steps to Reproduce

  1. Start Claude Code with chrome-devtools MCP configured
  2. Use chrome-devtools to interact with a browser page with console output
  3. Execute get_console_message
  4. CLI freezes immediately after the call

Claude Model

Opus

Is this a regression?

I don't know

Last Working Version

No response

Claude Code Version

2.0.76

Platform

Anthropic API

Operating System

macOS

Terminal/Shell

Other

Additional Information

Diagnostic Evidence

Environment

  • Claude Code: 2.0.76
  • OS: macOS 15.7.1 (Darwin 24.6.0)
  • Platform: x86-64
  • Runtime: Bun (JavaScriptCore)

Process States

Claude CLI:         PID 95445, R+ (Running), 97% CPU, 2.7GB RAM
chrome-devtools:    PID 95770, S+ (Sleeping), 0% CPU, 82MB RAM

Stack Trace (10s sample, 7050 samples)

Main thread (100% of samples):
  start (dyld)
    → Bun runtime initialization
      → JavaScriptCore execution
        → [JIT code]
          → os_unfair_lock_trylock/lock/unlock (heavy contention)
          → pthread_getspecific (TLS access, 31+ samples)

Key findings:

  • All 7050 samples in main thread busy-loop
  • Heavy os_unfair_lock_* operations (lock contention)
  • 7 "Heap Helper Threads" active (JavaScriptCore GC)
  • No blocking I/O - process is CPU-bound spinning

MCP Server State (confirms bug is in CLI)

chrome-devtools-mcp:
  Main thread: kevent (waiting for I/O - normal idle)
  V8 Workers: pthread_cond_wait (normal idle)
  CPU: 0%

The MCP server successfully processed the request and is idle, waiting for the next request.

Root Cause Hypothesis

Based on evidence, the issue is in Bun/JavaScriptCore GC interaction during response processing:

  1. get_console_message returns response with large/complex data
  2. JSON parsing triggers heavy memory allocation
  3. JavaScriptCore GC kicks in under memory pressure (2.7GB RSS)
  4. Main thread and GC threads contend for os_unfair_lock
  5. Lock contention creates busy-wait spin loop
  6. Process becomes unresponsive

Note: Bun v1.2.2 fixed "a scheduling issue causing JavaScriptCore's garbage collector timers to not always run" - this may be related.

Proposed Fixes

1. Update Bun Runtime (Immediate)

Update to Bun >= 1.2.2 with GC scheduling fixes.

2. Add Response Processing Timeout (High Priority)

const RESPONSE_TIMEOUT_MS = 30000;

async function processResponse(response: JSONRPCMessage) {
  await Promise.race([
    handleResponse(response),
    new Promise((_, reject) =>
      setTimeout(() => reject(new Error('timeout')), RESPONSE_TIMEOUT_MS)
    )
  ]);
}

3. Add Event Loop Yields (Medium Priority)

// In message processing loop
if (++processedCount % 10 === 0) {
  await new Promise(resolve => setImmediate(resolve));
}

4. Add Response Size Limits (Medium Priority)

const MAX_SIZE = 10 * 1024 * 1024; // 10MB
if (line.length > MAX_SIZE) {
  throw new Error(`Response exceeds max size`);
}

Workarounds

Until fixed, users can:

  1. Clear browser console before MCP calls (console.clear())
  2. Avoid get_console_message on pages with heavy logging
  3. Kill process if memory exceeds 3GB
  4. Set ulimit -v 4000000 before starting Claude

Debug Data

Full diagnostic archive available:

claude_debug_95445_sanitized.tar.gz

Contents:

  • stack_sample_10s.txt - CPU stack sample
  • vmmap.txt - Virtual memory map
  • heap_summary.txt - Memory breakdown
  • chrome_devtools_sample.txt - MCP server stack (confirms idle)
  • Process trees, network state, IPC pipes

Related Issues

  • #1554 - Hanging/Freezing mid-work
  • #4580 - JSON serialization freeze with 100% CPU
  • #6474 - 120% CPU hang, multiple threads stuck
  • #7401 - /resume hogs CPU

Key difference: In this case, the MCP server is still running and healthy. The bug is specifically in response processing, not server disconnect handling.


Additional Details: See attached claude_cli_fix_proposal.md for complete technical analysis and implementation specifications.

numbata avatar Dec 31 '25 18:12 numbata