Fix Windows worker startup failures from Bun zombie sockets
Bun leaves TCP sockets bound after termination on Windows (#12127), causing EADDRINUSE errors on worker restart. Previously, recovering from this required a system reboot.
Changes
ProcessManager.ts
- cleanupWindowsPort(): Uses netstat to find the PIDs holding the port, then kills them with taskkill /F
- start(): 3 retry attempts with 2s/4s/6s backoff (delays grow linearly, not exponentially), calls cleanup between attempts
- waitForHealth(): Scans recent log lines for EADDRINUSE, returns a specific error with the port number
- Port and PID validated before shell execution (command injection prevention)
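As a rough illustration of the netstat parsing and the port/PID validation described above, here is a minimal sketch. The helper names (isValidPort, isValidPid, parseNetstatPids, buildTaskkillCommand) are hypothetical and not taken from the PR, and the column layout assumed for netstat -ano output is the usual English-locale one.

```typescript
// Hypothetical helpers; the PR's actual implementation is not shown in this description.

/** Accept only integer ports in the valid TCP range. */
function isValidPort(port: number): boolean {
  return Number.isInteger(port) && port >= 1 && port <= 65535;
}

/** Accept only positive integer PIDs (PID 0 is never a process we should kill). */
function isValidPid(pid: number): boolean {
  return Number.isInteger(pid) && pid > 0;
}

/**
 * Extract the PIDs that own a TCP socket bound to `port` from `netstat -ano` output.
 * Assumed row shape: "  TCP    127.0.0.1:37777    0.0.0.0:0    LISTENING    12345"
 */
function parseNetstatPids(output: string, port: number): number[] {
  if (!isValidPort(port)) throw new RangeError(`Invalid port: ${port}`);
  const pids = new Set<number>();
  for (const line of output.split(/\r?\n/)) {
    const cols = line.trim().split(/\s+/);
    // Expect: proto, local address, foreign address, state, pid
    if (cols.length < 5 || cols[0] !== "TCP") continue;
    // Compare against the local address column only, including the colon,
    // so port 37777 does not match 377770 or a remote endpoint.
    if (!cols[1].endsWith(`:${port}`)) continue;
    const pid = Number(cols[cols.length - 1]);
    if (isValidPid(pid)) pids.add(pid);
  }
  return Array.from(pids);
}

/** Build the kill command only from a validated numeric PID. */
function buildTaskkillCommand(pid: number): string {
  if (!isValidPid(pid)) throw new RangeError(`Invalid PID: ${pid}`);
  return `taskkill /F /PID ${pid}`;
}
```

Because only validated integers are ever interpolated into the command string, text from netstat output cannot reach the shell unchecked.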
worker-service.ts
- Detects EADDRINUSE in catch handler, shows port-specific error message
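The EADDRINUSE detection could look roughly like the sketch below. The function names are hypothetical, and the message shape is an assumption modeled on Node's "listen EADDRINUSE: address already in use 127.0.0.1:37777"; the real log format may differ.

```typescript
// Hypothetical helpers; the message format is an assumption, not taken from the PR.

interface AddrInUse {
  found: boolean;
  port: number | null; // null when the matching line names no port
}

/** Scan the most recent `maxLines` log lines, newest first, for EADDRINUSE. */
function findAddrInUse(logLines: string[], maxLines: number = 50): AddrInUse {
  const recent = logLines.slice(-maxLines);
  for (let i = recent.length - 1; i >= 0; i--) {
    const idx = recent[i].indexOf("EADDRINUSE");
    if (idx === -1) continue;
    // Search only after the token so a timestamp such as "14:22:27"
    // is not mistaken for a port number.
    const tail = recent[i].slice(idx);
    const match = /(?:port\s+|:)(\d{1,5})\b/i.exec(tail);
    return { found: true, port: match ? Number(match[1]) : null };
  }
  return { found: false, port: null };
}

/** Turn the scan result into a port-specific message, or keep the fallback. */
function describeStartupError(result: AddrInUse, fallback: string): string {
  if (!result.found) return fallback;
  return result.port !== null
    ? `Port ${result.port} is already in use (EADDRINUSE).`
    : "The worker port is already in use (EADDRINUSE).";
}
```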
Error Flow
Before:
Attempt 1: EADDRINUSE → Manual intervention required → System reboot
After (Windows only):
Attempt 1: EADDRINUSE
→ netstat -ano | findstr :37777
→ taskkill /F /PID <zombie_pid>
→ wait 2s
Attempt 2: SUCCESS
Non-Windows platforms unaffected. All errors non-fatal with graceful degradation.
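The flow above can be sketched as a small retry loop. This is an illustration under assumptions, not the PR's code: RetryDeps, startWithRetry, and backoffDelayMs are hypothetical names, and the dependencies are injected so the loop can be exercised without real processes. Each delay is 2 s times the attempt number, which gives the 2s/4s/6s schedule. In this sketch the third attempt is the last, so only the 2 s and 4 s waits are actually used; when the 6 s wait applies is not stated in the description.

```typescript
// Hypothetical sketch of the retry flow; names and structure are assumptions.

type StartResult = { ok: true } | { ok: false; error: string };

interface RetryDeps {
  tryStart: () => Promise<StartResult>;  // one startup attempt
  cleanupPort: () => Promise<void>;      // e.g. netstat + taskkill on Windows
  sleep: (ms: number) => Promise<void>;  // injected so tests need not wait
}

const MAX_ATTEMPTS = 3;

/** Delay after a failed attempt: 2000, 4000, 6000 ms for attempts 1, 2, 3. */
function backoffDelayMs(attempt: number): number {
  return attempt * 2000;
}

async function startWithRetry(deps: RetryDeps): Promise<StartResult> {
  let last: StartResult = { ok: false, error: "not attempted" };
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    last = await deps.tryStart();
    if (last.ok) return last;
    if (attempt === MAX_ATTEMPTS) break; // no cleanup or wait after the final attempt
    try {
      await deps.cleanupPort();
    } catch {
      // Cleanup failures are treated as non-fatal; the next attempt still runs.
    }
    await deps.sleep(backoffDelayMs(attempt));
  }
  return last;
}
```

On non-Windows platforms cleanupPort would simply be a no-op, which matches the statement that those platforms are unaffected.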
Original prompt
This section details the original issue you should resolve.
<issue_title>windows11 64 system</issue_title> <issue_description># Bug Report: Worker Service Fails to Start on Port 37777
🐛 Bug Description
After installing the claude-mem plugin, the mem service starts successfully when running the claude command. However, when Claude Code is running, the mem service fails to start with repeated errors indicating that the worker cannot start on port 37777.
📋 Steps to Reproduce
- Install claude-mem plugin
- Start Claude Code application
- The worker service attempts to start but fails repeatedly
- Error message appears: Worker failed to start Failed to start server. Is port 37777 in use?
- Attempting to restart with npm run worker:restart results in: Failed to restart: Process died during startup
❌ Error Messages
From Logs (Image 1):
[2025-12-15 14:22:27.630] [ERROR] [SYSTEM] ⚠️?Worker failed to start Failed to start server. Is port 37777 in use?
[2025-12-15 14:24:23.564] [ERROR] [SYSTEM] ⚠️?Worker failed to start Failed to start server. Is port 37777 in use?
[... repeated multiple times ...]
From Manual Restart Attempt (Image 2):
PS C:\Users\BAIJUN\.claude\plugins\marketplaces\thedotmack> npm run worker:restart
> claude-mem@7.2.1 worker:restart
> bun plugin/scripts/worker-cli.js restart
Failed to restart: Process died during startup
From Error Details (Image 3):
Error: Worker service failed to start on port 37777. (port 37777)
To restart the worker:
1. Exit Claude Code completely
2. Open Command Prompt or PowerShell
3. Navigate to: %USERPROFILE%\.claude\plugins\marketplaces\thedotmack
4. Run: npm run worker:restart
5. Restart Claude Code
Stack Trace:
at Y (file:///C:/Users/BAIJUN/.claude/plugins/marketplaces/thedotmack/plugin/scripts/summary-hook.js:16:2074)
at async vt (file:///C:/Users/BAIJUN/.claude/plugins/marketplaces/thedotmack/plugin/scripts/summary-hook.js:20:450)
at async Socket.<anonymous> (file:///C:/Users/BAIJUN/.claude/plugins/marketplaces/thedotmack/plugin/scripts/summary-hook.js:20:1665)
Node.js v22.15.0
✅ Expected Behavior
The worker service should start successfully on port 37777 when Claude Code is running, just as it does when running the claude command.
💻 Environment
- OS: Windows (based on path: C:\Users\BAIJUN\.claude\plugins\...)
- Plugin Path: %USERPROFILE%\.claude\plugins\marketplaces\thedotmack
- Claude-mem version: 7.2.1
- Node.js version: v22.15.0
- Platform: Claude Code
🔍 Additional Context
- Port Conflict: The error suggests port 37777 might be in use. This could indicate:
  - Another instance of the worker is already running
  - Claude Code and the standalone claude command are conflicting
  - The port is not being properly released when switching between applications
- Process Lifecycle Issue: The "Process died during startup" error suggests the worker process is terminating immediately after launch, possibly due to:
  - Configuration incompatibility with Claude Code
  - Environment variable differences between claude command and Claude Code
  - Initialization error specific to the Claude Code runtime environment
- Workaround Attempted: Following the suggested restart procedure did not resolve the issue
🔧 Suggested Investigation Areas
- Check if multiple instances of the worker are running simultaneously
- Verify port 37777 availability when Claude Code starts
- Compare environment variables between claude command and Claude Code contexts
- Review worker process startup logs for initialization errors
- Check for differences in how the plugin is loaded in Claude Code vs. CLI
📝 Logs Location
Worker logs should be available at:
- Path: %USERPROFILE%\.claude-mem\logs\worker-2025-12-15.log
Would you like me to include additional diagnostic information?</issue_description>
Comments on the Issue (you are @copilot in this section)
A comprehensive chronicle of platform-specific issues, attempted fixes, and architectural decisions.
Executive Summary
The claude-mem project has faced persistent Windows-specific issues centered around three core problems:
- Console Window Popups: Blank terminal windows appearing when spawning worker and SDK subprocess
- Zombie Socket Issues: Bun leaving TCP sockets in LISTEN state after termination on Windows
- Process Management Complexity: Platform-specific spawning logic and reliability issues
These issues have driven multiple PRs, architectural pivots, and significant debate about runtime switching (Bun → Node.js).
Timeline of Issues
Issue thedotmack/claude-mem#209: Windows Worker Startup Failures (Dec 12-13, 2025)
Problem: Worker service failed to start on Windows using PowerShell Start-Process approach.
Symptoms: ...
- Fixes thedotmack/claude-mem#324