Bash Fork Issues - Investigation Included
Description
OpenCode currently seems to have bash fork issues. I run into them frequently, and only in OpenCode. I ran a preliminary investigation with OpenCode subagents right after hitting the problem, so that the agent had an immediate record of the failures to work from; I also gave it access to the VSCode repo so it could investigate what the two apps do differently here. Below is the Markdown report OpenCode produced. It lists several issues it identified as unique to OpenCode, with proposed fixes that are meant to be translations of the VSCode approach. I have not had time to try implementing them myself yet, but I believe this is a solid starting point, if not the actual solution.
Root Cause #1: Per-Command Process Spawning
Problem: OpenCode spawns a new shell process for every bash command execution.
Current Implementation:
// packages/opencode/src/tool/bash.ts:198-206
const proc = spawn(params.command, {
  shell, // New shell EVERY command
  cwd,
  env: { ...process.env },
  stdio: ["ignore", "pipe", "pipe"],
  detached: process.platform !== "win32",
})
Impact: 100 commands mean 100+ shell processes, and every spawn incurs fork/exec overhead.
Fixes for Root Cause #1
Fix 1A: Persistent Shell Pool (PTY-based)
Implementation Location: packages/opencode/src/shell/pool.ts (NEW)
Architecture:
┌───────────────────────────────────────────────┐
│               OpenCode Session                │
├───────────────────────────────────────────────┤
│ ┌───────────────────────────────────────────┐ │
│ │           Shell Pool (1-3 PTYs)           │ │
│ │   ┌─────────┐  ┌─────────┐  ┌─────────┐   │ │
│ │   │   PTY   │  │   PTY   │  │   PTY   │   │ │
│ │   │ Session │  │ Session │  │ Session │   │ │
│ │   └────┬────┘  └────┬────┘  └────┬────┘   │ │
│ │        │            │            │        │ │
│ │        └────────────┼────────────┘        │ │
│ │                     │                     │ │
│ │             Shell Integration             │ │
│ │           (Command Completion)            │ │
│ └───────────────────────────────────────────┘ │
└───────────────────────────────────────────────┘
Implementation:
export class ShellPool {
  private pool: Map<string, ShellSession> = new Map()
  // Callers waiting for a session while the pool is at capacity
  private queue: Array<(session: ShellSession) => void> = []
  private maxSize: number = 3
  private shell: string

  async execute(command: ExecuteOptions): Promise<CommandResult> {
    const session = await this.acquire()
    try {
      // Send `cd` and the command as one line so that only one
      // command_finish event is emitted for this call.
      const needsCd = command.workdir && command.workdir !== session.cwd
      const line = needsCd
        ? `cd ${escapeShellArg(command.workdir)} && ${command.command}`
        : command.command
      session.pty.write(line + '\n')
      return await this.waitForCompletion(session)
    } finally {
      // Release on timeout too, otherwise the session stays busy forever.
      // A timed-out session may still be running its command and should
      // probably be killed and replaced instead of reused.
      this.release(session)
    }
  }

  private async acquire(): Promise<ShellSession> {
    for (const session of this.pool.values()) {
      if (session.state === 'idle') {
        session.state = 'busy' // claim it so two callers never share a PTY
        return session
      }
    }
    if (this.pool.size < this.maxSize) {
      return await this.createSession()
    }
    return new Promise((resolve) => {
      this.queue.push(resolve)
    })
  }

  private async createSession(): Promise<ShellSession> {
    const id = generateId()
    // Named `term` so it does not shadow the `pty` module
    const term = pty.spawn(this.shell, ShellIntegration.getArgs(this.shell), {
      name: 'xterm-256color',
      cols: 80,
      rows: 24,
      cwd: Instance.directory,
      env: { ...process.env, ...ShellIntegration.getEnv(this.shell) },
    })
    const session: ShellSession = {
      id,
      pty: term,
      cwd: Instance.directory,
      env: { ...process.env },
      lastUsed: Date.now(),
      state: 'busy', // created on demand, so it starts out claimed
    }
    this.setupCompletionHandler(session)
    this.pool.set(id, session)
    return session
  }

  private async waitForCompletion(session: ShellSession): Promise<CommandResult> {
    return new Promise((resolve, reject) => {
      let output = ''
      // The on/off event API depends on the PTY library; node-pty, for
      // example, exposes onData(), which returns a disposable.
      const handler = (data: string) => {
        output += data // output arrives in chunks; keep all of it
        for (const event of ShellIntegration.parseSequences(data)) {
          if (event.type === 'command_finish') {
            session.pty.off('data', handler)
            clearTimeout(timeout)
            resolve({ output, exitCode: event.data.exitCode })
            return
          }
        }
      }
      const timeout = setTimeout(() => {
        session.pty.off('data', handler)
        reject(new Error('Command timeout'))
      }, 30000)
      session.pty.on('data', handler)
    })
  }

  private release(session: ShellSession): void {
    session.lastUsed = Date.now()
    const next = this.queue.shift()
    if (next) {
      next(session) // hand over directly; the session stays busy
    } else {
      session.state = 'idle'
    }
  }
}
Expected Outcome: 97% reduction in process count (100+ → 1-3 per session)
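The `cd` line in the pool relies on an `escapeShellArg` helper that the report does not define. Below is a minimal sketch of what it could look like for POSIX shells (bash, zsh), assuming single-quote wrapping is acceptable. The name is taken from the code above, the body is an assumption, and PowerShell or cmd.exe would need different quoting rules.

```typescript
// Hypothetical helper: quote a string so a POSIX shell reads it as one literal word.
function escapeShellArg(arg: string): string {
  // Inside single quotes nothing is special except the single quote itself,
  // so close the quote, emit an escaped quote, and reopen it: ' becomes '\''
  return "'" + arg.replace(/'/g, "'\\''") + "'"
}

// Example: build the line that would be written to the PTY.
const cdLine = `cd ${escapeShellArg("/tmp/my project")}`
```

Because the path ends up inside single quotes, characters such as spaces, `$`, and backticks are not interpreted by the shell.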
Root Cause #2: No Shell Persistence
Problem: Unlike VSCode which maintains persistent PTY sessions, OpenCode creates isolated processes with no state sharing.
Comparison:
| Aspect | OpenCode | VSCode |
|---|---|---|
| Shell Lifecycle | Spawn per command | Persistent PTY |
| State | None persists | CWD, env, history persist |
| Process Count | 100+ per session | 1-3 per terminal |
Fixes for Root Cause #2
Fix 2A: Session State Manager
Implementation Location: packages/opencode/src/shell/state.ts (NEW)
Implementation:
export namespace StateManager {
  const state = Instance.state(
    () => new Map<string, SessionState>(),
    async (sessions) => {
      for (const session of sessions.values()) {
        await cleanup(session.id)
      }
      sessions.clear()
    }
  )

  export function getState(sessionID: string): SessionState {
    if (!state().has(sessionID)) {
      state().set(sessionID, {
        id: sessionID,
        cwd: Instance.directory,
        env: { ...process.env },
        history: [],
        permissions: new Set(),
        backgroundJobs: [],
        timestamp: Date.now(),
      })
    }
    return state().get(sessionID)!
  }

  export async function updateState(
    sessionID: string,
    updates: Partial<SessionState>
  ): Promise<void> {
    // validateDirectoryAccess is async: without `await` the returned Promise
    // is always truthy and the check can never reject.
    if (updates.cwd && !(await validateDirectoryAccess(sessionID, updates.cwd))) {
      throw new Permission.RejectedError(
        sessionID, 'external_directory', '', { path: updates.cwd }
      )
    }
    const current = getState(sessionID)
    const newState = {
      ...current,
      ...updates,
      // Copy the set so the previous state object is not mutated
      permissions: new Set(updates.permissions ?? current.permissions),
      timestamp: Date.now(),
    }
    if (updates.cwd) {
      newState.permissions.add(updates.cwd)
    }
    state().set(sessionID, newState)
  }

  async function validateDirectoryAccess(
    sessionID: string,
    path: string
  ): Promise<boolean> {
    const agent = await Agent.get(sessionID)
    const session = getState(sessionID) // renamed to avoid shadowing `state`
    if (agent.permission.external_directory === 'allow') return true
    if (agent.permission.external_directory === 'deny') {
      return Filesystem.contains(Instance.directory, path)
    }
    return session.permissions.has(path) || Filesystem.contains(Instance.directory, path)
  }
}
Fix 2B: Shell Integration (State Tracking)
Implementation Location: packages/opencode/src/shell/integration.ts (NEW)
Implementation:
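export namespace ShellIntegration {
  // Function definitions shared by bash and zsh. Backslashes are doubled so
  // that the shell's printf, not the template literal, interprets \033 and \007
  // (octal escapes are a syntax error inside a template literal).
  const posixFunctions = `
__opencode_command_start() { printf '\\033]133;C;\\007'; }
__opencode_command_finished() { printf '\\033]133;F;%s\\007' "$?"; }
__opencode_cwd_changed() { printf '\\033]133;D;%s\\007' "$PWD"; }`

  export function getIntegrationScript(shell: string): string {
    const scripts: Record<string, string> = {
      // A second `trap ... DEBUG` replaces the first, so the cwd is
      // reported from PROMPT_COMMAND instead.
      bash: `
if [ -n "$OPENCODE_INTEGRATION" ]; then${posixFunctions}
  PROMPT_COMMAND='__opencode_command_finished; __opencode_cwd_changed'
  trap '__opencode_command_start' DEBUG
fi`,
      zsh: `${posixFunctions}
preexec() { __opencode_command_start }
precmd() { __opencode_command_finished }
chpwd() { __opencode_cwd_changed }`,
      powershell: `
function prompt {
  $code = if ($null -ne $LASTEXITCODE) { $LASTEXITCODE } else { 0 }
  Write-Host -NoNewline "$([char]27)]133;F;$code$([char]7)"
  "PS $PWD> "
}`,
    }
    return scripts[shell] || scripts.bash
  }

  // Note: in the common OSC 133 convention, D (not F) marks command
  // completion; the letters here follow this proposal's own scheme.
  export function parseSequences(data: string): ShellEvent[] {
    const events: ShellEvent[] = []
    // Must match what the scripts emit: ESC ] <code> ; <content> BEL
    const oscPattern = /\x1b\](\d+);([^\x07]*)\x07/g
    let match
    // Do not slice `data` inside the loop: the global regex already tracks
    // its position through lastIndex.
    while ((match = oscPattern.exec(data)) !== null) {
      const [, code, content] = match
      const parts = content.split(';')
      if (parseInt(code, 10) === 133) {
        const type = parts[0]?.[0]
        if (type === 'F') {
          events.push({ type: 'command_finish', data: { exitCode: parseInt(parts[1] || '0', 10) } })
        } else if (type === 'D') {
          events.push({ type: 'cwd_changed', data: { cwd: parts.slice(1).join(';') } })
        }
      }
    }
    return events
  }
}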
Expected Outcome: Natural shell experience with persistent CWD, environment, and history.
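PTY output arrives in arbitrary chunks, so an escape sequence can be split across two `data` events, which a per-chunk regex would miss. The following is a minimal sketch of a stateful parser that buffers incomplete sequences. It assumes the `ESC ] 133 ; <letter> ; <payload> BEL` format used by the scripts above; `SequenceParser` and its event shape are hypothetical names, not existing OpenCode code.

```typescript
type ParsedEvent =
  | { type: "command_finish"; exitCode: number }
  | { type: "cwd_changed"; cwd: string }

const PREFIX = "\x1b]133;"

// Length of the longest tail of `s` that could be the start of PREFIX.
function partialPrefixLength(s: string): number {
  for (let n = Math.min(PREFIX.length - 1, s.length); n > 0; n--) {
    if (s.endsWith(PREFIX.slice(0, n))) return n
  }
  return 0
}

class SequenceParser {
  private pending = ""

  // Feed one chunk; returns the events it completed and the visible text.
  push(chunk: string): { events: ParsedEvent[]; text: string } {
    const input = this.pending + chunk
    this.pending = ""
    const events: ParsedEvent[] = []
    let text = ""
    let pos = 0
    while (pos < input.length) {
      const start = input.indexOf(PREFIX, pos)
      if (start === -1) {
        // No full prefix left; hold back a tail that may be a split prefix.
        const rest = input.slice(pos)
        const keep = partialPrefixLength(rest)
        text += rest.slice(0, rest.length - keep)
        this.pending = rest.slice(rest.length - keep)
        break
      }
      text += input.slice(pos, start)
      const end = input.indexOf("\x07", start)
      if (end === -1) {
        this.pending = input.slice(start) // incomplete: wait for next chunk
        break
      }
      const body = input.slice(start + PREFIX.length, end) // e.g. "F;0"
      const kind = body.charAt(0)
      const payload = body.slice(2)
      if (kind === "F") {
        events.push({ type: "command_finish", exitCode: parseInt(payload, 10) || 0 })
      } else if (kind === "D") {
        events.push({ type: "cwd_changed", cwd: payload })
      }
      pos = end + 1
    }
    return { events, text }
  }
}
```

A pool would keep one parser per session and append the returned `text` to the command output, so the markers never leak into what the model sees.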
Root Cause #3: Windows Git Bash Overhead
Problem: On Windows, Git Bash (MSYS2) has significant fork overhead and known DLL initialization issues.
Evidence:
0 [main] bash 1323 dofork: child -1 - forked process 65644 died unexpectedly
/usr/bin/bash: fork: retry: Resource temporarily unavailable
0 [main] bash 1323 child_copy: dll data read copy failed, Win32 error 299
Known MSYS2 Issues:
- Process forking limits on Windows
- DLL initialization failures (0xC0000142)
- Access violations (0xC0000005)
Fixes for Root Cause #3
Fix 3A: Native Shell Fallback for Windows
Implementation Location: packages/opencode/src/shell/shell.ts
Implementation:
export namespace Flag {
  export const OPENCODE_PREFER_NATIVE_SHELL = truthy("OPENCODE_PREFER_NATIVE_SHELL")
}

export const acceptable = lazy(() => {
  if (process.platform === "win32" && Flag.OPENCODE_PREFER_NATIVE_SHELL) {
    const nativeShells = ["powershell.exe", "cmd.exe"]
    for (const shell of nativeShells) {
      // Named `found` so it does not shadow the `path` module used below
      const found = Bun.which(shell)
      if (found) {
        log.info("Using native Windows shell", { shell })
        return found
      }
    }
  }
  const s = process.env.SHELL
  if (s && !BLACKLIST.has(path.win32.basename(s))) return s
  return fallback()
})
Fix 3B: Command Translation for PowerShell
Implementation:
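export namespace ShellCommand {
  // Regex-based and best-effort: it ignores quoting, pipes, and subshells.
  export function toPowerShell(command: string): string {
    return command
      // Map variables first, so the $LASTEXITCODE checks added below are
      // not rewritten to $env:LASTEXITCODE.
      .replace(/\$(\w+)/g, '$env:$1')
      .replace(/echo\s+(-n)?\s*["']?([^"'\n;&|]+)["']?/gi, 'Write-Host "$2"')
      // Keep the right-hand command and run it conditionally. A plain `;`
      // would run it unconditionally, and an empty block would drop it.
      .replace(/\s*&&\s*([^;&|]+?)\s*(?=[;&|]|$)/g, '; if ($LASTEXITCODE -eq 0) { $1 }')
      .replace(/\s*\|\|\s*([^;&|]+?)\s*(?=[;&|]|$)/g, '; if ($LASTEXITCODE -ne 0) { $1 }')
  }
}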
Expected Outcome: Avoid MSYS2 fork issues, 10-100x faster spawn times on Windows.
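Regex replacement cannot tell an operator from the same characters inside a quoted string. The sketch below splits a command line on `&&` and `||` only outside quotes and then emits `$LASTEXITCODE` checks, which preserves bash's left-to-right chain semantics for native executables. `splitChain` and `chainToPowerShell` are hypothetical names. Backslash escapes are not handled, `$LASTEXITCODE` is only set by native programs (not cmdlets), and PowerShell 7 supports `&&` and `||` natively, so this would only matter for Windows PowerShell 5.1.

```typescript
type Segment = { op: "" | "&&" | "||"; command: string }

// Split on && and || that appear outside single or double quotes.
function splitChain(line: string): Segment[] {
  const segments: Segment[] = []
  let op: Segment["op"] = ""
  let current = ""
  let quote = ""
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i)
    if (quote !== "") {
      if (ch === quote) quote = "" // closing quote
      current += ch
    } else if (ch === '"' || ch === "'") {
      quote = ch // opening quote
      current += ch
    } else if ((ch === "&" || ch === "|") && line.charAt(i + 1) === ch) {
      segments.push({ op, command: current.trim() })
      op = ch === "&" ? "&&" : "||"
      current = ""
      i++ // skip the second operator character
    } else {
      current += ch
    }
  }
  segments.push({ op, command: current.trim() })
  return segments
}

// Each step is conditional on the status of the last command that ran.
function chainToPowerShell(line: string): string {
  return splitChain(line)
    .map(({ op, command }) => {
      if (op === "&&") return `if ($LASTEXITCODE -eq 0) { ${command} }`
      if (op === "||") return `if ($LASTEXITCODE -ne 0) { ${command} }`
      return command
    })
    .join("; ")
}
```

A single pipe (`a | b`) is left untouched, because only doubled operator characters are treated as chain operators.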
Root Cause #4: No Retry Logic for Transient Errors
Problem: When fork fails with EAGAIN (errno 11 on Linux), the command fails immediately without a retry.
Current Behavior:
proc.once("error", (error) => {
  exited = true
  cleanup()
  reject(error) // No retry, immediate failure
})
Fixes for Root Cause #4
Fix 4A: Retry Logic with Exponential Backoff
Implementation Location: packages/opencode/src/tool/bash.ts
Implementation:
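const MAX_RETRIES = 5
const INITIAL_DELAY = 100
const MAX_DELAY = 5000
const BACKOFF_MULTIPLIER = 2
const RETRYABLE_ERRORS = [
  "EAGAIN", // errno 11 on Linux, 35 on macOS
  "ENOMEM",
  "0xC0000142", // Windows DLL init
  "0xC0000005", // Windows access violation
]

function isRetryableError(error: Error & { code?: string }): boolean {
  return RETRYABLE_ERRORS.some(code =>
    error.code === code ||
    error.message?.includes(code) ||
    error.name?.includes(code)
  )
}

async function spawnWithRetry(command: string, options: SpawnOptions): Promise<ChildProcess> {
  let delay = INITIAL_DELAY
  for (let attempt = 1; ; attempt++) {
    try {
      // In Node.js, spawn failures such as EAGAIN are reported through the
      // "error" event rather than thrown, so a bare try/catch around spawn()
      // never sees them. Wait for "spawn" or "error" instead (behavior under
      // Bun should be verified).
      return await new Promise<ChildProcess>((resolve, reject) => {
        const proc = spawn(command, options)
        proc.once("error", reject)
        proc.once("spawn", () => {
          proc.off("error", reject)
          resolve(proc)
        })
      })
    } catch (error: any) {
      // Give up on the last attempt or on errors that will not go away
      if (attempt >= MAX_RETRIES || !isRetryableError(error)) throw error
      log.warn("Spawn failed, retrying", { attempt, error, delay })
      await Bun.sleep(delay)
      delay = Math.min(delay * BACKOFF_MULTIPLIER, MAX_DELAY)
    }
  }
}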
Expected Outcome: Handles transient resource exhaustion, commands succeed instead of failing.
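To make the backoff schedule concrete and testable without spawning processes, the retry loop can be separated from the operation it retries. This is a sketch under the same constants as above; `withRetry`, `backoffDelays`, and the injectable `sleep` parameter are hypothetical names introduced for illustration.

```typescript
interface RetryOptions {
  maxAttempts: number
  initialDelayMs: number
  maxDelayMs: number
  multiplier: number
  isRetryable: (error: unknown) => boolean
  sleep?: (ms: number) => Promise<void> // injectable so tests need not wait
}

// Delays waited between attempts: initial, initial * multiplier, ...
// each capped at maxDelayMs. There is one delay fewer than attempts.
function backoffDelays(o: RetryOptions): number[] {
  const delays: number[] = []
  let delay = o.initialDelayMs
  for (let i = 1; i < o.maxAttempts; i++) {
    delays.push(delay)
    delay = Math.min(delay * o.multiplier, o.maxDelayMs)
  }
  return delays
}

async function withRetry<T>(operation: () => Promise<T>, o: RetryOptions): Promise<T> {
  const sleep = o.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)))
  const delays = backoffDelays(o)
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation()
    } catch (error) {
      const delay = delays[attempt]
      // Out of attempts, or an error that retrying will not fix
      if (delay === undefined || !o.isRetryable(error)) throw error
      await sleep(delay)
    }
  }
}
```

With 5 attempts, a 100 ms initial delay, and a multiplier of 2, the waits are 100, 200, 400, and 800 ms, so a command gives up after about 1.5 seconds of waiting in total.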
Root Cause #5: No Process Limits
Problem: OpenCode can spawn unlimited processes, potentially hitting OS limits.
Fixes for Root Cause #5
Fix 5A: Shell Semaphore with Queue
Implementation Location: packages/opencode/src/tool/bash.ts
Implementation:
const MAX_CONCURRENT_SHELLS =
  ({ win32: 10, darwin: 20, linux: 30 } as Record<string, number>)[process.platform] ?? 15
const SHELL_QUEUE_MAX = 50
const SHELL_QUEUE_TIMEOUT = 30000 // declared, but not enforced below

class ShellSemaphore {
  private permits: number
  private queue: Array<() => void> = []

  constructor(private maxPermits: number) {
    this.permits = maxPermits
  }

  // Number of callers currently waiting for a permit
  get waiting(): number {
    return this.queue.length
  }

  async acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits--
      return
    }
    return new Promise((resolve) => {
      this.queue.push(resolve)
    })
  }

  release(): void {
    const next = this.queue.shift()
    if (next) {
      next() // pass the permit directly to the next waiter
    } else {
      this.permits++
    }
  }
}

const shellSemaphore = new ShellSemaphore(MAX_CONCURRENT_SHELLS)

async function executeWithLimit(params: BashParams): Promise<BashResult> {
  // `this` is undefined in a plain function, so ask the semaphore instead
  if (shellSemaphore.waiting >= SHELL_QUEUE_MAX) {
    throw new Error("Shell queue full")
  }
  await shellSemaphore.acquire()
  try {
    // NOTE: to actually cap the number of live processes, the permit must be
    // held until the spawned process exits, not only until spawn resolves.
    return await spawnWithRetry(params.command, params.options)
  } finally {
    shellSemaphore.release()
  }
}
Expected Outcome: Prevents hitting OS process limits, predictable resource usage.
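The constants above include a queue limit and a queue timeout, but a waiter that never gets a permit would still block forever. The following sketch shows one way both limits could be enforced inside the semaphore itself. `BoundedSemaphore` and `run` are hypothetical names, and the error messages are placeholders.

```typescript
type Waiter = { grant: () => void; timer: ReturnType<typeof setTimeout> }

class BoundedSemaphore {
  private permits: number
  private maxQueue: number
  private timeoutMs: number
  private waiters: Waiter[] = []

  constructor(maxPermits: number, maxQueue: number, timeoutMs: number) {
    this.permits = maxPermits
    this.maxQueue = maxQueue
    this.timeoutMs = timeoutMs
  }

  acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits--
      return Promise.resolve()
    }
    if (this.waiters.length >= this.maxQueue) {
      return Promise.reject(new Error("Shell queue full"))
    }
    return new Promise<void>((resolve, reject) => {
      const waiter: Waiter = {
        grant: resolve,
        timer: setTimeout(() => {
          // Give up: drop this waiter so release() never hands it a permit.
          this.waiters = this.waiters.filter((w) => w !== waiter)
          reject(new Error("Timed out waiting for a shell slot"))
        }, this.timeoutMs),
      }
      this.waiters.push(waiter)
    })
  }

  release(): void {
    const next = this.waiters.shift()
    if (next) {
      clearTimeout(next.timer)
      next.grant() // the permit moves straight to the next waiter
    } else {
      this.permits++
    }
  }

  // Run fn while holding a permit; the permit is held until fn settles.
  async run<T>(fn: () => Promise<T>): Promise<T> {
    await this.acquire()
    try {
      return await fn()
    } finally {
      this.release()
    }
  }
}
```

If `fn` resolves only once the shell process has exited, the number of live processes stays at or below the permit count.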
Summary: Root Causes → Fixes Mapping
| Root Cause | Problem | Fixes |
|---|---|---|
| #1: Per-command spawning | 100+ processes per session | 1A: Persistent Shell Pool |
| #2: No shell persistence | No state between commands | 2A: State Manager, 2B: Shell Integration |
| #3: Windows Git Bash overhead | MSYS2 fork issues | 3A: Native Shell Fallback, 3B: Command Translation |
| #4: No retry logic | EAGAIN fails immediately | 4A: Retry Logic |
| #5: No process limits | Unlimited spawning | 5A: Shell Semaphore |