node-pty icon indicating copy to clipboard operation
node-pty copied to clipboard

node-pty hangs after PR #653 (v1.1.0-beta13) when writing multiple chunks rapidly

Open bytemain opened this issue 3 months ago • 8 comments

Note: This issue was generated with AI assistance and has been refined with problem analysis.

Description

After PR #653 ("chore: remove deprecated api process.binding"), node-pty that write multiple data chunks rapidly to a PTY process experience hanging/freezing behavior. This regression was introduced in version 1.1.0-beta13, while 1.1.0-beta12 works correctly.

Image

Root Cause Analysis

PR #653 replaced the custom PipeSocket class (which used process.binding('pipe_wrap')) with Node.js's built-in tty.ReadStream. This change affects the underlying socket behavior and backpressure handling mechanism, causing the application to hang when processing rapid sequential writes.

The key changes in the PR:

  • Removed PipeSocket class that wrapped net.Socket with forced "PIPE" handle type
  • Replaced new PipeSocket(term.fd) with new tty.ReadStream(term.fd) in both fork and open operations

Reproduction Case

Environment:

  • macOS (tested on Apple Silicon)
  • Node.js v20.19.0
  • node-pty v1.1.0-beta13

Reproduction Code:

import * as os from 'node:os';
import * as pty from 'node-pty';

const shell = os.platform() === 'win32' ? 'powershell.exe' : 'zsh';

const ptyProcess = pty.spawn(shell, ['-il'], {
  name: 'xterm-color',
  cols: 80,
  rows: 30,
  cwd: process.env.HOME,
  env: process.env
}) as any;

ptyProcess.onData((data) => {
  process.stdout.write(data);
});

ptyProcess.resize(100, 40);
/**
 * Splits incoming pty data into chunks to try prevent data corruption that could occur when pasting
 * large amounts of data.
 */
export function chunkInput(data: string): string[] {
	const chunks: string[] = [];
	let nextChunkStartIndex = 0;
	for (let i = 0; i < data.length - 1; i++) {
		if (
			// If the max chunk size is reached
			i - nextChunkStartIndex + 1 >= 50 ||
			// If the next character is ESC, send the pending data to avoid splitting the escape
			// sequence.
			data[i + 1] === '\x1b'
		) {
			chunks.push(data.substring(nextChunkStartIndex, i + 1));
			nextChunkStartIndex = i + 1;
			// Skip the next character as the chunk would be a single character
			i++;
		}
	}
	// Push final chunk
	if (nextChunkStartIndex !== data.length) {
		chunks.push(data.substring(nextChunkStartIndex));
	}
	return chunks;
}


const messages = "\"Migrate AI calls from frontend direct calls to backend API with streaming implementation\n\nMain changes:\n- Create new backend AI API interface (api/ai.ts)\n  - Implement streaming AI response endpoint /api/ai/stream\n  - Implement non-streaming AI response endpoint /api/ai/chat (compatibility)\n  - Support Server-Sent Events (SSE) streaming transmission\n  - Add comprehensive error handling and state management\n\n- Update API main file (api/index.ts)\n  - Add AI-related route configuration\n  - Integrate new AI service interfaces\n\n- Create frontend AI service layer (src/services/aiService.ts)\n  - Implement streaming AI response client\n  - Support event-driven streaming data processing\n  - Provide non-streaming API compatibility interface\n  - Add service health check functionality\n\n- Refactor frontend AI call logic (src/pages/Home.tsx)\n  - Replace direct Gemini calls with backend API calls\n  - Implement true streaming response handling\n  - Optimize user experience with real-time streaming display\n  - Maintain original typewriter effects and UI interactions\n\n- Configuration optimization\n  - Update Vite configuration to add API proxy settings\n  - Add dotenv dependency for environment variable loading\n  - Ensure proper frontend and backend environment variable configuration\n\nTechnical improvements:\n- Enhanced security: API keys used only on backend\n- Improved performance: streaming responses reduce wait time\n- Better user experience: real-time AI content display\n- Higher maintainability: separation of frontend and backend responsibilities\"";

const commands = chunkInput(messages);
commands.forEach((command) => {
  ptyProcess.write(command);
});

Steps to Reproduce:

  1. Install node-pty v1.1.0-beta13
  2. Run the above code
  3. The application will hang during the rapid forEach write operations

Expected Behavior:

  • All chunks should be written successfully without hanging (as in v1.1.0-beta12)

Actual Behavior:

  • Application hangs/freezes during rapid sequential writes
  • Process becomes unresponsive and requires manual termination

Version Comparison

  • v1.1.0-beta12: Works correctly, no hanging
  • v1.1.0-beta13: Hangs during rapid writes (after PR #653)

bytemain avatar Aug 28 '25 13:08 bytemain

Image

When using Instruments to analyze the cpu profiler, it shows it hangs on kernel write

bytemain avatar Aug 28 '25 13:08 bytemain

After revert this pr https://github.com/microsoft/node-pty/pull/653, this problem fixed

bytemain avatar Sep 05 '25 02:09 bytemain

@Tyriar Would you please to look at this issue?

bytemain avatar Sep 05 '25 02:09 bytemain

fyi we just hit this same issue and confirmed rolling back to v1.1.0-beta12 fixes it

devm33 avatar Sep 17 '25 18:09 devm33

fyi opened https://github.com/microsoft/node-pty/pull/800 just to do the clean revert

devm33 avatar Sep 18 '25 06:09 devm33

fyi I ended up working around this by throttling the writing to the terminal process. Even with the revert in #800 I still saw corruption of the terminal input -- just not the main thread hang. It looks like the tty.ReadStream: https://github.com/microsoft/node-pty/blob/24b53bbf573ed7ab35f8897aa8b690dbd9064fc9/src/unixTerminal.ts#L207 https://github.com/microsoft/node-pty/blob/24b53bbf573ed7ab35f8897aa8b690dbd9064fc9/src/unixTerminal.ts#L219 that is used to write to write to: https://github.com/microsoft/node-pty/blob/24b53bbf573ed7ab35f8897aa8b690dbd9064fc9/src/unixTerminal.ts#L177-L178 does not emit the 'drain' event (at least as far as I can tell)

devm33 avatar Sep 18 '25 21:09 devm33

Could this potentially manifest as the following error in VS Code?

The connection to the terminal's pty host process is unresponsive, the terminals may stop working.

I'm getting this message every time I try to remote into HPC clusters (3 separate ones!) but not regular workstations. Manually restarting the pty host doesn't work to get the integrated terminal working.

I'm hoping this issue is the root cause.

raj-magesh avatar Sep 22 '25 20:09 raj-magesh

Could this potentially manifest as the following error in VS Code?

The connection to the terminal's pty host process is unresponsive, the terminals may stop working.

I'm getting this message every time I try to remote into HPC clusters (3 separate ones!) but not regular workstations. Manually restarting the pty host doesn't work to get the integrated terminal working.

I'm hoping this issue is the root cause.

Yes, this will cause the pty host unresponsable

bytemain avatar Sep 24 '25 06:09 bytemain

seems already fixed by https://github.com/microsoft/node-pty/pull/831

very appreciate all your team's work

bytemain avatar Dec 13 '25 17:12 bytemain

Thanks for helping spot the duplicate!

Tyriar avatar Dec 13 '25 18:12 Tyriar