node-pty icon indicating copy to clipboard operation
node-pty copied to clipboard

provide a way to control argv[0] of the spawned process

Open mirabilos opened this issue 4 years ago • 6 comments

To properly fix https://github.com/microsoft/vscode/issues/123332 node-pty must provide a way to control argv[0] of the spawned process, that is, instead of…

const ptyProcess = (await import('node-pty')).spawn("/bin/sh", ["--login"], options);

… which internally does an…

execv("/bin/sh", "/bin/sh", "--login", NULL);

… vscode must do something like…

const ptyProcess = (await import('node-pty')).spawna("/bin/sh", "-sh", [], options);
// or
const ptyProcess = (await import('node-pty')).spawna("/bin/sh", ["-sh"], options);

… which internally does an…

execv("/bin/sh", "-sh", NULL);

… so a new API will be needed, unless this can be put into options somehow, perhaps…

const ptyProcess = (await import('node-pty')).spawna("/bin/sh", [], {"argv0": "-sh"});

mirabilos avatar May 07 '21 23:05 mirabilos

@mirabilos Why would it be needed to treat this separately, why not just use existing API:

pty.spawn('/bin/sh', ['-sh', ...], ...)

jerch avatar May 08 '21 07:05 jerch

jerch dixit:

@.*** Why would it be needed to treat this separately, why not just use existing API:

pty.spawn('/bin/sh', ['-sh', ...], ...)

AIUI in the existing API, the first element of the args array corresponds to argv[1], not argv[0]. At least from reading one of the testcases in this repo this is what I think is the case. So, it must be a new API because changing this would break all existing users.

mirabilos avatar May 08 '21 19:05 mirabilos

@mirabilos Thx, found the piece of code, where it gets copied over as argv[0]. Is that '-' convention of login specced somewhere (maybe in POSIX?) or at least known to work in all major POSIX systems with all shells? Do you have some more insights here, e.g. how does a non shell executable deal with the '-'? How does it get "translated" for the new executable, will it just see the '-' as first char in argv and do additional things on its own (soft convention)? Or are there more changes done from kernel side to the process spawning like implicit **environ changes or switched uid/gid rules or pid/pgid things (hard convention with behavioral changes in outer interfaces)?

API-wise I see 2 possible ways to support it:

  • a boolean flag markAsLoginShell in options, which adds the '-' to argv[0], if it is meant as login shell (smooth upgrade path, but less powerful and semantically slightly wrong if we dont have a shell at all)
  • extend argv to always contain argv[0] explicitly (API breakage, thus works only as major change, more powerful in the long run as it feeds things directly to execvp also allowing for weird interpreter/script execution setups)

jerch avatar May 09 '21 16:05 jerch

Hi jerch,

@.*** Thx, found the piece of code, where it gets copied over as @.***[0]. Is that '-' convention of login specced somewhere (maybe in

@POSIX?) or at least known to work in all major POSIX systems with all @.***?

I don’t have the standard or place in it handy, but I do know that this is the normal behaviour of login, ssh¹, xterm² etc. when they want to make a login shell.

① search for “this is a login shell” in http://cvsweb.openbsd.org/src/usr.bin/ssh/session.c?rev=1.328 ② search for “shname_minus” in http://www.mirbsd.org/cvs.cgi/X11/xc/programs/xterm/main.c?rev=1.5

Do you have some more insights here, e.g. how does a non shell @.*** deal with the '-'?

Most executables don’t inspect argv[0] at all. Multi-call binaries do; these like busybox usually have entries for both "sh" and "-sh" (usually the basename only is inspected). Other multi-call binaries do NOT have special handling for a leading hyphen-minus, for example, I know of a true/false binary that exits 0 if argv[0][0] is ‘t’, 1 if it is ‘f’, so adding this for all executions is not correct. It also should not be added to a shell that invokes a command (like sh -c "echo 123").

How does it get "translated" for the new @., will it just see the '-' as first char in argv and do @. things on its own?

Exactly. In most invocations, argv[0] will just be the name of the program invoked, which can be the shell (often with full pathname, e.g. "/bin/sh") or a script (often relative, "./foo.sh"). Login shells will generally get "-/bin/sh" or "-sh" instead, depending on the caller.

Search for “determine the basename” in http://www.mirbsd.org/cvs.cgi/src/bin/mksh/main.c?rev=1.377 if you’re interested in seeing how mksh does this (basically, if argv[0] and/or basename(argv[0]) begin with a hyphen-minus, FLOGIN is set).

API-wise I see 2 possible ways to support it:

  • a boolean flag loginShell in options, which adds the '-' to argv[0], if it is meant as login shell (smooth upgrade path, but less powerful and semantically slightly wrong if we dont have a shell at all)
  • extend argv to always contain arvg[0] explicitly (API breakage, thus works only as major change, more powerful in the long run as it feeds things directly to execvp also allowing for weird interpreter/script execution)

I see two more, the first of them preferred IMHO:

  • extend opts with a string option that can set argv[0]; continue to use the program path if this option is not passed
  • add a new API, e.g. spawna(), whose argv contains argv[0] explicitly

Both won’t break any existing code.

bye, //mirabilos

„Cool, /usr/share/doc/mksh/examples/uhr.gz ist ja ein Grund, mksh auf jedem System zu installieren.“ -- XTaran auf der OpenRheinRuhr, ganz begeistert (EN: “[…]uhr.gz is a reason to install mksh on every system.”)

mirabilos avatar May 09 '21 20:05 mirabilos

Looking at child_process its API supports an optional argv0 setting in options, which defaults to the given command if not set. So yeah this might be the best way to deal with it:

interface IPtyForkOptions {
  ...
  argv0: undefined | string;
}

jerch avatar May 10 '21 07:05 jerch

jerch dixit:

Looking at child_process its API supports an optional argv0 setting in options:

interface IPtyForkOptions {
 ...
 argv0: undefined | string;
}

which defaults to the given command if not set.

Oh, so this is already there. I hadn’t seen it, but then, I only quickly searched through the code to spot the approximate place.

So yeah this might be the best way to deal with it.

If this works, then, indeed.

Thanks, //mirabilos

FWIW, I'm quite impressed with mksh interactively. I thought it was much much more bare bones. But it turns out it beats the living hell out of ksh93 in that respect. I'd even consider it for my daily use if I hadn't wasted half my life on my zsh setup. :-) -- Frank Terbeck in #!/bin/mksh

mirabilos avatar May 10 '21 16:05 mirabilos