spago icon indicating copy to clipboard operation
spago copied to clipboard

When `spago run`, if the target cli app manipulates console interface it may not terminate properly

Open edwardw opened this issue 2 years ago • 34 comments

Just use the example from purescript-node-readline repo:

module Main where

import Prelude

import Effect (Effect)
import Effect.Console (log)
import Node.EventEmitter (on_)
import Node.ReadLine (close, createConsoleInterface, lineH, noCompletion, prompt, setPrompt)

main :: Effect Unit
main = do
  interface <- createConsoleInterface noCompletion
  setPrompt "(type 'quit' to stop)\n> " interface
  prompt interface
  interface # on_ lineH \s ->
    if s == "quit" then close interface
    else do
      log $ "You typed: " <> s
      prompt interface

If spago bundle then node index.js, after typing quit the app terminates. But if spago run directly, it hangs. It terminates only after pressing ctrl-d. With spago run --verbose, it can be deduced that the hang must have happened here: src/Spago/Cmd.purs#L59.

This happened with spago version 0.93.9. I also checkout out this repo and built a fresh version from head. It also happened there.

edwardw avatar Oct 07 '23 12:10 edwardw

Also dug a little bit around node documentation. Must be some reason that there exist three flavors of stream piping API stream.pipe, stream.pipeline and the experimental stream.compose. But better leave that to you node experts.

edwardw avatar Oct 07 '23 12:10 edwardw

Hm... This is probably a bug with node-execa. Spago is calling subprocess.stdin.shareParentProcessStdin, so the parent stdin is being shared with the subprocess stdin.

JordanMartinez avatar Oct 07 '23 13:10 JordanMartinez

We're not using the latest execa version - @JordanMartinez do you think that "just" upgrading to latest might fix this?

f-f avatar Oct 07 '23 16:10 f-f

I'm not sure. I'll have to see what goes on in execa.

JordanMartinez avatar Oct 09 '23 16:10 JordanMartinez

So, after I updated Test/Prelude.purs' spago' to use spawn rather than exec, I added the following test. Note: the node-readline code below is v8.1.0, not v7.0.0 in the opening thread.:


    Spec.itOnly "can run interactive programs" \{ spago, spago' } -> do
      spago [ "init" ] >>= shouldBeSuccess
      spago [ "install", "node-readline" ] >>= shouldBeSuccess
      FS.writeTextFile (Path.concat [ "src", "Main.purs" ]) $ Array.intercalate "\n"
        [ "module Main where"
        , ""
        , "import Prelude"
        , ""
        , "import Effect (Effect)"
        , "import Effect.Console (log)"
        , "import Node.EventEmitter (on_)"
        , "import Node.ReadLine (close, createConsoleInterface, closeH, lineH, noCompletion, prompt, setPrompt)"
        , ""
        , "main :: Effect Unit"
        , "main = do"
        , "  interface <- createConsoleInterface noCompletion"
        , "  setPrompt \"(type 'quit' to stop)\\n> \" interface"
        , "  prompt interface"
        , "  interface # on_ closeH do"
        , "    log \"interface was closed\""
        , "  interface # on_ lineH \\s -> do"
        , "    log $ \"You typed: \" <> show s"
        , "    if s == \"quit\" then do"
        , "      log $ \"calling rl.close()\""
        , "      close interface"
        , "      log $ \"called rl.close()\""
        , "    else do"
        , "      log $ \"Looping\""
        , "      prompt interface"
        ]
      spagoProcess <- spago' StdinNewPipe [ "run" ]
      let delay200Ms = Aff.delay $ Milliseconds 200.0
      delay200Ms
      spagoProcess.stdin.writeUtf8 "foo\n"
      delay200Ms
      spagoProcess.stdin.writeUtf8 "bar\n"
      delay200Ms
      spagoProcess.stdin.writeUtf8 "quit\n"
      delay200Ms
      spagoProcess.stdin.writeUtf8End "quit\n"
      delay200Ms
      spagoProcess.result >>= shouldBeSuccess

which outputs the following:

✅ Selecting package to build: 7368613235362d376c5766502b59557a4c3459473333656f2b

Downloading dependencies...
Building...
[1 of 2] Compiling Main
[2 of 2] Compiling Test.Main
           Src   Lib   All
Warnings     0     0     0
Errors       0     0     0

✅ Build succeeded.

(type 'quit' to stop)
> You typed: "foo"
Looping
(type 'quit' to stop)
> You typed: "bar"
Looping
(type 'quit' to stop)
> You typed: "quit"
calling rl.close()
interface was closed
called rl.close()
You typed: "quit"
calling rl.close()
called rl.close()
STDOUT:
(type 'quit' to stop)
> You typed: "foo"
Looping
(type 'quit' to stop)
> You typed: "bar"
Looping
(type 'quit' to stop)
> You typed: "quit"
calling rl.close()
interface was closed
called rl.close()
You typed: "quit"
calling rl.close()
called rl.close()

I noticed that the program hangs if I don't call stream.stdin.end (or the writeUf8End part above). So, maybe we need to call stream.stdin.end after detecting that CTRL+D was pressed?

JordanMartinez avatar Oct 14 '23 21:10 JordanMartinez

After more investigation (and re-reading the opening thread), the issue here seems to be in node-readline. For whatever reason (and as shown above), calling close interface isn't actually closing the interface.

Also, from the Node docs:

Calling rl.close() does not immediately stop other events (including 'line') from being emitted by the readline.Interface instance.

JordanMartinez avatar Oct 14 '23 22:10 JordanMartinez

This might be it. At the very bottom of readline.createInterface(options):

When creating a readline.Interface using stdin as input, the program will not terminate until it receives an EOF character. To exit without waiting for user input, call process.stdin.unref().

JordanMartinez avatar Oct 14 '23 23:10 JordanMartinez

That makes sense, but I still don't get why it behaves differently when run directly with Node vs running it through Spago - it can't just be readline, it sounds like there is something that we are missing in how we set up the streams?

f-f avatar Oct 14 '23 23:10 f-f

Yeah, that's what I'm trying to figure out myself...

JordanMartinez avatar Oct 15 '23 00:10 JordanMartinez

There's seemingly a difference between

Terminal
 |
 +- node run.js

and

Terminal
 |
 +- node spago.js
    |
    +- node run.js

JordanMartinez avatar Oct 15 '23 00:10 JordanMartinez

Regardless, the program written in the opening thread should not need to be changed in order for spago run to terminate once the first quit line is entered. If it works via bundling and node, then it should work via spago run.

So far, my experimentation has shown that the program only terminates if we end the child process' stdin. However, I don't see how spago could know when to end the child process stdin without knowing about the program being run itself.

AFAIK, the child process does not terminate because something is keeping the event loop alive, and that something appears to be the lack of an EOF sent to the child process' stdin in this unique situation where readline is using stdin as its input stream.

Maybe this changes if we run the child process directly via a shell? Otherwise, this might be one of those quirky Node bugs.

On a different note, I recall James Brock hitting something similar with node-streams. Something about stdin never actually emitting an end event. Unfortunately, I can't figure out where that was discussed.

JordanMartinez avatar Oct 15 '23 01:10 JordanMartinez

I'd suggest using one of these three node libraries to see what's causing the child process to stay alive, but they're all written via CJS modules.

JordanMartinez avatar Oct 15 '23 01:10 JordanMartinez

cc @jamesdbrock hopefully this is the same issue you had 😄

f-f avatar Oct 15 '23 07:10 f-f

Tried why-is-node-running:

"use strict"

import { createRequire } from 'module'
const require = createRequire(import.meta.url)
const whyRunningJS = require('why-is-node-running')

export const whyRunningLog = function() {
  whyRunningJS()
}

and

module WhyIsNodeRunning where

import Prelude
import Effect (Effect)

foreign import whyRunningLog :: Effect Unit

Inserting the line whyRunningLog into the original program after close interface, the result is a little bit surprising. If bundling then node index.js, it reports two TickObjects (what's that?):

(type 'quit' to stop)
> quit
There are 2 handle(s) keeping the process running

# TickObject
node:internal/async_hooks:202
node:internal/async_hooks:504
node:internal/process/task_queues:132
node:internal/streams/writable:477
node:internal/stream_base_commons:163
node:internal/stream_base_commons:151
node:internal/streams/writable:392

# TickObject
node:internal/async_hooks:202
node:internal/async_hooks:504
node:internal/process/task_queues:132
node:internal/bootstrap/switches/is_main_thread:259
node:internal/streams/readable:1023
node:internal/readline/interface:543
node:internal/readline/interface:529

and yet, it terminates normally. If spago run directly, there is only one TickObject and it hangs:

(type 'quit' to stop)
> quit
There are 1 handle(s) keeping the process running

# TickObject
node:internal/async_hooks:202
node:internal/async_hooks:504
node:internal/process/task_queues:132
node:internal/bootstrap/switches/is_main_thread:259
node:internal/streams/readable:1023
node:internal/readline/interface:543
node:internal/readline/interface:529

edwardw avatar Oct 15 '23 11:10 edwardw

@edwardw I suspect it's the Node instance in the middle that it's hanging (this one that we launch to run the script), while the child node always terminates properly

f-f avatar Oct 15 '23 11:10 f-f

@edwardw I suspect it's the Node instance in the middle that it's hanging (this one that we launch to run the script), while the child node always terminates properly

Makes sense, that explains that it also sees ctrl-d and then exits gracefully.

edwardw avatar Oct 15 '23 11:10 edwardw

Maybe the hang is caused by execa's _.result waiting for something to finish, and so the resulting Aff never returns?

JordanMartinez avatar Oct 16 '23 09:10 JordanMartinez

But the inner node instance should terminate when asked, no? So at that point the middle one (spawned with execa) should return too. It's as if we need a

cp.on('exit', (code) => {
    process.exit(code);
});

f-f avatar Oct 16 '23 09:10 f-f

My point is that Aff doesn't return until you tell it to. So, if I'm using Aff incorrectly at some point, the child process could have terminated but the corresponding Aff callback isn't called, thereby making the child process appear to hang.

Perhaps the example code should add more listeners to verify whether the child process is terminating or not. And we could add listeners to the child process itself to see what kind of events are being emitted.

JordanMartinez avatar Oct 16 '23 09:10 JordanMartinez

I added listeners to the child process, and can guarantee that it's not emitting an event after it initially spawns once I type 'quit'.

JordanMartinez avatar Oct 16 '23 12:10 JordanMartinez

Found the solution. We need to use inherit in the child process' stdio values when calling spago run. Skip the next paragraph to see how I figured that out.

However... In the execa version currently used in Spago, that's not possible. Moreover, if you updated execa and the Node libraries, we'd have to use the unsafe child process bindings as the safe child bindings assume the stdio value is ["pipe", "pipe", "pipe"]. I'll have to think about how that unsafeness should propagate up into execa.

I first thought we should verify that a node process (like spago) starting another node process (like the program) should or shouldn't terminate when we type 'quit'. This eliminates node-execa and Aff-usage as a possible source of issues.

I bundled the following program via spago to a file called index.js:

module Main where

import Prelude

import Effect (Effect)
import Effect.Console (log)
import Effect.Ref as Ref
import Node.EventEmitter (on_)
import Node.ReadLine (close, createConsoleInterface, closeH, lineH, noCompletion, prompt, setPrompt)

main :: Effect Unit
main = do
  interface <- createConsoleInterface noCompletion
  setPrompt "(type 'quit' to stop)\n> " interface
  prompt interface
  countRef <- Ref.new 0
  interface # on_ closeH do
    log "interface was closed"
  interface # on_ lineH \s -> do
    log $ "You typed: " <> show s
    count <- Ref.read countRef
    case s, count of
      "quit", 0 -> do
        Ref.write 1 countRef
        log $ "calling rl.close()"
        close interface
        log $ "called rl.close()"
      "quit", _ -> do
        log "This should not be printed"
      _, _ -> do
        log $ "Looping"
        prompt interface

I then wrote the following JS program called likeSpago.mjs:

import process from "node:process";
import cp from "node:child_process";

const stdioVal = "pipe";

const subprocess = cp.spawn("node", [ "index.js"], {
  stdio: [stdioVal, stdioVal, stdioVal]
});
subprocess.on("spawn", () => console.log("spawned"));
subprocess.on("exit", () => console.log("exited"));
subprocess.on("closed", () => console.log("closed"));

process.stdin.pipe(subprocess.stdin);
subprocess.stdout.pipe(process.stdout);

I then called node likeSpago.mjs. When stdioVal is pipe, we get the same behavior as above. But when it's inherit, whether or not we remove the stdin/stdout piping code at the bottom, node likeSpago.mjs terminates.

JordanMartinez avatar Oct 16 '23 12:10 JordanMartinez

Man, that's some great digging 😄

I gather the next step is to patch execa to allow this without going the unsafe route?

f-f avatar Oct 16 '23 13:10 f-f

Mm... I'm not sure. The 'unsafeness' is the fact that certain properties don't exist on the child process depending on what values get passed to stdio. Thus, either the type signature of ChildProcess.stdin is forall stdinValue. HasStdin stdinValue => ChildProcess stdinValue -> Readable () or ChildProcess -> Maybe (Readable ()).

Some of this I need to verify personally because I've assumed that writing content to a child process' stdin if the value for that stream was inherit will result in a runtime error (or silently discard the content, leaving the programmer confused as to why nothing was written).

JordanMartinez avatar Oct 16 '23 14:10 JordanMartinez

What I found out yesterday is that as soon as one puts node process in a pipe, its stdin changes from Readable stream to a Socket: just compare node -p "process.stdin" and echo "" | node -p "process.stdin". And there is a related node ticket about no close event emitted when that happens: https://github.com/nodejs/node/issues/25131. There is even a recently resurrected PR trying to fix that. Tried it but it doesn't really fix the problem for spago.

Could be relevant.

edwardw avatar Oct 16 '23 15:10 edwardw

A workaround:

       close interface
+      Stream.destroy Process.stdin

A previous comment mentioned that there is a missing TickObject if spago run directly. That object ultimately leads to streams/writable.js#L760, which is why I propose this workaround. Not pretty, and I'm not sure how practical to apply it in a large code base, but at least it works in this toy example.

edwardw avatar Oct 16 '23 20:10 edwardw

The "fix" is most unusual. I was trying to attack this again by adding this new test to node-execa package:

diff --git a/test/Test/Node/Library/Execa.purs b/test/Test/Node/Library/Execa.purs
index e508ff1..72d369c 100644
--- a/test/Test/Node/Library/Execa.purs
+++ b/test/Test/Node/Library/Execa.purs
@@ -88,6 +88,18 @@ spec = describe "execa" do
             result.timedOut `shouldEqual` true
           _ ->
             fail $ "Timeout should work: " <> show result
+  it "can run interactive programs" do
+    let
+      interactiveJSFile = Path.concat [ "test", "fixtures", "interactive.js" ]
+    spawned <- execa "node" [ interactiveJSFile ]
+      (_ { timeout = Just { milliseconds: Milliseconds 400.0, killSignal: stringSignal "SIGTERM" } })
+    spawned.stdin.writeUtf8 "node\n"
+    spawned.stdin.writeUtf8 "quit\n"
+    result <- spawned.getResult
+    case result.exit of
+      Normally 0 -> result.stdout `shouldContain` "You typed: node"
+      _ ->
+        fail $ (if result.timedOut then "Should not timeout: " else "Other error: ") <> show result
   describe "execaSync" do
     describe "`cat` tests" do
       it "input is file" do

and this js file:

const readline = require('node:readline')
const { stdin: input, stdout: output } = require('node:process')

const rl = readline.createInterface({ input, output })

rl.setPrompt('(type "quit" to stop)\n> ')
rl.prompt()

rl.on('line', (line) => {
  if (line === 'quit') {
    rl.close()
  } else {
    console.log(`You typed: ${line}`)
    rl.prompt()
  }
})

It failed as expected. BUT the following will succeed even being run piped:

...
const rl = readline.createInterface({ input, output })
rl.input._readableState.highWaterMark = 0
...

It has to do when running piped, the underlying Socket stream is buffered by default. It is very ugly hack, should we add it to the readline package?

edwardw avatar Oct 31 '23 23:10 edwardw

It is very ugly hack, should we add it to the readline package?

Probably not. We've already identified a non-hacky fix; why consider this hacky idea? I'm not seeing the advantage.

JordanMartinez avatar Oct 31 '23 23:10 JordanMartinez

It is very ugly hack, should we add it to the readline package?

Probably not. We've already identified a non-hacky fix; why consider this hacky idea? I'm not seeing the advantage.

That's fair. But it at least shows that the purescript side is not necessary at fault here. Instead, the implementation detail of the node side leaks which leads to observable behavior difference from outside.

edwardw avatar Oct 31 '23 23:10 edwardw

[email protected] now allows the stdio array to be configured. So, it allows us to use inherit when calling spago run rather than pipe.

JordanMartinez avatar Nov 14 '23 13:11 JordanMartinez

Blocked by #1122.

JordanMartinez avatar Nov 17 '23 17:11 JordanMartinez