childprocess icon indicating copy to clipboard operation
childprocess copied to clipboard

Add backend based on Process.spawn

Open eregon opened this issue 4 years ago • 4 comments

From https://github.com/enkessler/childprocess/issues/172

eregon avatar Feb 15 '21 20:02 eregon

Coverage Status

Coverage increased (+2.4%) to 92.727% when pulling 0eff297e3af25449c847716f7b0ac0c7aaebbc49 on eregon:process-spawn into 44227922488765ebad0c0bed0fbec586ef9f5c26 on enkessler:master.

coveralls avatar Feb 15 '21 20:02 coveralls

TravisCI passes, except on JRuby: https://travis-ci.org/github/enkessler/childprocess/builds/759130946

eregon avatar Feb 16 '21 12:02 eregon

TravisCI now passes with the latest JRuby, as good as master + actually it also works on ruby-head: https://travis-ci.org/github/enkessler/childprocess/builds/759183880 vs https://travis-ci.org/github/enkessler/childprocess/builds/759135035

eregon avatar Feb 16 '21 12:02 eregon

This is almost ready now. CI: https://github.com/eregon/childprocess/runs/2869763551?check_suite_focus=true Linux & macOS pass.

Windows has only 2 failures:

Failures:

  1) ChildProcess lets a detached child live on
     Failure/Error: expect(alive?(p_pid)).to_not be true

       expected not #<TrueClass:20> => true
                got #<TrueClass:20> => true

       Compared using equal?, which compares object identity.
     # ./spec/childprocess_spec.rb:213:in `block (2 levels) in <top (required)>'

  2) ChildProcess kills the full process tree
     Failure/Error: raise msg

     RuntimeError:
       timed out after 3 seconds:

       expected: false
            got: true

       (compared using eql?)

       Diff:
       @@ -1 +1 @@
       -false
       +true
     # ./spec/spec_helper.rb:197:in `wait_until'
     # ./spec/childprocess_spec.rb:291:in `block (3 levels) in <top (required)>'
     # ./spec/childprocess_spec.rb:281:in `block (2 levels) in <top (required)>'

Finished in 22.11 seconds (files took 0.83769 seconds to load)
66 examples, 2 failures

The first one might be some issue with Process.detach on Windows. The second is the issue that there is no builtin way to kill a process group on Windows.

JRuby on Linux & macOS has only 1 failure:

   1) ChildProcess can unset env vars
     Failure/Error: expect(child_env).to_not have_key('CHILDPROCESS_UNSET')
       expected `{"ACCEPT_EULA"=>"Y", "AGENT_TOOLSDIRECTORY"=>"/opt/hostedtoolcache", "ANDROID_HOME"=>"/usr/local/lib/...", "XDG_CONFIG_HOME"=>"/home/runner/.config", "_"=>"/home/runner/.rubies/jruby-9.2.19.0/bin/bundle"}.has_key?("CHILDPROCESS_UNSET")` to be falsey, got true
     # ./spec/childprocess_spec.rb:161:in `block in <main>'
     # ./spec/childprocess_spec.rb:150:in `block in <main>'

Which is most likely a bug of JRuby (since that works fine on CRuby).

JRuby on Windows doesn't seem to work well, due to another JRuby bug, Process.waitpid2 seems to return nil for the status, which must be a JRuby bug.

eregon avatar Jun 20 '21 17:06 eregon

Update: CI is green on Linux and macOS. https://github.com/eregon/childprocess/actions/runs/4235205251/jobs/7358538501

On Windows there is a single failure:

   1) ChildProcess kills the full process tree
     Failure/Error: raise msg

     RuntimeError:
       timed out after 3 seconds:

       expected: false
            got: true

       (compared using eql?)

       Diff:
       @@ -1 +1 @@
       -false
       +true
     # ./spec/spec_helper.rb:197:in `wait_until'
     # ./spec/childprocess_spec.rb:291:in `block (3 levels) in <top (required)>'
     # ./spec/childprocess_spec.rb:281:in `block (2 levels) in <top (required)>'

Note that keeping the original code does not solve this. Instead, the original code for Windows seems to have a permission issue, at least in GitHub Actions: https://github.com/eregon/childprocess/actions/runs/4235173639/jobs/7358466488


  1) ChildProcess kills the full process tree
     Failure/Error: ok or raise LaunchError, Lib.last_error_message

     ChildProcess::LaunchError:
       Access is denied. (5)
     # ./lib/childprocess/windows/process_builder.rb:89:in `create_process'
     # ./lib/childprocess/windows/process_builder.rb:34:in `start'
     # ./lib/childprocess/windows/process.rb:70:in `launch_process'
     # ./lib/childprocess/abstract_process.rb:81:in `start'
     # ./spec/childprocess_spec.rb:284:in `block (3 levels) in <top (required)>'
     # ./spec/childprocess_spec.rb:281:in `block (2 levels) in <top (required)>'

Maybe https://stackoverflow.com/a/61113184/388803 is an option.

JRuby on Windows is so broken it can't even load rspec, so I ignore that for now.

I gave it a try to keep the existing logic for JRuby and Windows backends, but that seems to only results in more test failures, so I don't see the point: https://github.com/eregon/childprocess/tree/process-spawn-non-windows https://github.com/eregon/childprocess/actions/runs/4235173639

eregon avatar Feb 21 '23 17:02 eregon

Given the latest release is 4.1.0, I think this change would deserve its own major version, i.e. 5.0.0.

eregon avatar Dec 13 '23 13:12 eregon

Thank you so much on your work on this! :) I don't how many are out there, but I like the interfaces provided by this gem, and depend on it for my private tooling.

kaspergrubbe avatar Dec 19 '23 12:12 kaspergrubbe

This would also solve issues like https://github.com/enkessler/childprocess/issues/186?

With this branch, I was able to use childprocess in the docker image ghcr.io/graalvm/truffleruby-community:23.1.0-debian on my Apple silicon Mac where CHILDPROCESS_POSIX_SPAWN didn't get it done ("posix_spawn is not yet supported on aarch64-linux (aarch64-linux), falling back to default implementation")

dentarg avatar Dec 29 '23 02:12 dentarg

I found a difference between this PR and the latest release: keys in the environment hash now needs to be strings

From docker run -it --rm -v $(pwd):/app -w /app ruby:3.2.2 bash

root@14303433afe8:/app# cat minimal_repro.rb
require "bundler/inline"

gemfile do
  if ENV.key?("PR")
    gem "childprocess", github: "https://github.com/enkessler/childprocess/pull/175"
  else
    source "https://rubygems.org"
    gem "childprocess"
    gem "ffi" if ENV.key?("CHILDPROCESS_POSIX_SPAWN")
  end
end

process = ChildProcess.build("ruby", "--help")
process.environment.merge!({ FOO: "123" })
process.start
root@14303433afe8:/app# ruby minimal_repro.rb
root@14303433afe8:/app# PR=1 ruby minimal_repro.rb
/usr/local/bundle/bundler/gems/childprocess-f7471c8a90c1/lib/childprocess/process_spawn_process.rb:73:in `spawn': no implicit conversion of Symbol into String (TypeError)

        @pid = ::Process.spawn(@environment, *args, options)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	from /usr/local/bundle/bundler/gems/childprocess-f7471c8a90c1/lib/childprocess/process_spawn_process.rb:73:in `launch_process'
	from /usr/local/bundle/bundler/gems/childprocess-f7471c8a90c1/lib/childprocess/abstract_process.rb:81:in `start'
	from minimal_repro.rb:14:in `<main>'

dentarg avatar Jan 04 '24 16:01 dentarg

Actually, that might not be a difference between this PR and the latest release. It is just that this PR is the only way for me to use childprocess with truffleruby and aarch64. I suspect the error above happens with the latest release and CHILDPROCESS_POSIX_SPAWN.

dentarg avatar Jan 04 '24 16:01 dentarg

I suspect the error above happens with the latest release and CHILDPROCESS_POSIX_SPAWN.

No, works fine

$ docker run --rm -it -v $(pwd):/app -w /app ruby:3.2.2 bash
root@da4ec93e9332:/app# CHILDPROCESS_POSIX_SPAWN=1 ruby minimal_repro.rb
root@da4ec93e9332:/app# PR=1 ruby minimal_repro.rb
root@da4ec93e9332:/app# ruby -v
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]

dentarg avatar Jan 04 '24 16:01 dentarg

I'll try to rebase this soon. That missing coercion can easily be solved, although it would be nice to have a test for it.

eregon avatar Jan 04 '24 16:01 eregon

Rebased, and I added a test for Symbol keys and values in process.environment (+ fixed it using the same logic as before).

eregon avatar Jan 04 '24 17:01 eregon

I'm tweaking a bit the CI and self-reviewing, I'll write here when it's ready for review (and also request via GitHub).

eregon avatar Jan 04 '24 17:01 eregon

I suspect the error above happens with the latest release and CHILDPROCESS_POSIX_SPAWN

It doesn't blow up, but childprocess 4.1.0 has a similar problem with CHILDPROCESS_POSIX_SPAWN on TruffleRuby/JRuby, if you pass symbol keys in the env, they wont be used if the ENV already has that key (on CRuby the value you pass is used). https://github.com/sinatra/sinatra/commit/0d5b0f62179b11fa33ddc4c67f64fa3baa1937a7

dentarg avatar Jan 04 '24 18:01 dentarg

It doesn't blow up, but childprocess 4.1.0 has a similar problem with CHILDPROCESS_POSIX_SPAWN on TruffleRuby/JRuby, if you pass symbol keys in the env, they wont be used if the ENV already has that key (on CRuby the value you pass is used).

Not a concern for this PR, as it removes all usage of ffi (🤩) but maybe something you want to look into for TruffleRuby in general @eregon? Or maybe such differences with ffi is expected between CRuby/TruffleRuby.

dentarg avatar Jan 04 '24 18:01 dentarg

This is ready now, the CI tests every Ruby from 2.4 (the required_ruby_version) to 3.3 as well as jruby and truffleruby.

JRuby on Windows fails to load rspec, which is fully independent from this gem so I excluded that. TruffleRuby does not support Windows, so that's also excluded.

I managed to fix the kill process tree feature on Windows by using taskkill, that's a bit heavy but seems reasonable enough for killing an entire process tree. That means this PR should be fully compatible with the existing childprocess, or at least all tests pass, on all platforms.

eregon avatar Jan 04 '24 18:01 eregon

@dentarg

Not a concern for this PR, as it removes all usage of ffi (🤩) but maybe something you want to look into for TruffleRuby in general @eregon? Or maybe such differences with ffi is expected between CRuby/TruffleRuby.

That's because https://github.com/enkessler/childprocess/blob/9981ceef660b7a1d6408b3babf721989a0871886/lib/childprocess/unix/posix_spawn_process.rb#L39 does not do coercion. It's the typical problem of having many backends, they get out of sync. This PR does the environment coercion correctly for all platforms.

eregon avatar Jan 04 '24 18:01 eregon