phpunit icon indicating copy to clipboard operation
phpunit copied to clipboard

POC: lightweight subprocess isolation via pcntl_fork()

Open staabm opened this issue 1 year ago • 22 comments

still work in progress..

refs https://github.com/sebastianbergmann/phpunit/issues/5749#issuecomment-1999325702

staabm avatar Mar 16 '24 07:03 staabm

Codecov Report

Attention: Patch coverage is 46.91358% with 86 lines in your changes are missing coverage. Please review.

Project coverage is 89.78%. Comparing base (53eebaa) to head (3628c7b).

Files Patch % Lines
src/Util/PHP/PcntlFork.php 0.00% 82 Missing :warning:
src/Framework/TestRunner.php 40.00% 3 Missing :warning:
src/Framework/TestBuilder.php 96.55% 1 Missing :warning:
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5751      +/-   ##
============================================
- Coverage     90.13%   89.78%   -0.36%     
- Complexity     6631     6679      +48     
============================================
  Files           697      701       +4     
  Lines         20026    20175     +149     
============================================
+ Hits          18051    18114      +63     
- Misses         1975     2061      +86     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Mar 16 '24 07:03 codecov[bot]

Keep in mind that isolation is more important than performance. I mention this because I do not know how much global state (included files, declared constants, loaded classes, global variables, ...) carries over from the parent process into the child process when forking is used.

sebastianbergmann avatar Mar 16 '24 07:03 sebastianbergmann

In my understanding the goal of process isolation is to isolate the tests from each other.

Is it also the goal to isolate phpunit from the test?

staabm avatar Mar 17 '24 06:03 staabm

Is it also the goal to isolate phpunit from the test?

No.

sebastianbergmann avatar Mar 17 '24 07:03 sebastianbergmann

perf comparison this PR fork@1dc7b80:

➜  phpunit git:(fork) ✗ php ./phpunit tests/_files/TestWithClassLevelIsolationAttributes.php
PHPUnit 11.1-gadccd7efc by Sebastian Bergmann and contributors.

Runtime:       PHP 8.2.17
Configuration: /Users/staabm/workspace/phpunit/phpunit.xml

R                                                                   1 / 1 (100%)

Time: 00:00.006, Memory: 8.00 MB

There was 1 risky test:

1) PHPUnit\TestFixture\TestBuilder\TestWithClassLevelIsolationAttributes::testOne
This test did not perform any assertions

/Users/staabm/workspace/phpunit/tests/_files/TestWithClassLevelIsolationAttributes.php:24

OK, but there were issues!
Tests: 1, Assertions: 0, Risky: 1.

vs. main-branch@4303b9eb2

➜  phpunit git:(main) php ./phpunit tests/_files/TestWithClassLevelIsolationAttributes.php
PHPUnit 11.1-g4303b9eb2 by Sebastian Bergmann and contributors.

Runtime:       PHP 8.2.17
Configuration: /Users/staabm/workspace/phpunit/phpunit.xml

R                                                                   1 / 1 (100%)

Time: 00:00.046, Memory: 8.00 MB

There was 1 risky test:

1) PHPUnit\TestFixture\TestBuilder\TestWithClassLevelIsolationAttributes::testOne
This test did not perform any assertions

/Users/staabm/workspace/phpunit/tests/_files/TestWithClassLevelIsolationAttributes.php:24

OK, but there were issues!
Tests: 1, Assertions: 0, Risky: 1.

POC looks 7-8x faster

staabm avatar Mar 17 '24 19:03 staabm

If a test creates a resource which needs isolation, said test needs to run in isolation.

As soon as a single test runs in the main process with something which has sideeffects on other tests, things break. But thats not unique to forking. You have the same problem with worker processes, as soon as the test which requires isolation runs not in a worker

staabm avatar Mar 22 '24 18:03 staabm

@staabm You're correct that the current process-based isolation mechanism (which uses proc_open()) also has problems with side effects under certain circumstances, but database connections aren't one of them. Database connections aren't inherited when proc_open() is used, but they will with this change, which means it's backwards-incompatible. Creating database connections in the main process before running tests with @runInSeparateProcess is a common pattern in the codebases I work with.

You could prevent breaking backwards compatibility by making this opt-in using a new config option or attribute, but this is a pretty big footgun.

MasonM avatar Mar 22 '24 19:03 MasonM

I see. Thanks for mentioning. I guess if this PR propsal will land, we will definitely need a opt-in until we get some decent real world battle test experience

staabm avatar Mar 22 '24 19:03 staabm

No problem. I'm happy to run any proposed changes on our codebase, which make heavy use of process isolation and would greatly benefit from any performance benefits. I came across this while doing my own research on optimizing such tests, but unfortunately I haven't found anything that helps without major caveats.

MasonM avatar Mar 22 '24 19:03 MasonM

I have just added annotation support, so one can opt-in to fork based isolation via attribute.

example test which uses fork based isolation:

<?php declare(strict_types=1);
/*
 * This file is part of PHPUnit.
 *
 * (c) Sebastian Bergmann <[email protected]>
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */
namespace PHPUnit\TestFixture\Metadata\Attribute;

use PHPUnit\Framework\Attributes\RunClassInSeparateProcess;
use PHPUnit\Framework\Attributes\RunInSeparateProcess;
use PHPUnit\Framework\Attributes\RunTestsInSeparateProcesses;
use PHPUnit\Framework\TestCase;

#[RunClassInSeparateProcess(true)]
#[RunTestsInSeparateProcesses]
final class ProcessIsolationForkedTest extends TestCase
{
    #[RunInSeparateProcess]
    public function testOne(): void
    {
    }
}

the true in #[RunClassInSeparateProcess(true)] indicates that fork should be used instead of a separate worker. that way we can easily ship the feature and don't break existing assumptions of worker based isolation per default. (Currently I don't plan to also add phpdoc annotation support, as I read this will be dropped in the next major)

tbh I don't like bool attribute parameters, but I found it would be consistent with other attributes. IMO something like #[RunClassInSeparateProcess("forkIfPossible")] would be easier to read.

this PR is also missing some more end-to-end tests, but I wanted to discuss the UX and design before going into full test-coverage

I am open for feedback

staabm avatar Mar 29 '24 09:03 staabm

@sebastianbergmann any feedback on https://github.com/sebastianbergmann/phpunit/pull/5751#issuecomment-2026922512 ?

staabm avatar Apr 14 '24 08:04 staabm

Currently I don't plan to also add phpdoc annotation support, as I read this will be dropped in the next major

No new annotations are added since PHPUnit 10 came out.

the true in #[RunClassInSeparateProcess(true)] indicates that fork should be used

I am not sure whether I like the boolean flag, I think I would prefer a separate attribute for this. Don't worry, I would implement the attribute and metadata-related changes after merging this.

I think we should resolve #5230 / #5233 before working on this, though. What do you think?

sebastianbergmann avatar Apr 14 '24 08:04 sebastianbergmann

@staabm Heads-up: I have cleaned up the code a bit:

  • Introduced PhpJob value object that encapsulates all relevant information about a job
  • Introduced DefaultPhpJobRunner class for running PhpJobs using the tried-and-true implementation using proc_open() etc.
  • Migrated existing code to use DefaultPhpJobRunner

DefaultPhpJobRunner implements the PhpJobRunner interface. For now, this interface is @internal and no mechanism is in place to configuration an implementation to use instead of DefaultPhpJobRunner. This might change in the future if and when we identify a need for this (sooner if you tell me that you would like to implement your pcntl_fork() based approach that is the topic of this PR as a PhpJobRunner implementation).

sebastianbergmann avatar Apr 18 '24 12:04 sebastianbergmann

thanks for the groundwork. I am planning to bring this pctnl based process isolation into phpunit, if we find a way which is acceptable for you. I think it can be really useful to have a faster isolation alternative.

staabm avatar Apr 18 '24 13:04 staabm

thanks for the groundwork. I am planning to bring this pctnl based process isolation into phpunit, if we find a way which is acceptable for you. I think it can be really useful to have a faster isolation alternative.

Okay, then I'll try to find time ASAP to make this pluggable. I already how to do it, just need to do it.

sebastianbergmann avatar Apr 18 '24 13:04 sebastianbergmann

thanks. no hurry.

pcntl based isolation is not a blocker for me... its just one item on my bucket list to make the world a bit better :-)

staabm avatar Apr 18 '24 14:04 staabm

As of 191477f0584897c9b748208bdc92a52d42b3c077, PhpJobRunnerRegistry::set() can be used to configure a custom PhpJobRunner implementation.

sebastianbergmann avatar Apr 19 '24 06:04 sebastianbergmann

thx for your work. I tried rebasing the PR, but I am not sure yet, whether we have all pieces in the JobRunner to implement a pcntl variant.

I need a way to run the job in the current process like I did in

https://github.com/sebastianbergmann/phpunit/pull/5751/files#diff-2843ca5c5620803be3ae210b4cd8f4f74eac269d29f3fef343ce212a3aed05f1R92-R119

but it seems PhpJob alone does not provide enough context so I can run the TestCase. Maybe I missed something..?

staabm avatar Apr 21 '24 08:04 staabm

Maybe I missed something?

I doubt that you are missing something. The PhpJob value object currently only encapsulates the information that the default implementation of PhpJobRunner needs. What exactly do you need?

sebastianbergmann avatar Apr 28 '24 06:04 sebastianbergmann

in a PcntlJobRunner I would need to fork the current process.

after forking

  • in the main process I would need to wait for the forked worker process to end and report the result
  • in the child process I would need to run the test itself, report the result to the main process and kill the child

all the above - forking, main and child logic - would happen within the JobRunner. currently I see no way to run the test-case from within a JobRunner. I think therefore I need a reference to TestCase within my PcntlJobRunner.

staabm avatar May 04 '24 05:05 staabm

@staabm Would 634254224e59cb3e8f50105a2238b59abb3c8c1f help?

sebastianbergmann avatar May 06 '24 12:05 sebastianbergmann