[Wiki] Add more information about `parallel` option

Open jrfnl opened this issue 8 years ago • 10 comments

I've just been looking into the parallel running option a bit more and am left with some questions.

  • When should you use this option?
  • Does it work equally well with phpcs as with phpcbf?
  • What should you take into consideration when determining the number of parallel processes to use?
  • What can go wrong? Are there any types of sniffs which would be incompatible with parallel processing?
  • Anything else which behaves differently if parallel processing is turned on?

It might be useful to add a section to the Wiki Advanced Usage page about this.
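
For reference, the basic invocation is just the flag on either command (a minimal sketch; the standard and path are placeholders, and phpcbf accepts the same flag, though whether it benefits equally is one of the open questions above):

$ phpcs --parallel=8 --standard=PSR2 /path/to/project
$ phpcbf --parallel=8 --standard=PSR2 /path/to/project

The value can also be set from a ruleset via an <arg> element, as mentioned further down in this thread.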

jrfnl · Oct 21 '17 19:10

I just tried running PHPCS against Customize Posts and got the following results with parallel:

Processes   Time
1           0m11.076s
2           0m6.577s
4           0m4.404s
8           0m3.731s
16          0m3.665s

So yeah, wow, it can really speed up processing, although the returns diminish as the parallel process count grows.
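
For anyone who wants to reproduce a table like this on their own codebase, a rough sketch (the path is a placeholder; --no-cache stops the result cache from masking the differences between runs):

$ for n in 1 2 4 8 16; do echo "parallel=$n"; time phpcs --parallel=$n --no-cache /path/to/project > /dev/null; done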

westonruter · Oct 25 '17 19:10

If you're running on a VM, remember to allocate more cores. Using --parallel=16 with 1 core, a phpcs run across WordPress took 1m38.95s. With 4 cores, it took 0m36.3s.

pento · Nov 28 '17 04:11

When you write the output into a report, the only thing you see on the console is one result for each process. That means that instead of having one output for each file, you suddenly only have n outputs (where n is the number of processes, up to a certain upper limit it seems).

My concrete use case was that I set the parallel value to 75 but only seemed to get 73 processes. More interestingly, that was exactly the number of files within the top level of my given path… So instead of checking the parallel value, I spent an hour debugging why phpcs was not parsing the given folder recursively…

Perhaps the output could be changed from a . or E to, let's say, a P? To show that this is a process returning and not a file having issues or not…

Or perhaps at least state that information in the (currently non-existent) documentation 😉
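
For anyone hitting the same confusion: the written report still contains the per-file results; only the console progress marks are per process. A minimal sketch, with the report path as a placeholder:

$ phpcs --parallel=75 --report=full --report-file=/tmp/phpcs-report.txt /path/to/project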

heiglandreas · Jun 22 '18 15:06

Hi all, I'm quite interested in using the --parallel option when running CI, as we have 4 cores that we can use on our GitLab CI server.

However, if I add --parallel=4 on the command line or add <arg name="parallel" value="4" /> to my ruleset, there is no time difference between 1 and 4 cores. Am I doing something wrong?

darthvader666uk · Nov 13 '20 09:11

@darthvader666uk make sure PHP is compiled with PCNTL support, or the CLI setting won't do anything. Once you have it, you should only see 4 dots in the verbose output instead of 1 per file:

$ phpcs --parallel=100 --no-cache
............................................................  60 / 489 (12%)
............................................................ 120 / 489 (25%)
............................................................ 180 / 489 (37%)
............................................................ 240 / 489 (49%)
............................................................ 300 / 489 (61%)
............................................................ 360 / 489 (74%)
............................................................ 420 / 489 (86%)
............................................................ 480 / 489 (98%)
.........                                                    489 / 489 (100%)

Time: 6.28 secs; Memory: 28MB
$ phpcs --parallel=4 --no-cache
.... 4 / 4 (100%)

Time: 2.45 secs; Memory: 6MB
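
A quick way to check whether the PCNTL requirement mentioned above is met (a sketch, assuming the CLI PHP being checked is the same one that runs phpcs):

$ php -m | grep pcntl
$ php -r 'var_dump(extension_loaded("pcntl"));'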

gsherwood · Nov 14 '20 07:11

Thank you @gsherwood! I have enabled PCNTL support in my Docker image and it worked!

Somehow I missed that information! Sorry about that.
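
For others using the official php Docker images, a sketch of how to enable it at build time (assuming docker-php-ext-install is available in the base image):

# In the Dockerfile:
RUN docker-php-ext-install pcntl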

darthvader666uk · Nov 16 '20 11:11

I've been testing this to see if some recommendations can be made about the number of processes to run in parallel, but it isn't straightforward. On less powerful machines the process seems to be CPU bound, and I had good success running --parallel={number_of_cores}, but this isn't the case on machines with new, powerful CPUs and fast storage. On a 32-core machine with a PCIe 4.0 NVMe drive I could raise the number of parallel processes to 120 before runs started to slow down again.

Maybe a good recommendation could be:

A good initial value is the number of available processor cores. If the system has fast storage then this can be increased further.

pfrenssen · Jun 25 '21 11:06

a good initial value is phpcs --parallel=$(nproc)

divinity76 · May 23 '22 13:05

phpcs --parallel=$(shell nproc || sysctl -n hw.logicalcpu || echo 4) for compatibility with Mac and fallback for anything else (Windows)

mabar · May 23 '22 13:05

@mabar cool! I think the Windows equivalent is

powershell "Get-WmiObject Win32_Processor | Select-Object -ExpandProperty NumberOfCores"

maybe

phpcs --parallel=$(shell nproc || sysctl -n hw.logicalcpu || powershell "Get-WmiObject Win32_Processor | Select-Object -ExpandProperty NumberOfCores" || echo 4)
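
Note that the $(shell ...) form is GNU Make syntax. In a plain POSIX shell, a rough equivalent might look like the following sketch, which relies on the standard NUMBER_OF_PROCESSORS environment variable on Windows instead of calling PowerShell:

$ CORES="$(nproc 2>/dev/null || sysctl -n hw.logicalcpu 2>/dev/null || echo "${NUMBER_OF_PROCESSORS:-4}")"
$ phpcs --parallel="$CORES"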

divinity76 · May 23 '22 14:05