jest-runner-eslint

performance mystery of jest --runInBand

Open zhenyulin opened this issue 5 years ago • 10 comments

I currently use jest-runner-eslint to lint src before running tests in TDD, and I have been trying to speed up the lint step. Counterintuitively, I found that jest --runInBand lints faster than letting jest lint multiple files in parallel.

So this raises the question: what makes jest --runInBand faster at running jest-runner-eslint?

Some stats:

time npx jest --runInBand:

Test Suites: 15 passed, 15 total
Tests:       15 passed, 15 total
Snapshots:   0 total
Time:        2.374s, estimated 3s
Ran all test suites.
npx jest --runInBand  3.08s user 0.35s system 113% cpu 3.027 total

time npx jest:

Test Suites: 15 passed, 15 total
Tests:       15 passed, 15 total
Snapshots:   0 total
Time:        2.749s, estimated 3s
Ran all test suites.
npx jest  17.84s user 1.69s system 579% cpu 3.368 total

(on average, it is about 10% slower than jest --runInBand)
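To make the comparison between the two runs explicit, here is a small sketch that recomputes the ratios from the `time` output above. The numbers are copied from the stats in this post; the variable names are only illustrative. It works out to roughly 5.7 times the CPU time for about 11% more wall-clock time in this one sample.

```javascript
// Figures copied from the `time` output above (seconds).
const inBand = { user: 3.08, system: 0.35, wall: 3.027 };
const parallel = { user: 17.84, system: 1.69, wall: 3.368 };

// Total CPU time is user + system time summed across all processes.
const cpuInBand = inBand.user + inBand.system;
const cpuParallel = parallel.user + parallel.system;

// How much more CPU the parallel run burned, and how much longer it took.
const cpuRatio = cpuParallel / cpuInBand;
const wallRatio = parallel.wall / inBand.wall;

console.log(`CPU time ratio: ${cpuRatio.toFixed(2)}x`);
console.log(`Wall time ratio: ${wallRatio.toFixed(2)}x`);
```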

time npx eslint src:

npx eslint src  2.33s user 0.19s system 115% cpu 2.184 total

(faster)

time npx eslint_d src:

eslint_d src  0.09s user 0.02s system 20% cpu 0.541 total

(super fast on every run after the first)

It would be really nice to support eslint_d to speed up the lint process and get more immediate feedback in TDD.

zhenyulin avatar Sep 06 '18 23:09 zhenyulin

How many cores do you have? In my experience jest requires 4+ to be able to parallelize effectively.

ljharb avatar Sep 07 '18 04:09 ljharb

@ljharb I'm running on a 2.3 GHz Intel Core i7 : )

In the stats above, running jest uses 579% CPU yet is slower, which is a real mystery to me.

zhenyulin avatar Sep 07 '18 09:09 zhenyulin

It might be a good idea to try caching the CLI instantiation based on the config passed (you should be able to get name from it, which is unique): https://github.com/jest-community/jest-runner-eslint/blob/e6ad0601675dfafdc2d9d302fdc57b7f106d4fb9/src/runESLint.js#L13-L15

It might be that simply spinning up a CLIEngine for every single file linted is adding too much overhead.

Not that it should impact runInBand vs not, but still. My guess is that too few files are linted, and the overhead of spawning processes is bigger than the gain of parallelization.

SimenB avatar Sep 15 '18 18:09 SimenB

What kind of black magic is this? Multi-threaded slower than single-threaded?

raphael22 avatar Dec 02 '19 09:12 raphael22

I think we should be able to memoize the CLIEngine instance.
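A minimal sketch of what such memoization could look like, assuming the engine can be keyed by a unique string taken from the config. `getCachedEngine`, `createEngine` and `createFakeEngine` are hypothetical names for illustration, not part of ESLint or jest-runner-eslint; the fake factory only counts how often construction happens.

```javascript
// Hypothetical cache: one engine per distinct config key.
const engineCache = new Map();

// Returns the cached engine for configKey, building it on first use.
// `createEngine` stands in for whatever actually constructs the engine.
function getCachedEngine(configKey, createEngine) {
  if (!engineCache.has(configKey)) {
    engineCache.set(configKey, createEngine());
  }
  return engineCache.get(configKey);
}

// Stand-in factory (an assumption for this demo) that counts constructions.
let constructions = 0;
function createFakeEngine() {
  constructions += 1;
  return { id: constructions };
}

const first = getCachedEngine('project-a', createFakeEngine);
const second = getCachedEngine('project-a', createFakeEngine);
const other = getCachedEngine('project-b', createFakeEngine);

// Same key reuses the instance; a new key builds a new one.
console.log(first === second, first === other, constructions);
```

Note that a module-level cache like this lives separately in each worker, so it would only save work for files linted by the same worker.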

rogeliog avatar Dec 02 '19 17:12 rogeliog

A lot of time has passed since this issue was opened. Has there been any progress on this? It is a very interesting issue.

rodrigoehlers avatar Mar 30 '21 14:03 rodrigoehlers

Still seeing this, almost 3 years on.

--runInBand cut my test time from 1m10s to 35s. For reference, the official eslint CLI takes around 32s.

rdsedmundo avatar Jun 13 '21 16:06 rdsedmundo

I'm experiencing the same issue. It seems to be because the slowest part, module resolution, is done multiple times when not using --runInBand. I will try to investigate whether module resolution could be made faster with some configuration.

Havunen avatar Dec 27 '22 19:12 Havunen

There are two things that come to mind regarding a slowdown in parallelised execution like this:

  1. Code that is run more (like the FlatESLint check run when runESLint.js is imported for each worker - though that doesn't seem anywhere near enough to cause the large slowdown that is noted here)
  2. Resource contention, like disk access, CPU context switching, multiple processes waiting for exclusive access to the same thing and blocking each other.

(2) feels like the likely culprit in this case, but what resource would the workers be contending for every time they run? Looking at the runESLint function, they will all want to read the files referenced by config.setupTestFrameworkScriptFile and config.setupFilesAfterEnv and whatever they contain - but I doubt it's that.

It feels like it is more likely down to the ESLint implementation itself - it is not expecting to run in parallel. Each parallel instance is going to be loading the same config, and the same rules, for example. It may also be sharing the same cache for any ASTs (abstract syntax trees) it generates. Maybe that is creating a situation where one worker has to wait for another worker to finish accessing something before it can do its work.

I wonder if it would be useful to see which files are accessed, and when, while things run in band versus in parallel; that may indicate whether there is a bottleneck associated with file access.

It could also just be that ESLint startup is slow and when executed in a single worker, the process is able to cache things like the imported files so that the startup is faster on subsequent runs. However, since it's so slow, it makes things MUCH slower when started up n times for n workers. That could be investigated by including some timing output for the main setup code - so we can see how long the first run takes versus subsequent runs within the same worker.
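As a rough sketch of the kind of timing output meant here, the wrapper below logs how long a setup function takes and whether it is the first or a later call in the current process. `timeSetup` and `fakeSetup` are hypothetical names; in a real investigation the stand-in would be replaced by the actual setup code in the runner.

```javascript
const { performance } = require('perf_hooks');

// Counts calls within this process, so first and later runs can be compared.
let runCount = 0;

// Hypothetical wrapper: runs setupFn, logs the elapsed time with the
// process id, and returns the result together with the measurement.
function timeSetup(label, setupFn) {
  runCount += 1;
  const start = performance.now();
  const result = setupFn();
  const elapsedMs = performance.now() - start;
  console.log(
    `[pid ${process.pid}] ${label} run #${runCount}: ${elapsedMs.toFixed(1)}ms`
  );
  return { result, elapsedMs, run: runCount };
}

// Stand-in for the expensive setup work (an assumption for this demo).
function fakeSetup() {
  let total = 0;
  for (let i = 0; i < 1e6; i += 1) {
    total += i;
  }
  return total;
}

const firstRun = timeSetup('setup', fakeSetup);
const secondRun = timeSetup('setup', fakeSetup);
```

Logging the process id alongside the run number makes it possible to tell the workers apart when the output from a parallel run is interleaved.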

somewhatabstract avatar Aug 29 '23 22:08 somewhatabstract

I recently discovered that our application source code has multiple circular references between files through import statements. It could be related to that.

Havunen avatar Aug 30 '23 09:08 Havunen