Lint multiple files in parallel [$500]
This is a discussion issue for adding ability to run eslint in parallel for multiple files.
The idea is that ESLint is mostly CPU bound, not IO bound, so creating multiple threads (for machine with multiple cores) might (and probably will) increase performance in a meaningful way. The downside is that currently ESLint's codebase is synchronous. So this would require rewriting everything up to and including eslint.js to be asynchronous, which would be a major effort.
I played with this a little while ago and found a few libraries for Node that handle thread pools, including detection of the number of cores available on the machine.
- Node-threads-a-gogo - seems pretty good, but looks dead.
- nPool - seems actively in development, but has native components (C++)
- Node WebWorkers - seems pretty dead too.
- Parallel - seems dead, and no pool implementation.
- Node Clusters - not stable yet, and probably isn't going to be available on Node v0.10
- WebWorkers - seems that they are only implemented in io.js

And there are a ton of other libraries out there for this.
If anyone had any experience writing multithreaded applications for node.js and would like to suggest alternatives or comment on the above list, please feel free.
P.S. https://www.airpair.com/javascript/posts/which-async-javascript-libraries-should-i-use
Nice. I'm very interested in trying this myself too.
Another question that I had in my mind, if we need to rewrite everything to be async, should we use callback pattern? Promises? If so which library? Q or Bluebird? I personally would prefer promises to callback hell.
I vote for promises. Bluebird is fast, but makes me nervous because it adds methods to the native Promise, which is likely not a great idea. I think Q might be the best bet. There is also Asequence, but I have no personal experience with it.
Why not use built in promises? Just a question as I have no experience with promises yet.
They're not supported in node 0.10, to my knowledge.
Besides that, the libraries give some nice "sugar" methods when working with Promises.
I've had plenty of success using native promises (or a polyfill when native promises aren't supported). That seems like a good starting point to me; if we need more than they provide we could probably swap out something that's API-compatible.
I think we're putting the cart before the horse here. Let's hold off on promises vs. callbacks until we're at least ready to prototype. Get something working with callbacks and let's see how bad it is (or not).
> The idea is that ESLint is mostly CPU bound, not IO bound
ESLint also does a lot of IO (directory traversal, reading source files). So I think we would also profit here if we rewrite eslint to do non-blocking IO.
@lo1tuma I haven't profiled it yet, but my sense is that the amount of IO we do is negligible compared to the number of CPU cycles we eat. I will try to profile it and post results here if I get anything meaningful.
Using something like NodeClusters - or most other per-process implementations - would avoid the issue of needing to [vastly] rewrite ESLint. (Such implementations are strictly not threading, but allow managed parallel process execution.)
It would mostly just need to IPC-in/out the current ESLint; ESLint parallelism would then be able to work freely over different files in a per-process manner, but it could not (without more rework) run concurrently over a single file.
Thus if the goal is to run ESLint over different files in parallel I would urge such a 'simple' per-process concurrency approach. If the goal is to make ESLint parallel across the same source/AST then .. that is a more complicated can of worms as it changes the 'divergence' point.
If there is a strict v0.10 Node target for ESLint, maybe offer this as a feature only when running a compatible Node version.
My idea is:
- The master process does queue control and collecting result.
- Worker processes have a short queue (the length is about 2-4).
A worker does:
- Sends the result of the last file to the master and requests the path of the file after next.
- Asynchronously reads the next file.
- Lints the current file.

Source files vary widely in size, so I think pull-style queue control is important.
Library.... I think child_process.fork is enough.
Sorry for this question: based on all the comments, do we think the reward of this functionality is worth the effort, changes, and complications? (Like I said, just a question.) Or is this functionality too early (like Google Glass) to implement without any actual estimates?
@gyandeeps My projects are not big enough, or computer slow enough, for me to really care either way.
In cases where there are sizable projects with many files, on computers with several cores and unconstrained I/O, I imagine that it could lead to significantly reduced wall-clock time, approaching Amdahl's law. I would be less optimistic about this gain with fewer, larger files, even with 'actual threading' or post-AST handling - but that is what performance profiles are for.
Of course another option is to only lint 'changed' files and provide some sort of result cache, but that comes with additional data-management burdens.
@gyandeeps To answer your question - we do not have definitive information on this. Right now my assumption is that we are CPU-bound, not IO-bound. In that case utilizing more cores should have a significant impact on larger projects (as @pnstickne mentioned, the impact will be small, or might even be negative, on a few large files). I think the first step of the process would be to prove or disprove my assumption on CPU vs IO. Granted, if it turns out I'm wrong and we are IO-bound, then changing ESLint's code to async would improve performance anyway.
@pnstickne Thanks for the insights. I'm really not familiar with NodeClusters, I just know that they exist and do not require everything to be async for them to work. We're definitely not going to try to make ESLint run AST analysis in parallel; that would be a huge change, and a lot of our rules expect nodes to be reported in the right order, so we would not only need to rewrite pretty much the whole core, we would also need to rewrite a lot of rules. I would be interested in learning more about how we could incorporate code that would only execute on Node 0.12 and not on earlier versions. I'm not sure that's the approach I would like to take, but it's an interesting option nevertheless.
So I did some very unscientific performance analysis on both Windows and OSX. I specifically chose to lint a very large number of files (TodoMVC, 1597 files, 854 folders). On Windows, results were very inconclusive. Basically my CPU usage was never over 15% (on an 8-core machine, none of the cores were overtaxed at any point). But then my I/O never hit anything over 10MB/s on an SSD that's theoretically capable of 550 MB/s. So I have no idea why it didn't run any faster. On OSX it never hit more than 15% CPU (I have no idea how many cores my MacBook has, probably 8 too), but I/O was pretty taxed. It looked like there were a few spikes that were reaching 100% disk throughput. So maybe my assumption was wrong and we are not CPU-bound?
@ilyavolodin Try this:
Start running 8 different eslint processes (over the same set of files should be fine, although it would be 'more fair' to break up the files into 8 different equally-sized sets).
Compare the wall-clock times it takes for 1 process to complete and 8 processes to complete.
The 8 processes will have done 8x the work (if using the same set of files for each process as for the single process) or the same amount of work (if having split the source files among them), but in how much x the time?
This very crudely should show an approximate gain - if any - for using multi-process concurrency.
Late to the conversation, but... Is anyone opposed to someone starting up a pull request to implement the callback hell approach (i.e., make ESLint async across the board, including for all I/O operations)? Seems like it would make the eventual parallelization easier anyway.
> Of course another option is to only lint 'changed' files and provide some sort of result cache, but comes with additional data management burdens. -@pnstickne
This is essentially what ESLinter from @royriojas does.
@IanVS That's pretty cool .. now if only it was built-in to my eslint grunt task :}
(Okay, I could get it shimmed in pretty easy - but it'd still be nice to see a 'done package'.)
@pnstickne #2998 @platinumazure I think I would like to first prove that there's something to gain by going async/multithreaded, just so nobody would waste their time creating a very large PR that will then be closed because there is no gain.
@ilyavolodin That's fair enough. I'm wondering, though, would it be worth creating a separate issue with the goal of making ESLint's I/O operations asynchronous? Synchronous I/O is a bit of an anti-pattern in Node and it might help us figure out just how much of a breaking change parallelization would likely be.
@platinumazure While sync code is a Node anti-pattern, in our case we can't do anything with the code (as in, parse it into an AST) until we've read the whole file. So if improving performance is not on the table, then changing the code to async would increase its complexity but not gain us anything. It's worth testing out, and I still think there's a performance gain we can get from parallel execution, but I would like to see some proof of that first.
@ilyavolodin reading files asynchronously doesn’t mean you read them chunk-by-chunk. While you are waiting for one file to be read, you can lint a different file which has been read already.
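The overlap being described can be sketched as a small read-ahead pipeline (a hedged illustration, not ESLint code - `readFn`/`lintFn` are injectable placeholders for `fs.readFile` and a real per-file lint call): the read for file N+1 is started before linting file N, so the IO proceeds while the CPU-bound lint runs.

```javascript
// Sketch: lint one file while the next one is being read.
// Error handling is omitted for brevity.
function lintPipeline(paths, readFn, lintFn, done) {
  var results = [];
  var index = 0;

  // Start a read immediately; return a function that delivers the
  // contents to a callback, whether the read has finished yet or not.
  function startRead(path) {
    var callbacks = [], buffered;
    readFn(path, function (err, src) {
      buffered = [err, src];
      callbacks.forEach(function (cb) { cb(err, src); });
    });
    return function (cb) {
      if (buffered) { cb(buffered[0], buffered[1]); }
      else { callbacks.push(cb); }
    };
  }

  function step(pendingRead) {
    pendingRead(function (err, source) {
      var current = paths[index++];
      if (index < paths.length) {
        // Kick off the next read BEFORE linting the current file,
        // so the read overlaps the CPU-bound lint work.
        var next = startRead(paths[index]);
        results.push(lintFn(current, source));
        step(next);
      } else {
        results.push(lintFn(current, source));
        done(results);
      }
    });
  }

  if (paths.length) { step(startRead(paths[0])); }
  else { done(results); }
}
```

In real use, `readFn` would be `fs.readFile` and `lintFn` the synchronous per-file lint; results still come back in input order, which matters for the rules that expect ordered reporting.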
Wall clock time is the only metric that matters for this evaluation. We have had people report they were using ESLint on projects with thousands of files, so if there's a benefit, it will be seen.
The thing to keep in mind is that directory traversal must be done before linting can begin. Thus, we get zero benefit from switching to an async directory traversal. We need to build up the list of files to lint and then fork off to start linting, at which point each process is doing a file read, and there's no benefit to async there either.
Overall, I'd just like to squash the "rewrite to make things async" approach. We should be able to make a change in CLIEngine to do this without a major rewrite.
Ok, I think I finally figured out a good way to check whether ESLint is CPU-bound vs. IO-bound. I ran eslint on the same set of files (544 files in 234 directories), first with just one rule on (`yoda: [2, "never"]`) and then with the .eslintrc file from the ESLint repository (so a lot of rules on). As far as I can tell, the amount of IO should be exactly the same in both cases, but the number of CPU cycles will be significantly different. This doesn't mean that IO is not a problem, but at least it's something. The numbers were as follows:
Test #1: 9.630s
Test #2: 1:27.880s
Based on the dramatic difference in performance, I want to say that we are not IO-bound. The next step would be to create a threaded prototype of some sort and see what the difference in performance is going to be.
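For reference, the single-rule configuration used in Test #1 would look something like this as an .eslintrc (the Test #2 run used ESLint's own, much larger .eslintrc):

```json
{
  "rules": {
    "yoda": [2, "never"]
  }
}
```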
I'm still for 'per process' concurrency.
Ok, I created a very simple prototype of "process-pool" that will spawn a bunch of processes based on the number of cores available and send files to those processes for linting. I had to turn off reporting (formatters are not invoked) but here are the numbers: Single process linting on 544 files in 234 directories took 9.585 seconds. Same files took 4.168 seconds over 8 processes (I have 8 cores). I still need to do some major cleanup on "process-pool" and everything up to and including CLIEngine needs to be turned into async code, but it looks like it might be worth it.
@ilyavolodin, is your prototype public? Would love to see it. Currently the cache module uses statSync for simplicity's sake, but it would be nice to see if that can be done async as well.
@royriojas Not yet, I only wrote the pooling library for child_process, and even that needs some major cleanup before I would be comfortable posting it anywhere :-) I didn't convert anything to async code yet, I just added process.exit(0) once all of the linting is done (so no reporting of any kind for now). I also commented out all of the cache code for this test, because it was hard to find a spot to plug my code in with caching on.
Just wanted to chime in. We have a medium size project, linting 1370 (larger) files with a good number of rules in a powerful VM, and it takes about 30 seconds. It's manageable, but painful.