Support Multiplexed TypeScriptCompile Worker with Node Worker Threads
🚀 feature request
Relevant Rules
ts_library
Description
Bazel is adding support for multiplexed workers and Node now supports worker threads. Combining the two we could get parallel compilation with a shared in-worker file cache. This would be pretty cool and might have significant performance and resource benefits, so I thought I'd put it out here for discussion. Thoughts?
Cool! However, to make it possible, workers now need to use messages to access the shared cache, which could be the bottleneck here. Also, I doubt if typescript ASTs are serializable.
I've been thinking/discussing this.
I think we should build a clean-room implementation of ts_library that brings in only the minimal bits - just run TypeScript with its composite projects feature - and puts them in the right place. For example I don't think the program should worry about protocol buffers - the --require shim in bazel_require_script.js should marshal/unmarshal the worker protocol.
@ashi009 I don't know if we need to share a cache between worker threads. The simplest thing is to run tsc --watch in each worker thread, and establish affinity between each tsconfig.json (program) and a particular worker. Each worker sees only that one program over its lifecycle.
@JaredNeil when would you like to start on this? I'm on vacation next week and we are shipping 1.0 end of November, so maybe we could collaborate on a branch during December?
I really haven't thought much about it. I just think it could save us a lot of memory when compiling all of our typescript if we could share the cache.
@ashi009, postMessage doesn't have to be able to JSON stringify the message. From the docs:
Sends a JavaScript value to the receiving side of this channel. value will be transferred in a way which is compatible with the HTML structured clone algorithm.
Is that clone going to cause a ton of overhead? Maybe.
@alexeagle Yeah, that could probably work. Not really a blocker for us, so whenever there is time for it.
Discussed with @rictic today, still feels promising. We'd want it for google-internal too.
@JaredNeil with @alexeagle‘s affinity approach there will be no need to share the cache. Problem solved :)
(yet I still doubt if passing a complex object through message is possible)
I might go ahead and give it a try if I have time next week.
Maybe I'm misunderstanding how TS type checking works, but won't every worker's cache duplicate all of the transitive dependencies of the the projects that are assigned to that worker, so all of the shared libraries (node_modules stuff at the minimum) would be duplicated for each of the workers?
ts_library generates a tsconfig file for each target, and it only depends on the generated .d.ts for the transitive dependencies. So I think keeping a few copies of those won't be as expensive as caching the actual programs (what we have now).
It's definitely worth to write a small prototype to find out.
I bet there is a lot of memory wasted because the transitive closure of .d.ts probably does include lots of overlap (lib.dom.d.ts for example)
But I think it's a good idea to imagine how this ought to work outside of Bazel, and then model closely on that. I think you'd have to run several tsc --watch for your several composite projects, right? Meaning, tsc itself doesn't know how to rebuild a dependency of the project you're watching. If that's true, then the wasted memory is the same.
I think we should have a branch somewhere where we could start putting together a prototype. I'd love to see how the multiplex worker protocol should look. I can give you both commit on a branch here, so we don't need to work from a fork.
I went to read up on https://www.typescriptlang.org/docs/handbook/project-references.html
TypeScript handles it by incrementally rebuilding the dependency graph. That sounds like it's probably Bazel's job and we don't want to pass the --build flag to TypeScript, but we should evaluate both models I think. Maybe TS does a good job of this and we can host their build-the-graph model somehow. But I don't have a mental model of how that would work and it's certainly surprising from a Bazel idiomatic point of view. Maybe @mmorearty has some ideas about it from implementing https://github.com/Asana/bazeltsc
If we don't pass --build to TypeScript then it seems like users point to other ts_library targets in their deps and we turn that into references in the generated tsconfig.
Any update on this? Sounds very interesting.
I’m also interested in support for multiplex workers in @bazel/worker. Is this the right issue for that request or will this issue likely be solved in a manner specific to TS?
Adding multiplex support in @bazel/worker seems like the right first prerequisite step.
It's a little tied up with google-internal right now. The @bazel/worker package implementation is just to copy some files out of rules_typescript, which has its source-of-truth in google3 internal repo, and we've had a lot of trouble landing PRs there because no one at Google really seems to own it right now. I can ask a favor to get some attention for specific changes, but then if the change is intrusive in some way, or breaks anything in google's codebase, then it's a lot of work for someone to fix that up. I imagine it would be a lot easier if someone at Google wanted to use that multiplex support soon afterwards. Maybe @nex3 or @alxhub has some idea.
Another option is to just fork that code into rules_nodejs and add multiplex support for the rest of us. In that case it's hard to see a path to getting rules_typescript (and ts_library) refactored to use an external library for the worker mode though. I think this would only help with ts_project and rollup_bundle and other external rules to have faster workers.
FWIW I'm most likely to work on https://github.com/bazelbuild/rules_nodejs/issues/1726 next but no idea when.
That’s fair. I’m not very aware of the size and complexity of the worker source code or it would be reasonable to fork it. If the implementation was greatly different, I’d be in favor of implementing multiplex support in rules nodejs, still copying the original worker mode, and exposing both under the common package bazel/worker. That of course only makes sense if there is little reuse between implementations; I don’t know.
It just copies these three files https://github.com/bazelbuild/rules_nodejs/blob/master/packages/worker/BUILD.bazel#L28-L46 and the source code is just 200 lines https://github.com/bazelbuild/rules_typescript/blob/master/internal/tsc_wrapped/worker.ts
I'm not 100% clear on what "multiplexed workers" means, but it would definitely be useful for the PostCSS rules in particular to be able to piggyback on whatever work/code can be shared in a common location.
Multiplex worker is a worker that can handle multiple jobs simultaneously, it’s an optimization on top of a normal worker, who can handle only a single job at a time.
Postcss could leverage a standard worker today, by using the @bazel/worker npm package and implanting the worker loop pattern.
@alexeagle i might be in favor of forking that then, if it’s so small. What do you think?
Yeah let's pull our own copy. At the same time, I'm constantly tripping over the need to load a .proto file from disk at runtime - the require location from runfiles is a pain in the butt. We should try really hard to inline that into the JS file. Also if we could drop a runtime dependency on long/protobufjs it would be great.
We should think a bit on whether parts of this could live in a --require script so you could run a vanilla node tool under a worker without always having to write your own version of it using their API.
Have any time to contribute it @joeljeske ?
Yea I could probably find some time to contribute. I would like to better understand the desired api more. I of course like that the idea of any node tool being drop in, but not quite sure how that would work or what it would provide. Are you talking about it emitting emulated file change events? Or more primarily as full re run of a npm_package_bin tool?
As far as multiplex workers, should this package have a pool of node worker Threads?
https://github.com/bazelbuild/rules_nodejs/issues/1726#issuecomment-633651133 has some thoughts about how we might be able to have a --require mode of the worker script. In this usage, the tool would have its own FS watcher running as usual, and makes its own decisions about what work to do, without consulting the WorkRequest protos that arrive from bazel. It has some tradeoffs, and I think it should probably be a separate discussion from porting over the current worker package which has the more Bazel-idiomatic execution model. So let's maybe put that on hold and start with the part we know how to do.
Yes, I think if the worker helper is getting concurrent requests, it should be its job to create worker threads. The API from users should still just be to hand us a callback of what to do for each WorkRequest, I think they shouldn't have to worry about concurrency or threads?
Ok that makes sense, I'd be interested on the design of that and will be keeping an eye on that ticket.
I was thinking more about a worker thread pool, and I think there should be some options; I could imagine some build steps that may be i/o heavy but not computational heavy that would benefit from a single thread managing many concurrent jobs by having its buildOnce invoked multiple times not in a thread. But of course there are other builds that may benefit from a worker pool. Do you think the consumer of bazel/worker should be able to set an optimal number of concurrent jobs per thread?
Related, is there any benefit from a multiplex worker if we are only giving one job to a thread at a time? Is that any/much more efficient than bazel's normal workers where bazel manages the pool? (I'm not very sure of the resource consumption benefits of node worker threads)
No longer in scope for rules_nodejs which only supplies the Node.js toolchain as of v6.0.0.
Downstream canonical JavaScript + Node.js ruleset is now https://github.com/aspect-build/rules_js.