nodejvm icon indicating copy to clipboard operation
nodejvm copied to clipboard

Question : is it possible support multiple NodeJS main threads ?

Open yacota opened this issue 4 years ago • 4 comments

Actually I think that while developing NodeJVM you faced the same question, https://github.com/graalvm/graaljs/issues/12 :)

I am developing a proof of concept based on NodeJVM and the ability to launch a spring boot application that can delegate work to a nodeJS thread that will execute some jsdom stuff. So far so good, NodeJVM is really impressive, thanks!

While profiling the application I realized that spring-boot+reactor could overwhelm the NodeJS main thread, and thus I was wondering if it could be possible to spawn work to more than one nodejs event loop, I am not willing to share any state between Java and JS

yacota avatar Feb 11 '20 10:02 yacota

Well, NodeJS/GraalJS supports the JavaScript workers API, that's how the event loop dispatch is implemented internally. So I think the answer is yes. However, I'm not actually a JS dev so there may be limitations to workers I'm not aware of.

The trick is to refactor boot.js a bit so the basic construct of "create a Java LinkedBlockingQueue and loop on it from inside a worker thread" is factored out into a function. Instead of relaying the callback via parentPort to the NodeJS main thread, it'd just execute it directly. In fact it's not clear to me there's anything special about the NodeJS main thread, perhaps all worker threads are equivalent? If so then the NodeJVM code could be simplified somewhat.

Do you think you can tackle that yourself?

mikehearn avatar Feb 11 '20 11:02 mikehearn

Thanks for the feedback, So with your feedback :

  • change boot.js in order to bootstrap #cpus worker threads (reasonable?)
  • these worker threads will execute the "eval"(workerData.take()) by themselves instead of relaying in parentPort(maybe they could still use parentPort to inform about exit and error events?, some kind of tracking)
  • NodeJs class simplifications :
    • As far as I understood this check won't be needed, because here we will just be putting the runnable into the blocking queue in order to let workers compete to extract/execute the runnable https://github.com/mikehearn/nodejvm/blob/master/src/main/java/net/plan99/nodejs/java/NodeJS.java#L155
    • Do we still need this ? https://github.com/mikehearn/nodejvm/blob/master/src/main/java/net/plan99/nodejs/java/NodeJS.java#L202 Should we offer the context of the main one?.
    • We should improve this method by checking that the currentThread it's a NodeJs worker one (by using 'worker.threadId' or maybe Graalvm it's kind enough to put some appropriate name) https://github.com/mikehearn/nodejvm/blob/master/src/main/java/net/plan99/nodejs/java/NodeJS.java#L133

What do you think?

yacota avatar Feb 11 '20 15:02 yacota

BTW it seems that the worker thread has a non meaningful name

Screenshot 2020-02-11 at 16 21 15

yacota avatar Feb 11 '20 15:02 yacota

Yes, that sounds about right.

For (1) the purpose of that check is to provide expected re-entrancy semantics. If that wasn't the case then entering a NodeJS block from the NodeJS thread wouldn't work properly (I think). So it needs to be adjusted to not treat "the node thread" specially but it should still short-circuit dispatch when on the right thread already.

I'm not entirely sure how Contexts map to Workers. I suspect each worker thread has its own context, so, again, that would need to be generalised.

For (3) yes, same thing.

Basicallly we need to split the NodeJVM object into a NodeJSContext and NodeJVM class, where the latter is used for bootup and the like, and the latter maps 1:1 to worker threads.

For bonus points you could try customising ThreadPoolExecutor so the number of threads and the underlying task queues are managed automatically.

mikehearn avatar Feb 12 '20 14:02 mikehearn