
N-API: An api for embedding Node in applications

Open empyrical opened this issue 5 years ago • 50 comments

Is your feature request related to a problem? Please describe.

Right now there isn't a documented/stable way to use Node as a shared library inside of an application. Were one to be made using N-API, this would open up using Chakra in addition to V8 in an application.

Describe the solution you'd like

I would like for there to be stable APIs in node_api.h for creating/managing a Node environment.

A function that does this could hypothetically look like:

NAPI_EXTERN napi_status napi_create_env(int* argc, const char** argv, napi_env* env);
// Start the node event loop
NAPI_EXTERN napi_status napi_run_env(napi_env env);
// Cleanup (e.g. FreeIsolateData, FreeEnvironment and whatever else needs to be run on teardown)
NAPI_EXTERN napi_status napi_free_env(napi_env env);
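
A minimal sketch of the call order an embedder might follow with the three proposed functions. None of them exist in node_api.h; the `hypothetical` namespace, the stand-in `napi_status` and `napi_env` types, the stub bodies, and the `run_embedded_node` helper are all assumptions made so the sequence compiles on its own.

```cpp
// Hypothetical sketch: these functions are NOT part of node_api.h.
// Types and stub bodies are stand-ins for the proposal above.
namespace hypothetical {

enum napi_status { napi_ok, napi_invalid_arg, napi_generic_failure };

// Stand-in for the opaque environment handle.
struct napi_env__ {
  int argc;
  bool ran;
};
typedef napi_env__* napi_env;

// Stub: a real version would set up the isolate, context and Environment.
inline napi_status napi_create_env(int* argc, const char** argv,
                                   napi_env* env) {
  if (argc == nullptr || argv == nullptr || env == nullptr)
    return napi_invalid_arg;
  *env = new napi_env__{*argc, false};
  return napi_ok;
}

// Stub: a real version would run the event loop until it drains.
inline napi_status napi_run_env(napi_env env) {
  if (env == nullptr) return napi_invalid_arg;
  env->ran = true;
  return napi_ok;
}

// Stub: a real version would perform the teardown steps.
inline napi_status napi_free_env(napi_env env) {
  if (env == nullptr) return napi_invalid_arg;
  delete env;
  return napi_ok;
}

}  // namespace hypothetical

// Embedder-side call order: create, run, free. Stops at the first failure
// but still frees an environment that was successfully created.
inline hypothetical::napi_status run_embedded_node(int argc,
                                                   const char** argv) {
  using namespace hypothetical;
  napi_env env = nullptr;
  napi_status status = napi_create_env(&argc, argv, &env);
  if (status != napi_ok) return status;
  status = napi_run_env(env);
  napi_status free_status = napi_free_env(env);
  return status != napi_ok ? status : free_status;
}
```

The point of the sketch is only the shape of the lifecycle: every call reports a status, and the environment is freed even when running it fails.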

The embedder could get this environment's libuv loop using napi_get_uv_event_loop. But I would also like to have open the possibility of providing my own libuv loop that I have control over to help integrate with other event loops (e.g. Qt's event loop). This could look like:

NAPI_EXTERN napi_status napi_create_env_from_loop(int* argc, const char** argv,
  napi_env* env, struct uv_loop_s* loop);

Keeping the event loop going (using uv_run on the env's loop) would then be the embedder's responsibility.
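
To illustrate what "the embedder's responsibility" could look like, here is a small sketch of a host that pumps a loop one non-blocking tick at a time. `ToyLoop` and `pump_until_idle` are hypothetical stand-ins, not libuv; `run_nowait` only mimics the role of `uv_run(loop, UV_RUN_NOWAIT)`, which returns non-zero while more callbacks are expected.

```cpp
#include <deque>
#include <functional>
#include <utility>

// Toy stand-in for a libuv loop (an assumption for illustration only):
// a queue of callbacks that are ready to run.
class ToyLoop {
 public:
  void post(std::function<void()> cb) { ready_.push_back(std::move(cb)); }

  // Runs the callbacks that are ready right now without blocking.
  // Returns true if callbacks queued during this tick are still pending.
  bool run_nowait() {
    std::deque<std::function<void()>> batch;
    batch.swap(ready_);
    for (auto& cb : batch) cb();
    return !ready_.empty();
  }

 private:
  std::deque<std::function<void()>> ready_;
};

// Host-side pump: the embedder decides when each tick happens. A real host
// (e.g. a GUI toolkit) would call run_nowait() from its own event loop
// instead of spinning in a while loop like this.
inline int pump_until_idle(ToyLoop& loop, int max_ticks) {
  int ticks = 0;
  while (ticks < max_ticks) {
    ++ticks;
    if (!loop.run_nowait()) break;
  }
  return ticks;
}
```

The design choice being shown is that the loop never blocks the host: each tick does a bounded amount of work and hands control back.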

Also, right now methods like node::CreateEnvironment seem to always jump into a REPL, unless you provide a string to evaluate or a file to run. Tweaks to help make this nicer to use for embedding will have to be made.

These APIs are just hypothetical, and will probably change when an actual attempt to implement them is made.

I am up to trying to implement this, but I would like to see what kind of discussion happens first and what other ideas people have before I start.

Implementation Progress

  • [ ] Create a clean non-NAPI way to use Node embedded
  • [ ] Create NAPI functions for creating and managing environments
  • [x] Create a NAPI function for evaluating a string (exists in NAPI v1: napi_run_script)
  • [ ] Create a NAPI function for running a script from file
  • [ ] Investigate if this can play nicely with worker_threads.

Describe alternatives you've considered

I've tried using the unstable APIs, and they aren't fun to keep up with 😅

For discussions on how the shared library can be distributed, see this issue: https://github.com/nodejs/node/issues/24028

empyrical avatar Oct 04 '18 15:10 empyrical

I've tried using the unstable APIs, and they aren't fun to keep up with 😅

A big part of that is that they haven’t ever been designed as a coherent API (or designed at all, really), and we would likely need to iterate on them a bit more before they are stable – which is probably also the point where we can start to talk about enabling N-API versions of them.

If you want to work on this, good starting points might be https://github.com/nodejs/node/issues/21653#issuecomment-425926233, or splitting CreateEnvironment() into a function that, well, creates the Environment, and one that calls Environment::Start() under the hood?

addaleax avatar Oct 04 '18 15:10 addaleax

Thanks for helping point where to start! I also noticed some relevant TODOs in 'node_worker' that would get resolved by more stable apis for this.

empyrical avatar Oct 04 '18 17:10 empyrical

@rubys another person we should loop into discussions/team about use cases/testing/api for using Node.js as a shared library.

mhdawson avatar Oct 04 '18 22:10 mhdawson

First observation: we should plan to move to having --shared as the default for both CI and releases. This would make releases include a shared library that could be used by third parties. In an unscientific comparison on Mac OS/X, the combined executable + dynamic library would be a total of 0.1% bigger than a standalone executable.

Second, I would suggest that one of the goals be to allow electron to be built using exclusively NAPI interfaces. See electron/atom/app/node_main.cc.

This means that in addition to Create and Destroy environments, there would need to be an interface to execute a script in an environment, and to evaluate an expression in that environment.

rubys avatar Oct 05 '18 01:10 rubys

Like - making the node command basically just be node_main.cc that links against libnode? Would be very nice! And would be nice to include CMake, pkgconfig modules for finding libnode that would ship with it while we're at it too.

empyrical avatar Oct 05 '18 02:10 empyrical

@empyrical today if you do the following on Mac or Linux:

./configure --shared
make -j4

You end up with out/Release/node and out/Release/libnode.67.dylib or out/Release/lib.target/libnode.so.67. Adding additional NAPI apis would be straightforward; I'm merely stating that it should be a goal to add enough APIs to make electron's node_main.cc not need to depend on any other APIs.

But again, we would either need to include these libraries in the existing releases or have separate releases.

rubys avatar Oct 05 '18 02:10 rubys

Oh - I misunderstood. I thought you meant only building a --shared version of Node, and making the node executable you use from the cli just a very small executable that links against libnode

empyrical avatar Oct 05 '18 02:10 empyrical

@empyrical that's actually what --shared does. Here are the sizes of the output files on Mac OS/X:

$ ls -l out/Release/node out/Release/libnode.67.dylib 
-rwxr-xr-x  1 rubys  staff  40410544 Sep 29 16:33 out/Release/libnode.67.dylib
-rwxr-xr-x  1 rubys  staff      9208 Sep 29 16:33 out/Release/node

rubys avatar Oct 05 '18 03:10 rubys

Just two quick things to note:

  • I don’t know if that’s implied here, but I don’t think we can get away with a default where people have only a libnode + wrapper available as part of the release tarballs
  • Using --shared is definitely something that embedders will tend to do more often than others, but it’s orthogonal to the Embedder API by itself

addaleax avatar Oct 05 '18 06:10 addaleax

Curious for some thoughts with regards to worker_threads: If you create multiple envs, should they all be "main threads" with a threadid of 0 and workers for envs would be created with a separate hypothetical API, or should the first one created be the "main thread", and subsequent ones be considered "workers" with incrementing threadids?

And should the "main thread" only be allowed to be made in the process' main thread? JS code that checks worker_threads.isMainThread to see if it's safe to do something, e.g. call functions in a GUI binding (which typically only work in the main thread) may have issues if a "main" js thread isn't truly in the process' main thread.

Maybe there should be a NAPI function for creating a "main" env, and then a different one for subsequent ones?

Basically:

// Any more than one invocation per process would result in an error napi_status
NAPI_EXTERN napi_status napi_create_main_env(int* argc, const char** argv, napi_env* env);

// Parent env should also show up as parentPort on worker_threads
NAPI_EXTERN napi_status napi_create_env(napi_env parent_env, napi_env* env);
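
A rough sketch of the semantics described above: a single "main" env per process with threadid 0, and child envs that get incrementing ids and remember their parent. Everything here is an assumption for illustration; the `hypothetical` namespace, the stand-in types, the `thread_id` and `parent` fields, and the choice of `napi_generic_failure` as the error are not part of any real N-API.

```cpp
// Hypothetical sketch: these functions are NOT part of node_api.h.
namespace hypothetical {

enum napi_status { napi_ok, napi_invalid_arg, napi_generic_failure };

// Stand-in for the opaque environment handle.
struct napi_env__ {
  int thread_id;      // 0 for the main env, 1, 2, ... for children
  napi_env__* parent; // nullptr for the main env
};
typedef napi_env__* napi_env;

// Process-wide state for the sketch.
inline bool& main_env_created() {
  static bool created = false;
  return created;
}
inline int& next_thread_id() {
  static int id = 1;
  return id;
}

// Only one successful call per process; later calls report an error.
inline napi_status napi_create_main_env(int* argc, const char** argv,
                                        napi_env* env) {
  if (argc == nullptr || argv == nullptr || env == nullptr)
    return napi_invalid_arg;
  if (main_env_created()) return napi_generic_failure;
  main_env_created() = true;
  *env = new napi_env__{0, nullptr};  // never freed in this sketch
  return napi_ok;
}

// Child env: records its parent and takes the next thread id.
inline napi_status napi_create_env(napi_env parent_env, napi_env* env) {
  if (parent_env == nullptr || env == nullptr) return napi_invalid_arg;
  *env = new napi_env__{next_thread_id()++, parent_env};  // never freed here
  return napi_ok;
}

}  // namespace hypothetical
```

Whether the main env must also live on the process's main thread is left open here, as it is in the question above.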

empyrical avatar Oct 05 '18 06:10 empyrical

I don’t think we can get away with a default where people have only a libnode + wrapper available as part of the release tarballs

Why not?

rubys avatar Oct 05 '18 08:10 rubys

Why not?

IMO:

  • node executable probably enjoys the most compact binary for a language runtime of all time - no linkage dependency other than the c|c++(rt)
  • embedding use cases may be too small to warrant a change in the default in favor of those.

gireeshpunathil avatar Oct 05 '18 08:10 gireeshpunathil

node executable probably enjoys the most compact binary for a language runtime of all time - no linkage dependency other than the c|c++(rt)

I'm clearly not understanding the downside. How is a 4M executable better than a 9k executable plus a 4M libnode?

Alternatives:

  • a 4M binary plus a 4M libnode.
  • Two separate release bundles (and sets of CIs), one with a standalone binary, and one with a libnode.

rubys avatar Oct 05 '18 08:10 rubys

Downsides are mostly unforeseen consumability issues for the end user: for example, the user needing to explicitly set LD_LIBRARY_PATH, LIBPATH or PATH. There could be other platform-specific disparities in symbol resolution (precedence between the launcher and the library), issues stemming from other node processes sharing the library, etc.

gireeshpunathil avatar Oct 05 '18 09:10 gireeshpunathil

@gireeshpunathil others seem to manage without these problems; but in any case, what alternative would you suggest?

rubys avatar Oct 05 '18 10:10 rubys

I don't know. In most of my interactions with embedded users in nodejs/help repo, I see they build from source - not because they don't have a libnode, but because each one of them wanted to embed node at different levels of abstractions - 2 node::Init, 3 node::Start, create re-use env, re-enter env , multi-isolate spawning etc. necessitate them to build from source.

Once we have normalized these into one or two or three discrete entry points, we could expose (only) those that lead to improved consumption of libnode; and that should help us make an easy decision. One obvious route is to release regular (exe) and libnode separately, against a specified version.

gireeshpunathil avatar Oct 05 '18 10:10 gireeshpunathil

One obvious route is to release regular (exe) and libnode separately, against a specified version.

I agree, that is probably the best way forward. Dynamic linking can be pretty painful when copying executable files around (which even our own test suite does on a regular basis).

addaleax avatar Oct 05 '18 15:10 addaleax

I've created a demo of how this could work.

rubys avatar Oct 11 '18 18:10 rubys

This discussion is related to https://github.com/nodejs/Release/issues/341 as well. If we had a Development kit and a Deployment kit (or equivalent) then we could add a shared library in addition to the existing exe without concern over the additional size.

mhdawson avatar Oct 16 '18 22:10 mhdawson

I think that the shared library stuff is worth an issue of its own, imo! sadly some questions i had about what the n-api could look like got buried by this talk. (I can edit in a link to the top level issue if one exists)

edit: link to the issue: https://github.com/nodejs/node/issues/24028

empyrical avatar Nov 01 '18 22:11 empyrical

I'm clearly not understanding the downside. How is a 4M executable better than a 9k executable plus a 4M libnode?

I think people value the "single file" node binary approach. Being able to move the node executable around by itself has benefit (IMO), and for a lot of use-cases disk space is cheap, so that's less of an issue.

Two separate release bundles (and sets of CIs), one with a standalone binary, and one with a libnode.

Sounds like a win-win to me (hopefully not too much extra pain for CI / build).

gibfahn avatar Nov 02 '18 14:11 gibfahn

Going to close this for now and remake this issue when I've got time to try and implement the N-API stuff.

I made a new discussion for the shared library stuff here for those interested: #24028

empyrical avatar Nov 02 '18 14:11 empyrical

ping @nodejs/n-api, would you consider adding this to your backlog?

refack avatar Nov 02 '18 20:11 refack

@empyrical I'm sorry I didn't voice my objection to some of the language and tone of comments that were made in this thread.

Personally I'd welcome it if you reopen this issue, since you raise valid points and make a well reasoned argument, that should be kept in consideration.

refack avatar Nov 03 '18 22:11 refack

I was not bothered by anything anyone said, I closed it because the conversation went off topic to a different (but very important!) subject. My plans are to make a new issue for this when I have time to try and implement, and put a link to the new issue I created for working out how the shared library should be distributed so the comments are just about NAPI.

empyrical avatar Nov 03 '18 23:11 empyrical

I've been wondering... do we need to attach the outline of a more embedder-friendly API to the environment? Essentially there are multiple levels of abstraction here:

  1. A node that encapsulates the operations of v8 initialization, spinning off the libuv event loop, and doing a bunch of tracing etc. The embedder doesn't need to customize the engine/isolate and the event loop (like the third_party_main.js use cases, I think that's closer to what @rubys's demo tries to address?)
  2. A smaller node that leaves the JS engine/isolate and libuv event loop customizable (that may disable a few tracing/diagnostics features that are VM-dependent), but needs to include the Node.js native module loaders - that's probably electron, node-chakracore and IoT.js's use case, and also probably what our worker implementation wants
  3. An even smaller node that only encapsulates the C++ bindings and JS source (node_javascript.cc) and leaves everything else customizable to the embedder - so it also needs to exclude a lot of environment-dependent logic; that's what our potential mksnapshot and mkcodecache would need

Maybe it would help if we start out refactoring the bootstrap process to reflect the different levels of abstractions, while adding cctests for them along the way (at that point these are all internals so we are free to change them), and then creating APIs on top of that would presumably be easier. Starting out with a specific set of APIs in mind and then implementing them in a top-down manner would be harder to get done given how entangled the current bootstrapping process is and how different the use cases are for different embedders, IMO.

joyeecheung avatar Nov 04 '18 09:11 joyeecheung

I'm reopening this issue as i plan to resume work on it (the API, not the shared library) when i return from nodeconf.eu.

rubys avatar Nov 04 '18 09:11 rubys

Sounds good! 👍 Since it seems you are working on the non-NAPI apis to be used as the basis for the NAPI apis, could you look into providing your own libuv loop for Node to use? Last time I tried that it wasn't so easy, and the code I made doesn't even compile against today's Node

empyrical avatar Nov 04 '18 17:11 empyrical

@empyrical do you have a link to code that used to work, or a test case, or even a sketch of what you would like to accomplish? I plan to include test cases with my implementation so that the functions advertised to work continue to work.

rubys avatar Nov 04 '18 20:11 rubys

I first did this way back when this was still a WIP: https://github.com/nodejs/node/pull/6994

I basically pieced it together by taking the official V8 embedding example on the V8 website, and looking at bits of node::Start to see if I could put in my own event loop that I control.

Looking at it again, I just used uv_default_loop which is exactly what node uses for its main environment anyways. So all I'd want to see is some kind of ability to manually run the events in node's event loop, instead of blocking until node is done like node::Start() normally does.

Here are some key parts:

// Create the loop
uv_loop_t *loop = uv_default_loop();

// Inside of an Isolate::Scope, I called these functions:
node::IsolateData *isolateData = node::CreateIsolateData(isolate, loop);
node::Environment *environment = node::CreateEnvironment(isolateData, context, testArgc, testArgv, 0, NULL);

// Made a QTimer on an interval to call the events in the event loop
auto timer = new QTimer();
QObject::connect(timer, &QTimer::timeout, [loop] {  // capture the loop pointer for use inside the lambda
    uv_run(loop, UV_RUN_NOWAIT);
});
timer->setInterval(10);
timer->start();

There is probably a less hacky way to get it into Qt's event loop, but this worked for keeping a basic node http server going inside of Qt's event loop.

The ability to provide an arbitrary loop might be useful, however, for making a node environment in another thread.

empyrical avatar Nov 04 '18 21:11 empyrical