node icon indicating copy to clipboard operation
node copied to clipboard

Extend Node-API to `libnode`

Open mmomtchev opened this issue 1 year ago β€’ 12 comments

This PR adds support for instantiation and further interaction with an embedded version of Node.js (libnode) entirely through Node-API.

Refs: #43516

mmomtchev avatar Jun 22 '22 22:06 mmomtchev

Review requested:

  • [ ] @nodejs/gyp
  • [ ] @nodejs/node-api

nodejs-github-bot avatar Jun 22 '22 22:06 nodejs-github-bot

We discussed this PR in the 24 June 2022 Node API meeting. Our main concerns are:

  1. As @addaleax mentioned, people who are embedding Node into their application may require specific, fine-grained startup and configuration of the engine. We fear that creating any Node-API wrapper around engine initialization would bypass the ability to modify the engine startup properties as necessary. Furthermore, if there were engine-specific configurations, these concepts may not extend to other engines.
  2. This PR creates V8-specific objects like the platform. Other engines, eg. Hermes, does not have these concepts and it may be incorrect to label these types as such. Additions to Node-API must be platform-agnostic. We have some baseline requirements in Contributing a new API to Node-API.

As of now, Node does not provide any APIs that are "Node-engine specific", because our Node-API is meant to be engine-agnostic. The existing Node-specific APIs, eg. node::InitializeNodeWithArgs are available within the C++ language but not in an ABI-stable C language. It may be beneficial to create some APIs that are outside of the Node-API scope, but are still C-based and provide ABI stability in order to perform this embedding scenario, where we can ignore the platform-agnostic requirement, ie. inside a different header file.

KevinEady avatar Jun 24 '22 15:06 KevinEady

@KevinEady

  1. I agree that this API would not cover all use cases - in particular Electron has very specific requirements - but if you look at these exchanges, you will see that even their team was interested in some form of ABI or at least ABI stability. Still, such an API will be sufficient for a very significant portion of the uses cases. What needs discussing is:
  • How can greater customisability of the engine startup be supported and where should the line be drawn
  • Should we support mixed use with conversions between v8::Local and napi
  1. I don't think that napi_platform is V8-specific - every engine would have a common main initialisation procedure and will support a single or many independent environments.

mmomtchev avatar Jun 24 '22 15:06 mmomtchev

We discussed in the 1 July 2022 Node API meeting, and we did want to bring up again the comment we had before:

It may be beneficial to create some APIs that are outside of the Node-API scope, but are still C-based and provide ABI stability in order to perform this embedding scenario, where we can ignore the platform-agnostic requirement, ie. inside a different header file.

There is a logical separation between "module extensions API" and "node hosting API" so it may be beneficial to separate this embedding API into a separate header.

Regarding:

Should we support mixed use with conversions between v8::Local and napi

Can you clarify this a bit?

KevinEady avatar Jul 01 '22 15:07 KevinEady

Should we support mixed use with conversions between v8::Local and napi

Can you clarify this a bit?

Public API for retrieving the V8 reference from a napi* handle and a constructor for creating a napi* handle from a V8 reference. It goes against the Node-API logic, but it would allow extending Node-API without modifying it.

mmomtchev avatar Jul 01 '22 18:07 mmomtchev

@KevinEady @addaleax @rvagg My goal is to be able to support installation of this (libnode + this API) along a NodeSource Node.js package

Basically, I have two choices:

  • Either libnode & libnode-dev are completely independent of nodejs and install headers to /usr/include/libnode
  • Either I install this single new header file /usr/include/node - and then I require that nodejs is present and it is a specific version

Whatever I do at this point will likely stay. Do you have any opinion?

mmomtchev avatar Jul 05 '22 14:07 mmomtchev

@mmomtchev Two things:

  • I would not split header files from the existing header bundle, since the default for embedders will still be using the regular C++ APIs that are also used by addons
  • I would consider shipping the shared library variant of Node.js completely separate and independent from any embedding API additions or changes

addaleax avatar Jul 05 '22 15:07 addaleax

Ok, this is what I currently have - libnode93 and libnode108 which can be installed independently of nodejs - all three can be installed at the same time - and libnode-dev which is mutually exclusive with nodejs - because both install /usr/include/node

If/when this PR goes through, libnode-dev won't be necessary anymore

mmomtchev avatar Jul 05 '22 15:07 mmomtchev

@addaleax And do you have an idea for an elegant solution for having top-level functions appear in the global object the way they do in the REPL, besides vm.runInThisContext()? Ideally I would like to hide this from the end user.

mmomtchev avatar Jul 05 '22 17:07 mmomtchev

@mmomtchev If somebody wants that, they can just do the same thing the REPL does to make that happen, right?

addaleax avatar Jul 06 '22 09:07 addaleax

@addaleax I have a problem with restarting the event loop since NodePlatform::DrainTasks :

https://github.com/nodejs/node/blob/3738b57102b5fad003e9e3ec7c936136980556bc/src/node_platform.cc#L440

can be run only once per isolate. If I don't drain the tasks, some pending Promises (the dynamic import for example) do not get resolved. If I drain them, they do not get rescheduled. This is valid even when using the existing embedders' API. Do you have any opinion about it?

UPDATE: It is because node::PerIsolatePlatformData::FlushTasks is not called when using the embedders' API UPDATE2: I found my problem (a resolved Promise does not keep the event loop from exiting) - but from having spent so much time in this part of the code, I think that all that FlushTasks/DrainTasks mess is completely useless. It is there for only one reason - because the flush_tasks_ async handle is unrefed to not prevent the event loop from exiting. This mechanism is called only a handful of times and if it was calling async_init / async_close every time, all those special edge cases wouldn't have been needed.

mmomtchev avatar Aug 07 '22 13:08 mmomtchev

I was experimenting with using Node.js to support WASM containers. For that I needed to use libnode with crun. Unfortunately crun is current a C project and integrating the C++ API from libnode was going to be difficult (if crun would even consider moving to C++ versus C).

The model is to dlsym the required methods as crun support multiple engines and only wants to load the shared libraries needed.

With this PR is was relatively straight forward to get it working by using the new methods to start/get an env and then existing node-api methods to create and start a script to run.

At this for the simple run a script case it seemed to be enough. was experimenting with using Node.js to support WASM containers. For that I needed to use libnode with crun. Unfortunately crun is current a C project and integrating the C++ API from libnode was going to be difficult (if crun would even consider moving to C++ versus C).

The model is to dlsym the required methods as crun support multiple engines and only wants to load the shared libraries needed.

At this for the simple run a script case it seemed to be enough.

You can see what it used here: https://github.com/mhdawson/crun/blob/node-wasm-experiment/src/libcrun/handlers/wasmtime.c

mhdawson avatar Sep 20 '22 19:09 mhdawson

With respect to:

I would strongly encourage you to go through the current embedder API and understand how embedders use it. I would love to see a C embedding API, but since the goal of Node-API-style APIs is to provide long-term stability, it’s going to need to be an API that is designed to account for multiple usage patterns (e.g.: integration with libuv, usage with a node-addon-api-style C++ wrapper) and probably some kind of API versioning in advance (since Node.js and V8 embedding changes their API definitions much more frequently than JS changes as a language).

I think having either an Experimental version of the C embedding API as part of Node-API, or possibly a Experimental version of the C embedding API which uses Node-API but which is not part of it to start would be a good way to get feedback so that we can work through the issues mentioned. @addaleax would having the C API outside of the Node-API to start help to address some of your concerns about the potential faster pace of change? In that case the path would be

  1. C-APIs, using Node-API for types etc, but separate headers and naming to make it clear the APIs are not covered by the ABI stability guarrantee to start.
  2. Try to raise awareness of the APIs (through social, etc.) to get as many people to try them out, provide feedback.
  3. At some later point we could move those APIs into Node-API once we are comfortable we have enough experience to believe they are unlikely to need to change significantly/often.

This would let us see the rate/pace of change of the C APIs over time through GitHub history as part of figuring out if/when we are ready to move to ABI stable. The same would be true if we introduced them as experimental but the initial separation might make it more obvious/clearer.

mhdawson avatar May 26 '23 16:05 mhdawson

Alas, lately, the main theme of all my social media accounts is the fact that I barely have enough money to eat and I have been served an eviction notice - because I have been unemployed since 2020. This happens after my previous employers tried to convince me that I was suffering from schizophrenia because I was hallucinating that people around me were posting dicks after a false rape accusation that was passed through the French criminal justice system. After I tried suing my employers, my lawyer kept telling me that he was waiting for the court to schedule a hearing, but in fact he had rendered all proceedings void. Since this day, I am having psychotic episodes at each interview - I usually have the feeling that the companies that try to hire me post the same images as my previous employers.

Given this rather particular situation of mine, I am afraid that my social media accounts wouldn't be the ideal location for this new API to gain any traction. However the original issue has a number of thumbs up, and it would probably be a much better place.

mmomtchev avatar May 27 '23 19:05 mmomtchev

@mmomtchev, may I help you to complete this PR? We had discussed this PR in the Node-API meetings and we believe that this work can be quite beneficial in a number of Node.js embedding scenarios. I can submit changes directly to your PR (the preferred way) or open a new PR (less preferred option). Let me know which way you would prefer to go. In any case, you are the author of this PR, and my part will be just adding a few finishing touches to get it merged.

vmoroz avatar Jul 28 '23 13:07 vmoroz

@vmoroz You are welcome to push any modifications

I catched up with main and removed the deps/uv patch now that libuv has been updated.

There are a couple of issues that needs discussing such as the path. The embedded environment is neither an ES6 nor a CJS environment and does not really have a main script. I use Node.js' internal cross-platform exec path detection so that all require and import are relative to the binary of the main program that is linked to libnode. This variable gets passed as an argument in the bootstrap where it becomes a global symbol and I see that it clashes with mksnapshot. Both decisions - looking for modules relative to the binary - and passing this path as a new global symbol - need to be discussed.

I have also added the option for specifying an API level from the C/C++ API. The dynamic API level and the new realms are the two significant new changes that have the most repercussions for this PR.

mmomtchev avatar Jul 28 '23 18:07 mmomtchev

From discussion in the Node-api team meeting today, without looking to closely at the details it seemed reasonable that the APIs should take the path for which require and import are relative to as a parameter.

If the embedder wants to make that the binary Node.js is embedded into they can, but it also provides the option to do something else if that makes sense.

In terms of a news symbol we also thought that pretending that there was a main script by setting existing values such that it looks like a main script was run from the path provided to the API by the embedder and a pre-defined name for the main script might be an option. Does that make any sense?

mhdawson avatar Aug 04 '23 15:08 mhdawson

@mhdawson I have removed the custom path argument to the bootstrapper script that was causing a symbol collision. Normally, the embedded environment uses as much as possible from the normal import and require chains which means that it can be configured by using the NODE_PATH variable and should follow the usual directory relative to the package.json convention.

mmomtchev avatar Aug 04 '23 21:08 mmomtchev

@joyeecheung I can add another exec_path to Environment - lets call it modules_root that will implement configuring the root from NAPI and carry it in the process object. I can also simply modify the exec_path in this particular case, but it is not a very clean solution as it reuses an existing parameter with a new meaning.

I cannot derive from Environment without making a huge mess as it has its own factory. I will have to implement it in Environment, but use it only for when there is an embedded environment.

From what I understand, the decision to not support NODE_PATH for ES6 modules was a deliberate one. There is no official way of setting the root of ES6 modules?

mmomtchev avatar Aug 05 '23 09:08 mmomtchev

From what I understand, the decision to not support NODE_PATH for ES6 modules was a deliberate one. There is no official way of setting the root of ES6 modules?

I think that's part of the loader hooks https://nodejs.org/api/esm.html#resolvespecifier-context-nextresolve which is also experimental. Whatever introduced to the embedder API should resonate with whatever available to the userland loader, IMO.

joyeecheung avatar Aug 07 '23 12:08 joyeecheung

@vmoroz @gabrielschulhof @addaleax I see that now there is a node::InitializeOncePerProcess that does not return an array of errors - something that didn't really work before anyways - and takes only a single list of CLI arguments instead of the argv/exec_argv separation. This means that napi_create_platform_version must change accordingly.

mmomtchev avatar Sep 04 '23 15:09 mmomtchev

I adapted this to the latest changes in the embedding initialization which impacted my API as well - napi_create_platform and napi_create_environment . I opened #49512 and #49513 for the two unit test failures which are not related to this PR.

mmomtchev avatar Sep 06 '23 12:09 mmomtchev

May one ask what the holdup is on this PR? Has OP lost interest? I'm part of a project that relies on this API and we're not too keen on using an unofficial fork and having to build Nodejs manually for all the platforms we support.

samardzicneo avatar Feb 29 '24 08:02 samardzicneo

@samardzicneo do you have cycles to help get it across the line? I think @vmoroz also showed interest in moving this forward. @mmomtchev are you still going to have cycles or would it be good to have either @samardzicneo, @vmoroz or both try to move it forward?

mhdawson avatar Feb 29 '24 20:02 mhdawson

@samardzicneo do you have cycles to help get it across the line? I think @vmoroz also showed interest in moving this forward. @mmomtchev are you still going to have cycles or would it be good to have either @samardzicneo, @vmoroz or both try to move it forward?

Really, I'd love to do so, especially since I need this for a project of mine, but honestly, I don't trust my C/C++ skills enough to be sending PRs to Node πŸ˜…

Believe me, you don't want my messy code anywhere near your runtime.

samardzicneo avatar Feb 29 '24 21:02 samardzicneo

@mmomtchev , I have moved the embedded code out of js_native_api_v8.cc to the new file node_api_embedding.cc. The main reason is that we support a scenario where the js_native_api_v8.cc can be used outside of Node.js code on its own. (In node_api_embedding.cc line 81 is changed to have an explicit static_cast to suppress MSVC warning for int64_t to int32_t conversion). Please review the change.

vmoroz avatar Mar 08 '24 15:03 vmoroz

@mmomtchev , the code is rebased on the latest main code.

vmoroz avatar Mar 08 '24 23:03 vmoroz