node-api: use v-table to reverse module dependencies
Disclaimer
Please consider this PR as a proof of concept and a discussion starter. We discussed this design a bit in the latest Node-API meeting with @legendecas and @KevinEady, and this PR provides the specific implementation details to visualize the new ideas.
The issue
A recent investigation of https://github.com/nodejs/abi-stable-node/issues/471 showed that the way Node-API modules currently depend on the API exposed by Node.js has a number of limitations that are not simple to overcome, especially if the modules are distributed as pre-built libraries. To be specific, we have the following picture:
- On Windows the `.node` modules are compiled with a delay-load dependency on the `node.exe` process. Each module is compiled with the `win_delay_load_hook.cc` source file, where we define a global `__pfnDliNotifyHook2` that allows binding to Node-API functions from the current process if its name is different from `node.exe`, or from `libnode.dll`. If a module was compiled where such a hook is not defined or cannot be defined, then the module cannot bind to Node-API functions for non-`node.exe` runtimes.
- On Linux & macOS the `.node` module uses weak function binding, which allows it to bind to any function with the same name present in the process. This does not work on Android, where the modules are required to have a strong binding.
- The current system does not allow multiple JS runtimes that may load Node-API modules to be used in the same process. For example, if `Word.exe` uses `libnode.dll` and React Native for Windows with the Hermes JS VM, then it is not possible to specify the target runtime for a module.
- When we implement Node-API bindings in other languages such as C# or Java, it is quite expensive to resolve all the ~150 Node-API functions by name.
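To make the last bullet concrete, here is a minimal C sketch that contrasts looking functions up by name with receiving them through one table pointer. Every `demo_*` name and the hand-written export table are hypothetical. A real binding would go through the platform loader, and the functions here are toy stand-ins rather than Node-API signatures.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for API functions; NOT real Node-API signatures. */
static int demo_create_int(int v) { return v; }
static int demo_add(int a, int b) { return a + b; }

typedef void (*demo_generic_fn)(void);

/* Hypothetical export table that a host could expose for by-name lookup. */
struct demo_export {
  const char* name;
  demo_generic_fn fn;
};
static const struct demo_export demo_exports[] = {
    {"demo_create_int", (demo_generic_fn)demo_create_int},
    {"demo_add", (demo_generic_fn)demo_add},
};

static int demo_lookup_count = 0;

/* By-name resolution: one string search for every function a binding needs. */
static demo_generic_fn demo_resolve(const char* name) {
  ++demo_lookup_count;
  for (size_t i = 0; i < sizeof(demo_exports) / sizeof(demo_exports[0]); ++i) {
    if (strcmp(demo_exports[i].name, name) == 0) return demo_exports[i].fn;
  }
  return NULL;
}

/* V-table: the same functions, reachable through a single pointer. */
struct demo_vtable {
  int (*create_int)(int);
  int (*add)(int, int);
};
static const struct demo_vtable demo_host_vtable = {demo_create_int, demo_add};

int demo_call_by_name(void) {
  int (*add)(int, int) = (int (*)(int, int))demo_resolve("demo_add");
  return add ? add(2, 3) : -1;
}

int demo_call_by_vtable(const struct demo_vtable* vt) { return vt->add(2, 3); }

const struct demo_vtable* demo_get_host_vtable(void) { return &demo_host_vtable; }

int demo_lookups(void) { return demo_lookup_count; }
```

With the table, a binding in another language only has to marshal one pointer and then index into it, instead of repeating the lookup for each function.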
The proposed solution
The proposal is to reverse the dependency. Instead of the .node module depending on Node-API, the runtime becomes responsible for loading the .node module and injecting its API into it. This way neither the name of the process nor the name of the embedded runtime DLL matters. The same pre-built module can be loaded from different JS runtimes such as hermes.dll or react-native.dll. (Though, the module can be used only by one runtime at this point.)
This PR shows how such injection can be implemented:
- `js_native_api_types.h` and `node_api_types.h` define the `node_api_js_native_vtable` and `node_api_module_vtable` structs with entries for all Node-API functions. They are sorted so that all new function pointers are added at the end of the struct. The main requirement is that we must never change the position of the entries, except for the experimental functions.
- We have to have two different v-tables because `js_native_api.h` and `node_api.h` are two sets of functions that can be used independently in different scenarios.
- `js_native_api_v8.cc` and `node_api.cc` initialize the struct instances with all Node-API function pointers.
- `node_api.h` changes the `NAPI_MODULE_INIT` macro so that each `.node` module defines the global `const node_api_module_vtable* g_node_api_module_vtable` and `const node_api_js_native_vtable* g_node_api_js_native_vtable` variables and exports the new `node_api_module_set_vtable_v1` function, which is called from `node_binding.cc` to inject the Node-API v-tables.
- `js_native_api.h` and `node_api.h` also contain the implementation of all Node-API functions as `static inline` functions that use the global variables. These functions become part of the `.node` module after compilation. They do not exist when we compile Node.js code. The new macro `NODE_API_MODULE_USE_VTABLE` controls whether the header files define Node-API function prototypes or the new `static inline` functions. It also controls whether the `node_api_module_set_vtable_v1` function and the global v-table variables are defined.
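The moving parts listed above can be sketched in a single C file. This is a simplified illustration, not the code from this PR: the `demo_*` names, the two-entry table, and the setter signature are all assumptions chosen to mirror the roles of `node_api_module_vtable`, `g_node_api_module_vtable`, and `node_api_module_set_vtable_v1`.

```c
#include <stddef.h>

/* Append-only table: new entries go at the end, existing slots never move. */
typedef struct demo_module_vtable {
  int (*get_version)(void);
  int (*add)(int a, int b);
} demo_module_vtable;

/* ---- Module side: roughly what the headers would emit in v-table mode ---- */

/* Global pointer that the host fills in (hypothetical name). */
const demo_module_vtable* g_demo_module_vtable = NULL;

/* Exported entry point; the host calls it right after loading the module. */
void demo_module_set_vtable_v1(const demo_module_vtable* vtable) {
  g_demo_module_vtable = vtable;
}

/* Wrappers keep the familiar call syntax and forward through the table. */
static inline int demo_get_version(void) {
  return g_demo_module_vtable->get_version();
}
static inline int demo_add(int a, int b) {
  return g_demo_module_vtable->add(a, b);
}

/* Module code stays as it was: it just calls the API by name. */
int demo_module_init(void) { return demo_add(demo_get_version(), 1); }

/* ---- Host side: owns the implementations and injects them ---- */

static int host_get_version(void) { return 10; }
static int host_add(int a, int b) { return a + b; }
static const demo_module_vtable host_vtable = {host_get_version, host_add};

int demo_host_load_module(void) {
  demo_module_set_vtable_v1(&host_vtable); /* inject first */
  return demo_module_init();               /* then initialize */
}
```

In this sketch the module has no link-time reference to any host symbol. The only thing crossing the boundary is the table pointer passed to the setter.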
After this change all test modules compile and run without changes.
To test the v-table approach, NODE_API_MODULE_USE_VTABLE is added to two tests: node-api\1_hello_world and js-native-api\7_factory_wrap. When we look at the imports of the modules that use the v-table approach, we do not see any imports from node.exe. This means that we can stop using win_delay_load_hook.cc and load the pre-built .node module from any runtime that supports Node-API.
The idea to use a v-table for the API is not new. E.g. Java JNI API is also based on a v-table.
Review requested:
- [ ] @nodejs/gyp
- [ ] @nodejs/node-api
From the PR description this seems super valuable and a great built-in alternative to the weak-node-api library we've been working on to add Node-API support to React Native. As such, I'd be happy for us to adopt this approach over the stuff we have now 👍
The limitation of a module being bound to a single runtime might not be an issue, as a multi-runtime host can inject functions that deal with that internally.
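One way to picture such a host is a single injected table whose entries route each call by the environment argument. The sketch below is only an illustration under that assumption: `demo_env`, `demo_engine_ops`, and the two "engines" are hypothetical and do not correspond to real Node-API types.

```c
/* Per-engine implementation table (hypothetical). */
typedef struct demo_engine_ops {
  int (*create_int)(int value);
} demo_engine_ops;

/* Hypothetical env: each one remembers which engine owns it. */
typedef struct demo_env {
  const demo_engine_ops* ops;
} demo_env;

/* The table the module sees; every entry takes the env first. */
typedef struct demo_vtable {
  int (*create_int)(demo_env* env, int value);
} demo_vtable;

/* Two toy engines; results are offset so callers can tell them apart. */
static int engine_a_create_int(int v) { return v + 1000; }
static int engine_b_create_int(int v) { return v + 2000; }

static const demo_engine_ops engine_a = {engine_a_create_int};
static const demo_engine_ops engine_b = {engine_b_create_int};

/* Injected once into the module; routes every call using the env. */
static int dispatch_create_int(demo_env* env, int value) {
  return env->ops->create_int(value);
}

static const demo_vtable multi_runtime_vtable = {dispatch_create_int};

/* Simulates a module call made with an env owned by engine 0 or engine 1. */
int demo_call_with_engine(int which, int value) {
  demo_env env = {which == 0 ? &engine_a : &engine_b};
  return multi_runtime_vtable.create_int(&env, value);
}
```

The module still holds one table, but the engine is chosen per call, so the single-runtime binding becomes a host implementation detail.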
I'm left wondering how (if at all) add-ons which are statically linked into the process are affected by this proposal? I guess not at all, as they can still register themselves and call into the global Node-API functions resolved at link time.
nice! node-api symbol visibility has been a pain for us in deno so we'd love to adopt this approach as well.
There might be a potential coordination problem, where the addon tries to initialize itself before the vtable gets injected (set) by the host. I don't see that as an issue when the addon relies on "symbol based" module registration, since the host can ensure to call the initialize function after setting the vtable, but how about an addon trying to call napi_module_register when loaded?
> but how about an addon trying to call napi_module_register when loaded?
these should probably be exclusive modes, so napi_module_register wouldn't be called, or could only be called from within the host call to the inject function.
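A small C sketch of the ordering concern: a guarded wrapper that fails cleanly when it is called before injection, and a host load sequence that injects first and initializes second. The status codes, `demo_*` names, and the guard itself are assumptions made for illustration. The PR text does not say whether the generated wrappers perform such a check.

```c
#include <stddef.h>

typedef enum { DEMO_OK = 0, DEMO_NO_VTABLE = 1 } demo_status;

typedef struct demo_vtable {
  demo_status (*register_thing)(int id);
} demo_vtable;

static const demo_vtable* g_demo_vtable = NULL;
static int g_last_registered = -1;

/* Called by the host to inject the table. */
void demo_set_vtable(const demo_vtable* vt) { g_demo_vtable = vt; }

/* Guarded wrapper: reports an error instead of dereferencing NULL. */
static inline demo_status demo_register_thing(int id) {
  if (g_demo_vtable == NULL) return DEMO_NO_VTABLE;
  return g_demo_vtable->register_thing(id);
}

/* Stand-in for module initialization code that calls into the API. */
demo_status demo_module_init(void) { return demo_register_thing(42); }

/* Host implementation behind the table. */
static demo_status host_register_thing(int id) {
  g_last_registered = id;
  return DEMO_OK;
}
static const demo_vtable host_vtable = {host_register_thing};

/* Host load sequence: inject first, initialize second. */
demo_status demo_host_load(void) {
  demo_set_vtable(&host_vtable);
  return demo_module_init();
}

int demo_last_registered(void) { return g_last_registered; }
```

Calling `demo_module_init()` before `demo_host_load()` models an addon that registers itself at load time. In this sketch it gets `DEMO_NO_VTABLE` rather than a crash.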
> There might be a potential coordination problem, where the addon tries to initialize itself before the vtable gets injected (set) by the host. I don't see that as an issue when the addon relies on "symbol based" module registration, since the host can ensure to call the initialize function after setting the vtable, but how about an addon trying to call napi_module_register when loaded?
The napi_module_register usage is deprecated. No new code is supposed to use it anymore.
We used to have a deprecated attribute, but we found that some developers use it to register modules explicitly, e.g. when the modules are part of the host executable.
I do not have a good answer to that, besides that the existing public API is still there and users can use it as before.
I am going to add a conditional flag and restore the deleted test as @legendecas suggested. This test uses the napi_module_register method and can serve as a showcase of how to use Node-API directly when needed.
> I'm left wondering how (if at all) add-ons which are statically linked into the process are affected by this proposal? I guess not at all, as they can still register themselves and call into the global Node-API functions resolved at link time.
Right, I would expect it too, but we should verify and test this scenario.
The issues that I am facing on Mac and Linux are due to the use of real "C" compilers, where the inline keyword has different semantics than in "C++". Changing it to static inline generates its own set of issues. Thus, I am still working on it.
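For readers less familiar with the difference, here is a short C illustration of the general language rule. It is not a reproduction of the specific failures seen in this PR, and `demo_twice` is a hypothetical function.

```c
/* Header-style content, shown in one file for brevity.
 *
 * In C99 and later, a plain
 *     inline int demo_twice(int x) { return 2 * x; }
 * is only an "inline definition": it does not provide an external symbol.
 * If the compiler chooses not to inline a call, the link can fail unless one
 * .c file also declares `extern inline int demo_twice(int x);`.
 * In C++ the same line is fine, because the linker merges the copies. */

/* `static inline` gives each translation unit its own private copy, so it
 * behaves the same way in C and C++. */
static inline int demo_twice(int x) { return 2 * x; }

int demo_use_twice(int x) { return demo_twice(x) + 1; }
```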
Nice!👍 In Lynx/PrimJS, we are currently using a similar vtable approach to address the needs of multi-runtime injection. However, our previous API was not fully aligned with the Node-API standard, which is a problem I have been working to fix recently. We’re thrilled to see this solution will potentially be natively integrated into Node.js, and we’re also more than happy to adopt this approach.
> Nice!👍 In Lynx/PrimJS, we are currently using a similar vtable approach to address the needs of multi-runtime injection. However, our previous API was not fully aligned with the Node-API standard, which is a problem I have been working to fix recently. We’re thrilled to see this solution will potentially be natively integrated into Node.js, and we’re also more than happy to adopt this approach.
It is great to hear it! Any suggestions to improve code in this PR to fit your scenario are welcome.
Codecov Report
:x: Patch coverage is 84.61538% with 4 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 88.53%. Comparing base (4ea921b) to head (713c148).
:warning: Report is 3 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/node_binding.cc | 81.81% | 2 Missing and 2 partials :warning: |
Additional details and impacted files
| Coverage Diff | main | #60916 | +/- |
|---|---|---|---|
| Coverage | 88.52% | 88.53% | +0.01% |
| Files | 703 | 703 | |
| Lines | 208396 | 208406 | +10 |
| Branches | 40185 | 40191 | +6 |
| Hits | 184483 | 184521 | +38 |
| Misses | 15918 | 15896 | -22 |
| Partials | 7995 | 7989 | -6 |
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/js_native_api_v8.cc | 76.63% <100.00%> (+0.02%) | :arrow_up: |
| src/node_api.cc | 75.21% <100.00%> (+0.06%) | :arrow_up: |
| src/node_binding.cc | 83.52% <81.81%> (+0.77%) | :arrow_up: |
:+1: Just as a comment, I was working on something very similar, but I must admit that imho this is the most flexible approach, which opens new possibilities, like enabling validation layers and middlewares (the runtime can "replace" or "patch" the original Node-API function pointer with functions that perform extra validation or logging, etc. completely transparently to the "consumer") -- a little bit like optional Validation Layers in Vulkan API. Lastly, V-table approach also decreases the cost of dynamic symbol resolution, etc.
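The "validation layer" idea can be sketched in C as a patched copy of the table whose entries check arguments and then forward to the original. Everything here is hypothetical (`demo_vtable`, the `divide` entry, the rejection counter) and only shows the pattern.

```c
#include <stddef.h>

typedef struct demo_vtable {
  int (*divide)(int a, int b, int* result); /* returns 0 on success */
} demo_vtable;

/* The runtime's real implementation, which trusts its caller. */
static int real_divide(int a, int b, int* result) {
  *result = a / b;
  return 0;
}
static const demo_vtable real_vtable = {real_divide};

static const demo_vtable* g_wrapped = NULL; /* table being wrapped */
static int g_rejected_calls = 0;            /* stand-in for logging */

/* Layer entry: validate, then forward to the wrapped table. */
static int checked_divide(int a, int b, int* result) {
  if (result == NULL || b == 0) {
    ++g_rejected_calls;
    return 1;
  }
  return g_wrapped->divide(a, b, result);
}

static demo_vtable layered_vtable;

/* Builds a patched copy; the module only ever sees the returned pointer. */
const demo_vtable* demo_make_validation_layer(const demo_vtable* original) {
  g_wrapped = original;
  layered_vtable = *original;
  layered_vtable.divide = checked_divide;
  return &layered_vtable;
}

const demo_vtable* demo_real_vtable(void) { return &real_vtable; }

int demo_rejected_calls(void) { return g_rejected_calls; }
```

The host decides at injection time whether to hand out the real table or the layered one, so the module needs no rebuild to get the extra checks.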
From an ABI stability standpoint, adding new function pointers at the end of the struct/V-table is ABI-stable. There are also some runtimes that support multiple JS engines, and they would need to dispatch to the correct V-table (based on the node_env) -- I believe both PrimJS and OpenHarmony's Native Engine are doing this. This is also one reason why JS engines should not provide Node-API symbols; otherwise they will trigger symbol conflicts.
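The append-only claim can be checked with a small C sketch: the entries shared by an older and a newer layout sit at the same offsets, so a module built against the older layout can read a newer table. The two struct versions are hypothetical, and the cast models a separately compiled module on the other side of an ABI boundary.

```c
#include <stddef.h>

/* Layout an older module was compiled against. */
typedef struct demo_vtable_v1 {
  int (*first)(void);
  int (*second)(void);
} demo_vtable_v1;

/* Newer layout: one entry appended, nothing moved. */
typedef struct demo_vtable_v2 {
  int (*first)(void);
  int (*second)(void);
  int (*third)(void);
} demo_vtable_v2;

/* Shared entries keep their offsets, and the newer table is not smaller. */
int demo_layout_is_prefix_compatible(void) {
  return offsetof(demo_vtable_v1, first) == offsetof(demo_vtable_v2, first) &&
         offsetof(demo_vtable_v1, second) == offsetof(demo_vtable_v2, second) &&
         sizeof(demo_vtable_v2) >= sizeof(demo_vtable_v1);
}

static int impl_first(void) { return 1; }
static int impl_second(void) { return 2; }
static int impl_third(void) { return 3; }

/* The host was built with the newer layout. */
static const demo_vtable_v2 host_table = {impl_first, impl_second, impl_third};

const void* demo_host_table(void) { return &host_table; }

/* An older module only knows the v1 layout and reads through it. */
int demo_old_module_calls_second(const void* injected) {
  const demo_vtable_v1* vt = (const demo_vtable_v1*)injected;
  return vt->second();
}
```

Inserting or reordering entries in the middle would break the offset equality above, which is why the position of existing entries has to stay fixed.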
What might be good to consider is adding a function that returns a V-table for a given version. This way, if a Node Addon is using say N-API version 8, and the runtime supports, say, N-API version 10, such node_api_get_vtable(NAPI_VERSION_8, flags) might return a properly initialized V-Table with newer functions replaced with an assertion or something. Just a random idea inspired by an old blog post...
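A rough C sketch of that idea, assuming a hypothetical `demo_get_vtable(version)` and a two-entry table. Entries newer than the requested version point at a stub. The stub returns an error code here instead of asserting so that the behavior is easy to observe, and the `flags` argument from the suggestion is left out.

```c
#include <stddef.h>

enum { DEMO_OK = 0, DEMO_UNSUPPORTED = 1 };

typedef struct demo_vtable {
  int (*since_v8)(void);  /* available from version 8 */
  int (*since_v10)(void); /* added in version 10 */
} demo_vtable;

static int impl_since_v8(void) { return DEMO_OK; }
static int impl_since_v10(void) { return DEMO_OK; }

/* Placeholder for entries the requested version does not include. */
static int unsupported_stub(void) { return DEMO_UNSUPPORTED; }

static const demo_vtable table_for_v8 = {impl_since_v8, unsupported_stub};
static const demo_vtable table_for_v10 = {impl_since_v8, impl_since_v10};

/* Hypothetical query: this toy runtime supports versions 8 through 10. */
const demo_vtable* demo_get_vtable(int requested_version) {
  if (requested_version < 8 || requested_version > 10) return NULL;
  if (requested_version < 10) return &table_for_v8;
  return &table_for_v10;
}
```

An addon built for version 8 would then call `demo_get_vtable(8)` and could never reach an implementation added after the version it declared.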
edit: If we can bump NAPI_MODULE_VERSION then maybe it's the best time to introduce napi_register_module_v2() which takes a 3rd argument: a versioned struct/v-table with function pointers to query the host/runtime (e.g. ask for V-Tables, require other addons, etc.). WDYT?