node
node copied to clipboard
node-api: run finalizers directly from GC
The issue
Currently Reference
finalizers are run inside of SetImmediate
.
In case if user code creates a lot of native objects in the main script, it could cause a significant memory pressure, even if the objects are properly released. This is because they are "collected" only inside of SetImmediate
that follows the script run.
See the issue: https://github.com/nodejs/node-addon-api/issues/1140
In the https://github.com/nodejs/node/commit/a74a6e3ba131752225a527d915593d7e413b1594 commit the processing of finalizers was moved from the GC second pass to the SetImmediate
because:
- finalizers may run some arbitrary JS code while some other JS code being executed.
- finalizers may throw JavaScript exceptions which can affect behavior of other functions.
The solution
In this PR we are introducing new experimental behavior where the finalizers are run directly from the GC but they are not allowed to run JS code and must only call native code. Since the finalizers cannot affect the running JS code in any way, they are safe to run at any point.
- When the finalizers run from GC, we disable running JS code and any Node-API that may run JS code.
- Any unhandled Node-API error or JS exception will cause the process abort.
If a finalizer must run JS code, then it can do it by calling the new node_api_post_finalizer
method which schedules the finalizer run in SetImmediate
as it was before. The main difference is that previously we always scheduled finalizer runs in SetImmediate
implicitly, and now code must do it explicitly.
TODO:
- Discuss the approach with Node-API team
- Add new unit tests
- Add documentation for the new functions and data structures.
Review requested:
- [ ] @nodejs/node-api
Hi @nodejs/node-api ,
Can team members take a look at this PR and provide feedback to @vmoroz so he can understand if this approach works well before he continues going through unit tests and other implementation details? Thanks.
As we discussed today in Node-API meeting, one of the changes in this PR is the grouping of parameters inside of a new node_api_native_data
struct. This struct should allow us to evolve the set of parameters that can be associate with native data. I had also mentioned that we must be able to create a C++ template function that converts C++ lambda to that struct. Such conversion would be difficult to do if the native data and finalizers are passed as function parameters. I am going to add a unit test to demo it.
We discussed this in the 20 May Node-API meeting. This PR should be rebased and revalidated after https://github.com/nodejs/node/pull/36510 has been merged since both PRs touch finalizer code.
As discussed in the last node-api meeting, it would be worth exploring the possibility of introducing behavior flags to the initialization of an addon to invoke the finalizers in a more eager manner and disallow JavaScript evaluations in the finalizers. In this way, users can still have one single type of finalizer and don't need to distinguish them. We also don't need to introduce a bunch of new apis just for different finalizers.
We discussed in the 7 Oct Node API meeting that this PR is dependent on the feature-flags implementation inside #42557. Instead of adding the new methods (in PRs first post), we would be able to modify existing API behavior for object finalization.
The latest commit changes the PR based on the latest discussions that we had in Node-API meetings:
- In the new experimental version all finalizers are invoked directly from GC without access to JS.
- New
node_api_post_finalizer
method is added to run finalizer code as a part ofSetImmediate
where it can access GC.
Changed status to DRAFT until I have the full set of tests and changes for the docs. Otherwise, the rest of the code is in a relative good shape abd very close to the final change.
@vmoroz IMO the asan and macos tests are unrelated. I think they may pass if you restart the jobs.
@vmoroz OK, maybe asan is legit. It's too stubborn 😄
@vmoroz OK, maybe asan is legit. It's too stubborn 😄
@gabrielschulhof , It sounds plausible. Let me dig into it.
I ran make test-js-native-api and test-node-api locally with asan and they passed.
I ran make test-js-native-api and test-node-api locally with asan and they passed.
Awesome! The ASAN was failing for non-Node-API tests. They seemed to pass in the latest code after I rebased the PR. I just need to fix the linter issues.
@gabrielschulhof , @legendecas , @mhdawson , the PR seems to pass all verifications. Could you have to look at the code?
Atch! I played around with this branch, and I don't think we can use napi_cannot_run_js
, because we cannot conflate environment shutdown with gc running. If we did, we couldn't delete references during environment shutdown. I still think we should do this non-fatally, introducing a new status node_api_gc_running
. In retrospect, we should have probably called napi_cannot_run_js
node_api_environment_shutdown
. That would've been clearer and more specific.
After discussing this in our group meeting, I understand that, since there is not much a user could do with a status code indicating that JavaScript cannot currently be executed because a GC is in progress, it's actually preferable to fatal error in case the user calls Node-APIs which require JavaScript execution.
Do the APIs like napi_wrap and napi_unwrap also need to be protected, because they call HasPrivate
, GetPrivate
, and SetPrivate
, which are very similar to property accessors? They also call IsObject()
.
Do we need to protect the followings with CheckGCAccess
?
- napi_get_all_property_names
- napi_get_new_target
Do we need to protect the followings with
CheckGCAccess
?* napi_get_all_property_names
Has NAPI_PREAMBLE, therefore has the check.
* napi_get_new_target
Needs the check, because it calls into v8::CallbackInfo.
Do the APIs like napi_wrap and napi_unwrap also need to be protected, because they call
HasPrivate
,GetPrivate
, andSetPrivate
, which are very similar to property accessors? They also callIsObject()
.
They are protected inside of the Wrap
and Unwrap
functions that call NAPI_PREAMBLE
.
Do we need to protect the followings with
CheckGCAccess
?* napi_get_all_property_names
Has NAPI_PREAMBLE, therefore has the check.
* napi_get_new_target
Needs the check, because it calls into v8::CallbackInfo.
I have added the check to napi_get_new_target.
Great work!
CI: https://ci.nodejs.org/job/node-test-pull-request/54384/
CI: https://ci.nodejs.org/job/node-test-pull-request/54398/
Landed in b38e3124862f
Congrats @vmoroz on getting this PR landed! This has been a long time in the making with several users wanting this type of functionality 😄