node-addon-api icon indicating copy to clipboard operation
node-addon-api copied to clipboard

unbounded memory usage in unyielding JS jobs creating ObjectWrap-ed native objects

Open MichaelBelousov opened this issue 3 years ago • 10 comments

See this minimal reproduction repository.

When creating several instances of a C++ object that is wrapped by ObjectWrap, memory usage goes up forever unless the JavaScript job yields. Even though the JavaScript object is garbage collected and the JS heap size is stable. According to heaptrack, suspicious allocations are occurring in napi_wrap whenever the ObjectWrap is created. I based the C++ addon on this documentation.

Clone the above repo and run:

npm install # for node-addon-api and node-gyp dependencies
npm run build # C++ build
node ./index.js

here is the loop in the minimal repro:

for (let i = 0; i < MAX_ITERS; ++i) {
  const nativeObj = NativeObj.create();
  // uncomment this and the application will not leak
  // await new Promise(setImmediate);
}

On my Ubuntu 21.10 machine with Node.js 17.5, process.memoryUsage().rss stabilizes at ~103MB if yielding by uncommenting the await new Promise(setImmediate) line. Otherwise it goes up until the process is killed for eating too much memory.

Is this a memory leak? It does affect a larger addon from which I created the sample loop.

Running node with --trace_gc and tracking the JavaScript heap, external, and buffer sizes demonstrate that the process memory growth is not from there, and that full mark-sweep garbage collections are occurring in parallel.

MichaelBelousov avatar Feb 22 '22 04:02 MichaelBelousov

@vmoroz suggested that a possible way out is to manage our queue of finalizers ourselves, and to add an epilogue to each napi_* and node_api_* function that drains the queue. We already have such a queue of finalizers, because we drain it at environment shutdown. Some interesting avenues to explore:

  • What are the performance changes if we do add such an epilogue?
  • Can we simply drain the existing queue of finalizers (the one we drain during environment shutdown) as part of the epilogue?
  • Can we add a reasonable heuristic to only delete some of the items during epilogue, in case we find that sometimes very many of these accumulate?

@MichaelBelousov this is a known issue, and arises from the fact that we cannot synchronously free native memory during garbage collection, because freeing something on the native side may result in JavaScript getting executed, which cannot be done during garbage collection, because the engine is not in the right state for executing JavaScript. The only workaround currently is to design the code to basically "pace itself" when creating many objects.

gabrielschulhof avatar Feb 25 '22 16:02 gabrielschulhof

@gabrielschulhof is there a better workaround? Is there a way to manually drain the queue in the addon? Or is the best thing really to split up the execution (await new Promise(setImmediate)) on the javascript end like I suggested?

MichaelBelousov avatar Feb 25 '22 18:02 MichaelBelousov

@gabrielschulhof is there a better workaround? Is there a way to manually drain the queue in the addon? Or is the best thing really to split up the execution (await new Promise(setImmediate)) on the javascript end like I suggested?

Per our discussion today it does not seem that we have a better workaround today than awaiting Promises in the code. We assume that the finalizers to delete native code are run as micro-tasks. The same micro-task queue that is used for Promises. There is no ABI-safe API to drain the micro-task queue inside of the addons. Even if we would have one, it could cause some unexpected side-effects because there can be other tasks besides object finalizers.

I believe that in future we can address this issue with the dedicated finalizer queue, but it is difficult to give any prediction if such change will be accepted and when it will be accepted. I used this approach to implement Node-API for Hermes JS engine because I did not have the V8 facility to run secondary finalizer callbacks Node-API for Hermes. I can create the PR and we will see how far it goes.

vmoroz avatar Feb 25 '22 23:02 vmoroz

This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made.

github-actions[bot] avatar Jul 16 '22 00:07 github-actions[bot]

We discussed today in the team meeting, the work on feature bits will help, so waiting on that to move this forward.

mhdawson avatar Oct 21 '22 15:10 mhdawson

Are there any news on possible solutions?

audetto avatar May 05 '23 19:05 audetto

@audetto , it was discussed today in the Node-API meeting.

Since now we have the Node-API versioning PR #45715 merged, I am going to continue working on the PR #42651. The plan is to change behavior of all Node-API finalizers starting with Node-API version 10. (The version 9 is going to be cut soon.) The new behavior will run finalizers as soon as they are called by GC. They will be not able to execute any JS-related code. There will be a new function node_api_set_immediate to execute JS related code in the next uv_loop tick. Any finalizers that require JS code execution will be able to use this function from the finalizers. This way they can simulate the current behavior. Obviously, this plan can be changed while we work on it.

Today the only solution that works is using Promises. They enable the finalizer execution in the event loop and the app can avoid huge spikes in unreleased native objects.

vmoroz avatar May 05 '23 22:05 vmoroz

Sounds good. Thanks for the update.

audetto avatar May 06 '23 09:05 audetto

With the landing of https://github.com/nodejs/node-addon-api/issues/1140, this issue should be resolved if you test against the Node.js nightly and turn on NAPI_EXPERIMENTAL. If your addon makes callbacks to JavaScript you may need to update it to use the new node_api_post_finalizer method.

Could you test that out to confirm it will address the issue you saw.

mhdawson avatar Oct 06 '23 15:10 mhdawson

Very good.

I have removed all await and stuff and now I can see the finalizer being called during the execution. The only thing I did was to define NAPI_EXPERIMENTAL. Everything has worked even via node_addon_api which I dont think need to care about it.

Moreover I am calling this function in the finaliser: napi_adjust_external_memory and I can see it working as well.

So thumbs-up from me.

audetto avatar Oct 29 '23 14:10 audetto