node icon indicating copy to clipboard operation
node copied to clipboard

src,tools: initialize cppgc

Open dharesign opened this issue 2 years ago • 24 comments

Initialize cppgc, create a CppHeap, and attach it to the Isolate.

This allows C++ addons to start using cppgc to manage objects.

Refs: https://github.com/nodejs/node/issues/40786

dharesign avatar Dec 01 '22 17:12 dharesign

Review requested:

  • [ ] @nodejs/startup

nodejs-github-bot avatar Dec 01 '22 17:12 nodejs-github-bot

In Electron, the cppgc heap will be initialized by Chromium. I don't think anything in this PR clashes with what Electron will want.

When they initialize node, they can use kNoInitializeCppgc here: https://github.com/electron/electron/blob/4e661842879ae0cb01a9fc251d56c8cabc958466/shell/app/node_main.cc#L161-L165

They don't use NodeMainInstance or CommonEnvironmentSetup.

CC: @codebytere / @nornagon for awareness

dharesign avatar Dec 12 '22 15:12 dharesign

FYI I believe we don't currently initialize cppgc in the main process in Electron (though Blink does of course initialize it). So we would want to be able to initialize Node with cppgc initialization in the main process (either by doing it ourselves or through this initialization option), and without cppgc initialization in the renderer process (since that would clash with Blink's initialization).

nornagon avatar Dec 12 '22 18:12 nornagon

Erp, I hit the close button by mistake. Please re-approve running the workflows >_<

dharesign avatar Dec 12 '22 18:12 dharesign

The previous test runs seemed to fail, and actually running my local build of node likewise fails. It seems to be due to a bug in v8:

https://github.com/v8/v8/blob/main/src/heap/cppgc/heap-base.cc#L104

I believe this should be delgate_ not delegate: the latter is moved from.

CC: @mlippautz

dharesign avatar Dec 12 '22 22:12 dharesign

The previous test runs seemed to fail, and actually running my local build of node likewise fails. It seems to be due to a bug in v8:

https://github.com/v8/v8/blob/main/src/heap/cppgc/heap-base.cc#L104

I believe this should be delgate_ not delegate: the latter is moved from.

CC: @mlippautz

Yep, changing that fixes my local build.

dharesign avatar Dec 12 '22 22:12 dharesign

The previous test runs seemed to fail, and actually running my local build of node likewise fails. It seems to be due to a bug in v8: https://github.com/v8/v8/blob/main/src/heap/cppgc/heap-base.cc#L104 I believe this should be delgate_ not delegate: the latter is moved from. CC: @mlippautz

Yep, changing that fixes my local build.

I added a second commit to this PR which fixes the above. We can switch the commit to cherry-pick the v8 commit once it gets fixed there. Hopefully the CI checks will proceed now.

dharesign avatar Dec 13 '22 02:12 dharesign

CI: https://ci.nodejs.org/job/node-test-pull-request/48450/

nodejs-github-bot avatar Dec 13 '22 15:12 nodejs-github-bot

Node is still segfaulting when I run it locally:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000100a5c14c node`cppgc::internal::MarkerBase::EnterAtomicPause(cppgc::EmbedderStackState) [inlined] cppgc::internal::SingleThreadedHandle::Cancel(this=<unavailable>) at task-handle.h:27:20 [opt]
   24
   25  	  void Cancel() {
   26  	    DCHECK(is_cancelled_);
-> 27  	    *is_cancelled_ = true;
   28  	  }
   29
   30  	  bool IsCanceled() const {
Target 0: (node) stopped.
(lldb) frame sel 1
frame #1: 0x0000000100a5c144 node`cppgc::internal::MarkerBase::EnterAtomicPause(this=0x0000000106808200, stack_state=kNoHeapPointers) at marker.cc:260:33 [opt]
   257 	    // Cancel remaining incremental tasks. Concurrent marking jobs are left to
   258 	    // run in parallel with the atomic pause until the mutator thread runs out
   259 	    // of work.
-> 260 	    incremental_marking_handle_.Cancel();
   261 	    heap().stats_collector()->UnregisterObserver(
   262 	        incremental_marking_allocation_observer_.get());
   263 	    incremental_marking_allocation_observer_.reset();
(lldb) print incremental_marking_handle_
(cppgc::internal::MarkerBase::IncrementalMarkingTaskHandle) $0 = {
  is_cancelled_ = nullptr {
    __ptr_ = 0x0000000000000000
  }
}

It looks like foreground_task_runner_ is nullptr, so MarkerBase::ScheduleIncrementalTaskRunner exits early before initializing incremental_marking_handle_. Later, MarkerBase::EnterAtomicPause assumes that incremental_marking_handle_ is set such that it can call .Cancel(), but it's not.

dharesign avatar Dec 13 '22 16:12 dharesign

Looks like there's no isolate_ set in the CppgcPlatformAdapter, which implies CppHeap::AttachIsolate has not been called, though it has.

After digging a bit, it's because CppHeap::AttachIsolate does static_cast<CppgcPlatformAdapter*>(platform()), where platform() in this case is actually PlatformWithPageAllocator. The static_cast is therefore invalid.

CC: @mlippautz

dharesign avatar Dec 13 '22 16:12 dharesign

Submitted the following v8 bugs:

  • https://bugs.chromium.org/p/v8/issues/detail?id=13589
  • https://bugs.chromium.org/p/v8/issues/detail?id=13590

dharesign avatar Dec 13 '22 16:12 dharesign

I added a commit to add SetIsolate as a virtual method on the cppgc::Platform protocol. Not sure if that's the direction v8 will go to fix the issue, but it should help get us past that issue and check the rest of the CI passes.

dharesign avatar Dec 13 '22 17:12 dharesign

Please re-approve running the workflows 😅

dharesign avatar Dec 13 '22 17:12 dharesign

OK I rebased and fixed some lint issues.

There was an outstanding question about what to do regarding the creation and attaching of the cpp heap. Currently it's being done in either the NodeMainInstance or CommonEnvironmentSetup::Impl. There was a suggestion to move it into a common method, but given the CppHeap needs to be owned, I don't see a good place to put it.

I think the current PR is a reasonable starting point. If an embedder wants to create a CppHeap, then they can either use their own id, or we can at that time figure out where to put this utility method (or perhaps just a way to get the id to use).

This PR does still have two commits which belong upstream in v8. One of them has landed in v8 11.1.119 (https://github.com/v8/v8/commit/22ef44b6553d8207aac5766c5fd7624a1e41fbc4). The other hasn't been addressed yet. I'm guessing these will want to be cherry-picked / applied outside of this PR? I have them here so the CI can run successfully.

dharesign avatar Jan 09 '23 18:01 dharesign

Looks like all CI passed apart from another lint issue. Will push a fix for that now.

dharesign avatar Jan 10 '23 17:01 dharesign

CI: https://ci.nodejs.org/job/node-test-pull-request/48956/

nodejs-github-bot avatar Jan 12 '23 15:01 nodejs-github-bot

I'm guessing these will want to be cherry-picked / applied outside of this PR? I have them here so the CI can run successfully.

Have you uploaded CLs for these to V8 yet? In general we need the fix to land in upstream first, and then backport them to the main branch. If a PR depends on the V8 fix, either we open a backport PR for that fix and then rebase the Node.js PR once that lands in deps/v8, or include the backport commit in the Node.js PR (personally I prefer the first because then it's easier to revert in case something goes wrong, but I guess for this PR it probably doesn't matter because we don't support cppgc before this anyway so the V8 fix could not affect us otherwise), see: https://github.com/nodejs/node/blob/main/doc/contributing/maintaining-V8.md#backporting-with-git-node-recommended

joyeecheung avatar Jan 27 '23 15:01 joyeecheung

I'm guessing these will want to be cherry-picked / applied outside of this PR? I have them here so the CI can run successfully.

Have you uploaded CLs for these to V8 yet? In general we need the fix to land in upstream first, and then backport them to the main branch. If a PR depends on the V8 fix, either we open a backport PR for that fix and then rebase the Node.js PR once that lands in deps/v8, or include the backport commit in the Node.js PR (personally I prefer the first because then it's easier to revert in case something goes wrong, but I guess for this PR it probably doesn't matter because we don't support cppgc before this anyway so the V8 fix could not affect us otherwise), see: https://github.com/nodejs/node/blob/main/doc/contributing/maintaining-V8.md#backporting-with-git-node-recommended

CLs no. I created issues:

  • Use-After-Move DCHECK:
    • https://bugs.chromium.org/p/v8/issues/detail?id=13589
    • Status: Fixed
      • https://github.com/v8/v8/commit/22ef44b6553d8207aac5766c5fd7624a1e41fbc4
    • Version: 11.1.119
  • Invalid static_cast:
    • https://bugs.chromium.org/p/v8/issues/detail?id=13590
    • Status: Open

dharesign avatar Jan 28 '23 01:01 dharesign

I pushed a new commit for "deps: V8 Fix Invalid static_assert" which should address the concern they have with the original patch I suggested. Hopefully we can then get it landed upstream.

dharesign avatar Jan 28 '23 02:01 dharesign

OK, I submitted a CL for the invalid static_cast issue and it just landed: https://github.com/v8/v8/commit/eea8a2f22bcf2bcfd12a093a0e1ba762b8d0a5ec

I can create a new PR to backport these two fixes now they've both landed in v8 then we can rebase this PR once that lands.

dharesign avatar Feb 17 '23 16:02 dharesign

This PR is rebased on top of #47065 to cherry pick the required fixes in v8. Once that PR lands I can rebase this PR on top of the merge, or they can be merged from this PR and we can close #47065. Whatever works!

dharesign avatar Mar 13 '23 14:03 dharesign

Rebased on top of main which now includes v8 11.3. That covers the commits we needed to land in v8 for this.

dharesign avatar Mar 31 '23 15:03 dharesign

Is this ready for re-review? Or is it still blocked on anything?

joyeecheung avatar Apr 04 '23 22:04 joyeecheung

Is this ready for re-review? Or is it still blocked on anything?

No blockers remain! Thanks

dharesign avatar Apr 05 '23 00:04 dharesign

I've rebased this on top of https://github.com/nodejs/node/pull/48660 so that we no longer need to worry about the wrapper ID. The CppHeap is now owned by the IsolateData if it's not already initialized (in the case of Blink's unsanboxed renderers, it's already initialized, then Node.js would recognize it and use the wrapper ID of that instead).

Regarding ABI compatibility, I think we can mark cppgc ABI compatibility as experimental for now, though so far the public part of the cppgc API already has a quite stable ABI compared to other V8 headers.

Regarding thread safety, since we are not creating any internal cppgc-managed objects, this is not yet our concern. Any internal object must explicitly implement the GarbageCollected interface to be cppgc-managed, so we could figure that out on a case-by-case basis if necessary when we actually start to migrate internal objects to the CppHeap - it's also fine to leave them the way they are (i.e. allocated outside the CppHeap), as all other memory management utils that we use (e.g. v8::Persistent) still work just fine with cppgc, so objects managed differently can live together alright - I believe this is also still the reality in Blink anyway. But at least this PR allows addons to create their own cppgc-managed objects.

I think this PR has no other outstanding questions for now other than that it needs https://github.com/nodejs/node/pull/48660.

joyeecheung avatar Jul 05 '23 19:07 joyeecheung

CI: https://ci.nodejs.org/job/node-test-pull-request/52622/

nodejs-github-bot avatar Jul 05 '23 19:07 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/52890/

nodejs-github-bot avatar Jul 21 '23 16:07 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/52892/

nodejs-github-bot avatar Jul 21 '23 17:07 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/52896/

nodejs-github-bot avatar Jul 21 '23 19:07 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/52941/

nodejs-github-bot avatar Jul 26 '23 15:07 nodejs-github-bot