node icon indicating copy to clipboard operation
node copied to clipboard

perf_hooks: fix gc elapsed time

Open theanarkh opened this issue 2 years ago • 2 comments

fix gc elapsed time Refs: https://github.com/nodejs/node/issues/44046

  • [x] make -j4 test (UNIX), or vcbuild test (Windows) passes
  • [x] tests and/or benchmarks are included
  • [x] documentation is changed or added
  • [x] commit message follows commit guidelines

Affected subsystem: perf_hooks

theanarkh avatar Jul 30 '22 19:07 theanarkh

CI: https://ci.nodejs.org/job/node-test-pull-request/45770/

nodejs-github-bot avatar Aug 01 '22 10:08 nodejs-github-bot

@bnoordhuis Hi, can you help review this PR ? Thanks !

theanarkh avatar Aug 06 '22 15:08 theanarkh

CI: https://ci.nodejs.org/job/node-test-pull-request/45965/

nodejs-github-bot avatar Aug 10 '22 11:08 nodejs-github-bot

@mcollina Hi, can you help review this PR ? Or can i merge it ?

theanarkh avatar Aug 10 '22 13:08 theanarkh

Can you articulate this? I don't see a test.

mcollina avatar Aug 10 '22 13:08 mcollina

It is difficult to trigger, i do not know how to write a test. The problem is Node.js assumes that GC is performed like:

  1. GC type A start.
  2. GC type A end.
  3. GC type B start.
  4. GC type B end.

But in fact it may be:

  1. GC type A start.
  2. GC type B start(such as kGCTypeProcessWeakCallbacks).
  3. GC type B end.
  4. GC type A end.

This makes the result wrong. Maybe i block this pr until i find a way to test it. related: https://github.com/nodejs/node/issues/44046.

theanarkh avatar Aug 10 '22 13:08 theanarkh

Seems ok to land as is.

mcollina avatar Aug 10 '22 14:08 mcollina

Landed in eb7f22a6b420f3f4a401f5b9fc66f8cd5e746dd9

nodejs-github-bot avatar Aug 10 '22 16:08 nodejs-github-bot

AFAICS this appears to be a rather questionable change:

  • Per this comment in the associated bug, this would have the effect of completely hiding kGCTypeProcessWeakCallbacks performance events, which would cause breakage for anyone who cares about them.
  • Per my reading of the V8-side source, the originally reported problem can't actually happen with current V8 (because the isolate()->global_handles()->InvokeSecondPassPhantomCallbacks() call, which triggers the kGCTypeProcessWeakCallbacks callback, happens before CallGCPrologueCallbacks(gc_type, kNoGCCallbackFlags)), which would mean that this patch is effectively a no-op and just adds unneeded complexity. Maybe this was different at some point in the past.

Of course it's possible that I'm missing something, but the fact that there are no tests here and just a vague "it's hard to repro" statement doesn't exactly instill confidence that this patch is actually fixing anything.

jakobkummerow avatar Oct 23 '22 14:10 jakobkummerow

@jakobkummerow hey, thank you for the feedback.

I'd like to point out that the case this PR tried to avoid is that second-pass callbacks can be invoked synchronously by conditions, see https://source.chromium.org/chromium/chromium/src/+/main:v8/src/handles/global-handles.cc;l=1250?q=GlobalHandles::PostGarbageCollectionProcessing for details. This step is nested in the CallGCPrologueCallbacks and CallGCEpilogueCallbacks, in Heap::CollectGarbage.

You can reproduce the problem with this simple repo: https://github.com/legendecas/v8-nested-gc-callbacks.

The problem for Node.js to test this issue is that it is hard to determine the "expected" starting time of the GC. During the GC callbacks, it is not allowed to call into javascript again. So it is pretty hard to test this behavior in a reasonable way. It is always welcome to contribute a test case for this!

legendecas avatar Oct 23 '22 16:10 legendecas