async-hooks.test-emit-after-on-destroyed is flaky

Open anonrig opened this issue 1 year ago • 6 comments

Test

async-hooks.test-emit-after-on-destroyed

Platform

Other

Console output

not ok 32 async-hooks/test-emit-after-on-destroyed
  ---
  duration_ms: 403.10800
  severity: fail
  exitcode: 1
  stack: |-
    node:assert:125
      throw new AssertionError(obj);
      ^
    
    AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
    
    null !== 1
    
        at ChildProcess.<anonymous> (/home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/test/async-hooks/test-emit-after-on-destroyed.js:56:12)
        at ChildProcess.<anonymous> (/home/iojs/build/workspace/node-test-commit-aix/nodes/aix72-ppc64/test/common/index.js:476:15)
        at ChildProcess.emit (node:events:515:28)
        at maybeClose (node:internal/child_process:1105:16)
        at Socket.<anonymous> (node:internal/child_process:457:11)
        at Socket.emit (node:events:515:28)
        at Pipe.<anonymous> (node:net:337:12) {
      generatedMessage: true,
      code: 'ERR_ASSERTION',
      actual: null,
      expected: 1,
      operator: 'strictEqual'
    }
    
    Node.js v21.0.0
  ...

Build links

  • https://ci.nodejs.org/job/node-test-commit-aix/48636/

Additional information

No response

anonrig avatar Oct 18 '23 14:10 anonrig

This seems to be a child process issue, not an async hooks one. The assert checks the exit code passed to the child process 'close' event, which is a number according to the docs, but here it is null.

Flarna avatar Oct 18 '23 15:10 Flarna

> Seems to be a child process issue not async hooks. The assert checks the exit code given via the childprocess close event which is a number according to docs but here it is null.

Usually the exit code being null means that the child was ended by a signal.
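To illustrate the point about signals, here is a minimal sketch (not taken from the failing test) that runs two throwaway children with spawnSync. It assumes a POSIX platform, and the inline child scripts are made up for the illustration. The status and signal fields of the result correspond to the code and signal arguments of the 'close' event.

```javascript
'use strict';
const { spawnSync } = require('node:child_process');

// Child that exits on its own with exit code 1.
const exited = spawnSync(process.execPath, ['-e', 'process.exit(1)']);

// Child that sends SIGKILL to itself, so it never produces an exit code.
// Assumption: POSIX signal semantics (behaviour on Windows differs).
const signaled = spawnSync(process.execPath, [
  '-e',
  "process.kill(process.pid, 'SIGKILL')",
]);

// status is the exit code, or null when the child was ended by a signal.
// signal is the signal name, or null when the child exited on its own.
console.log('exited:  ', exited.status, exited.signal);
console.log('signaled:', signaled.status, signaled.signal);
```

For the signaled child, status is null and signal names the signal, which matches the "null !== 1" seen in the console output above.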

richardlau avatar Oct 18 '23 18:10 richardlau

Similar test is flaky as well: https://github.com/nodejs/node/issues/50262

anonrig avatar Oct 18 '23 21:10 anonrig

I started to investigate this issue last week.

I asked @mhdawson to spin up some stress tests

ref: https://ci.nodejs.org/job/node-stress-single-test/nodes=rhel8-ppc64le/469/console

We ran this test case 1000 times on rhel8 ppc64le. This did not reproduce the error though.

I then built Node.js 21.0.0 (the version in use when the issue was reported) on my local machine (Fedora 39). I ran this test 100k times there and still could not reproduce the failure.

$ tools/test.py -j 16 --repeat=100000 async-hooks/test-emit-after-on-destroyed
[14:17|% 100|+ 100000|-   0]: Done                               

All tests passed.

As noted before in https://github.com/nodejs/node/issues/50245#issuecomment-1769124404, the failure occurs because a signal is raised, but we will need to find a way to reproduce the failure to get more information on which signal it is and why the test case is being signaled.

@mhdawson

Any further suggestions on how to reproduce the error?

Maybe it occurs more when running all the test cases together?

OR

Maybe it presents itself more frequently on some platforms? The original issue indicates it occurred on ppc64 AIX. Maybe stress testing AIX would reproduce the error.

abmusse avatar Jan 29 '24 17:01 abmusse

@abmusse I think trying the stress test on one of the platforms where we saw the failure makes sense. I think @richardlau mentioned that you still have access to one of the AIX machines from an earlier investigation, so trying the 100k run there would be a good next step.

mhdawson avatar Jan 29 '24 22:01 mhdawson

@mhdawson

Today I ran a 100k stress test on one of our AIX machines.

$ tools/test.py --repeat=100000 async-hooks/test-emit-after-on-destroyed
[59:37|% 100|+ 100000|-   0]: Done   

Running it 100k times on AIX didn't reproduce the error.

abmusse avatar Feb 02 '24 21:02 abmusse

I suggest we un-mark this test as flaky, since running it 100k times did not reproduce the error.

I will keep an eye on it, and if it becomes flaky again I will handle marking it as flaky again.

abmusse avatar Mar 05 '24 20:03 abmusse