test-snapshot-reproducible is failing in the main branch

Open joyeecheung opened this issue 1 year ago • 8 comments

The snapshot reproducibility test has been failing in dynamically linked builds since yesterday. According to the CI history, it was still green at cd8e61fe2630906fc53de6ff9b98ef8355533785 (https://ci.nodejs.org/job/node-test-commit-linux-containered/44116/), but started failing no later than 4c730aed7f825af1691740663d599e9de5958f89 (https://ci.nodejs.org/job/node-test-commit-linux-containered/44117/). That means either one of the two commits in https://github.com/nodejs/node/compare/cd8e61fe2630906fc53de6ff9b98ef8355533785...4c730aed7f825af1691740663d599e9de5958f89 caused the regression, or an infra update that happened yesterday did.

joyeecheung avatar Jun 25 '24 08:06 joyeecheung

Started three builds to see which commit introduced the regression or whether it was caused by infra:

https://ci.nodejs.org/job/node-test-commit-linux-containered/44134/ https://ci.nodejs.org/job/node-test-commit-linux-containered/44135/ https://ci.nodejs.org/job/node-test-commit-linux-containered/44136/

joyeecheung avatar Jun 25 '24 08:06 joyeecheung

Looks like 4c730aed7f825af1691740663d599e9de5958f89 broke the test. Opened https://github.com/nodejs/node/pull/53582 to revert it for now and get the CI green again.

joyeecheung avatar Jun 25 '24 09:06 joyeecheung

I haven't been able to reproduce this locally yet, but from the logs:

20:38:52     + [
20:38:52     +   {
20:38:52     +     offset: '0x40',
20:38:52     +     slice1: '000000b30b6c5ab8eea9aa31322e342e',
20:38:52     +     slice2: '000000fe0a668db8eea9aa31322e342e'
20:38:52     +   },
20:38:52     +   {
20:38:52     +     offset: '0x101640',
20:38:52     +     slice1: 'ec805d44660000000000000000a028c0',
20:38:52     +     slice2: 'ec805d44660000000000000000a03886'
20:38:52     +   },
20:38:52     +   {
20:38:52     +     offset: '0x101650',
20:38:52     +     slice1: 'd4655500000000000000000000000000',
20:38:52     +     slice2: '8c215600000000000000000000000000'
20:38:52     +   }
20:38:52     + ]

I suspect the non-reproducibility comes from the hash of the flags. At least, it seems to be in the header of the main context snapshot.

joyeecheung avatar Jun 25 '24 17:06 joyeecheung
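
For readers unfamiliar with the log format above: each entry names the offset of a 16-byte slice at which the two generated blobs disagree, along with the hex contents of that slice in each blob. The following is a minimal sketch of how such a report could be produced. It is an illustration only, not the code in test/parallel/test-snapshot-reproducible.js; the function name `diffSlices` and the default slice width of 16 bytes are assumptions made for this example.

```javascript
'use strict';

// Hypothetical helper (not taken from the Node.js test suite): compares two
// buffers slice by slice and collects every slice whose bytes differ.
function diffSlices(buf1, buf2, width = 16) {
  const diff = [];
  const length = Math.max(buf1.length, buf2.length);
  for (let offset = 0; offset < length; offset += width) {
    // subarray() clamps to the buffer length, so a shorter buffer simply
    // yields a shorter (or empty) hex string for its trailing slices.
    const slice1 = buf1.subarray(offset, offset + width).toString('hex');
    const slice2 = buf2.subarray(offset, offset + width).toString('hex');
    if (slice1 !== slice2) {
      diff.push({ offset: '0x' + offset.toString(16), slice1, slice2 });
    }
  }
  return diff;
}

// Usage with two synthetic 32-byte buffers that differ in a single byte.
const a = Buffer.alloc(32, 0xaa);
const b = Buffer.from(a); // copy of a
b[20] = 0x00;             // falls inside the second 16-byte slice
console.log(diffSlices(a, b));
```

An empty array from such a comparison would mean the two blobs are byte-for-byte identical, which is what a reproducible snapshot build is expected to give.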

I haven't been able to reproduce this in a "normal" build (Linux), but I can with https://github.com/nodejs/node/commit/4c730aed7f825af1691740663d599e9de5958f89 when building with configure --shared-openssl.

richardlau avatar Jun 25 '24 17:06 richardlau

I am still unable to reproduce it with ./configure --shared-openssl or ./configure --with-intl=small-icu, on either macOS or Ubuntu 23.04, but I could try logging into one of the containers to debug.

joyeecheung avatar Jun 26 '24 15:06 joyeecheung

Ah, I can reproduce it now. It needs a separately installed shared openssl.

joyeecheung avatar Jun 26 '24 19:06 joyeecheung

I can reproduce this in a local Docker Ubuntu container:

$ uname -a
Linux e3f5b61fd0a2 6.6.31-linuxkit #1 SMP Thu May 23 08:36:57 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
$ ./configure --shared-openssl --shared-openssl-libpath=/usr/lib/aarch64-linux-gnu/ --shared-openssl-includes=/usr/include
$ make
$ ./out/Release/node test/parallel/test-snapshot-reproducible.js
0x0: Write magic 143da19
0x4: Write metadata
0x39: Write snapshot blob
0x185085: Write IsolateDataSerializeInfo
0x186538: Write EnvSerializeInfo
0x1873d1: Write CodeCacheInfo
0x0: Write magic 143da19
0x4: Write metadata
0x39: Write snapshot blob
0x185085: Write IsolateDataSerializeInfo
0x186538: Write EnvSerializeInfo
0x1873d1: Write CodeCacheInfo
node:assert:126
  throw new AssertionError(obj);
  ^

AssertionError [ERR_ASSERTION]: Expected values to be strictly deep-equal:
+ actual - expected

+ [
+   {
+     offset: '0x40',
+     slice1: '0001000000f63cf2004cb3bf2331322e',
+     slice2: '00010000004a3dfd284cb3bf2331322e'
+   },
+   {
+     offset: '0x111680',
+     slice1: '805d44660000000000000000707179fc',
+     slice2: '805d4466000000000000000000b9fbf6'
+   }
+ ]
- []
    at Object.<anonymous> (/workspace/test/parallel/test-snapshot-reproducible.js:69:8)
    at Module._compile (node:internal/modules/cjs/loader:1467:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1551:10)
    at Module.load (node:internal/modules/cjs/loader:1282:32)
    at Module._load (node:internal/modules/cjs/loader:1098:12)
    at TracingChannel.traceSync (node:diagnostics_channel:315:14)
    at wrapModuleLoad (node:internal/modules/cjs/loader:215:24)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:158:5)
    at node:internal/main/run_main_module:30:49 {
  generatedMessage: true,
  code: 'ERR_ASSERTION',
  actual: [
    {
      offset: '0x40',
      slice1: '0001000000f63cf2004cb3bf2331322e',
      slice2: '00010000004a3dfd284cb3bf2331322e'
    },
    {
      offset: '0x111680',
      slice1: '805d44660000000000000000707179fc',
      slice2: '805d4466000000000000000000b9fbf6'
    }
  ],
  expected: [],
  operator: 'deepStrictEqual'
}

Node.js v23.0.0-pre

legendecas avatar Jun 26 '24 21:06 legendecas

Locally this patch fixes it for me: https://chromium-review.googlesource.com/c/v8/v8/+/5662576 (though it is rather curious why this only shows up in dynamically linked builds). Trying it in the CI: https://ci.nodejs.org/job/node-test-commit-linux-containered/44167/

joyeecheung avatar Jun 27 '24 03:06 joyeecheung