node
node copied to clipboard
Investigate flaky test-http-server-request-timeouts-mixed
Test
test-http-server-request-timeouts-mixed
Platform
macos
Console output
not ok 1173 parallel/test-http-server-request-timeouts-mixed
---
duration_ms: 3.244
severity: fail
exitcode: 1
stack: |-
node:assert:399
throw err;
^
AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:
assert(request2.completed)
at Timeout._onTimeout (/Users/iojs/build/workspace/node-test-commit-osx/nodes/osx11-x64/test/parallel/test-http-server-request-timeouts-mixed.js:106:5)
at listOnTimeout (node:internal/timers:564:17)
at process.processTimers (node:internal/timers:507:7) {
generatedMessage: true,
code: 'ERR_ASSERTION',
actual: false,
expected: true,
operator: '=='
}
Node.js v19.0.0-pre
...
Build links
Failed PR
10 (nodejs/node#43329, nodejs/node#43380, nodejs/node#43366, nodejs/node#43455, nodejs/node#43190, nodejs/node#43176, nodejs/node#43374, nodejs/node#43363, nodejs/node#43417, nodejs/node#43216)
Appeared
test-orka-macos11-x64-1, test-nearform-macos10.15-x64-3, test-orka-macos10.15-x64-2, test-nearform-macos10.15-x64-1, test-orka-macos11-x64-2
First CI
https://ci.nodejs.org/job/node-test-pull-request/44587/
Last CI
https://ci.nodejs.org/job/node-test-pull-request/44664/
Example
https://ci.nodejs.org/job/node-test-commit-osx/nodes=osx11-x64/45608/console
Additional information
Build links info is from CI Reliability 2022-06-18
This test was introduced on 2022-05-03: https://github.com/nodejs/node/pull/42893
This test starts flaky from 2022-05-04: https://github.com/nodejs/reliability/issues/271
Query all flaky dates: https://github.com/nodejs/reliability/issues?q=is%3Aissue+is%3Aopen+test-http-server-request-timeouts-mixed
cc @ShogunPanda(Test author) @lpinca(Reviewer, I saw you mention the flaky possibility in pr comment)
It seems this only occurs on macOS but I am not able to reproduce locally on macOS 12.4.
@ShogunPanda could d you take a look?
Yup, I will take a look on Monday!
It seems this only occurs on macOS but I am not able to reproduce locally on macOS 12.4.
Yes, I also tried to reproduce on my local MacOS 12.4 but never reproduced.
I was able to repro on Mac 12.4:
=== release test-http-server-request-timeouts-mixed ===
Path: parallel/test-http-server-request-timeouts-mixed
node:assert:400
throw err;
^
AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:
assert(request2.completed)
at Timeout._onTimeout (/Users/nicksia/Documents/git/node/test/parallel/test-http-server-request-timeouts-mixed.js:107:5)
at listOnTimeout (node:internal/timers:564:17)
at process.processTimers (node:internal/timers:507:7) {
generatedMessage: true,
code: 'ERR_ASSERTION',
actual: false,
expected: true,
operator: '=='
}
Let me know if there's anything I can do to help, even if it's just being a lab rat :)
Thanks sir! Do you mind sharing which machine (hardware I mean) were you running on and all the relevant environment information (like if you were on battery or if your machine was overloaded and so forth)
Sure thing - here's a screenshot of my system details:
My machine was plugged into the wall.
Test was running in CLion's integrated terminal.
I'd say "low-medium" utilisation, had Postgres running in a Docker image and a IDE/Browser open (about 20% CPU usage and 20GB memory used).
Update: Fails even with request2
isolated.
Here's some awfully crude dumps (sorry!), printing out the value of request2.completed
every 100ms with setInterval.
Failed Run:
@[101ms] request2.completed: undefined
@[201ms] request2.completed: undefined
@[303ms] request2.completed: undefined
request2 client created.
request2.client.write()
@[413ms] request2.completed: false
<snip>
@[2840ms] request2.completed: false
node:assert:400
throw err;
^
AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:
assert(request2.completed)
at Timeout._onTimeout (/Users/nicksia/Documents/git/node/test/parallel/test-http-server-request-timeouts-mixed.js:81:5)
at listOnTimeout (node:internal/timers:564:17)
at process.processTimers (node:internal/timers:507:7) {
generatedMessage: true,
code: 'ERR_ASSERTION',
actual: false,
expected: true,
operator: '=='
}
Node.js v19.0.0-pre
Passing Run
@[102ms] request2.completed: undefined
@[203ms] request2.completed: undefined
@[304ms] request2.completed: undefined
request2 client created.
request2.client.write()
@[412ms] request2.completed: false
<snip>
@[2439ms] request2.completed: false
@[2540ms] request2.completed: true
<snip>
@[6481ms] request2.completed: true
Closing server.
My passing runs are close to 2500ms after client is created, when the headersTimeout * 1.2 + connectionsCheckingInterval
on my platform is 2900ms.
Could it be that the check is just too early on slow/burdened systems?
Update: I'm getting a bunch of runs around the 3000ms mark when I remove assert(request2.completed)
:
@[3039ms] request2.completed: true
The times don't quite match up, which is causing this issue.
headersTimeout: 2000ms
requestTimeout: 4000ms
connectionsCheckingInterval: 500ms
Request 2 is started at headersTimeout * 0.2
, or 400ms
.
Headers time out 2000ms after that.
Completion of Request 2 is checked at at headersTimeout * 1.2 + connectionsCheckingInterval
, or 2900ms
.
Timeline
@[400ms]: Request 2 client created and first write sent.
@[2400ms]: Headers are timed out.
@[2900ms]: Request 2 completion is checked (which is intended to depend on Request 2 expiring via headersTimeout, as seen in the comment - assert(request2.response.startsWith(responseTimeout)); // It is expired due to headersTimeout
.
The problem is that the connectionsCheckingInterval is slightly too high. The connection checking interval could tick just after 2400ms, and then just after 2900ms.
Example
@[2424ms] request2.completed: false
checking connection timeout
@[2525ms] request2.completed: false
@[2625ms] request2.completed: false
@[2727ms] request2.completed: false
@[2828ms] request2.completed: false
node:assert:400
throw err;
^
AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:
assert(request2.completed)
at Timeout._onTimeout (/Users/nicksia/Documents/git/node/test/parallel/test-http-server-request-timeouts-mixed.js:108:5)
at listOnTimeout (node:internal/timers:564:17)
at process.processTimers (node:internal/timers:507:7) {
generatedMessage: true,
code: 'ERR_ASSERTION',
actual: false,
expected: true,
operator: '=='
}
The fix is to ensure connection checking interval is less than 500ms, and to decouple the check timeout from connection checking interval.
I'll put up a fix for this shortly.
Failed again here - https://ci.nodejs.org/job/node-test-commit-osx/49021/nodes=osx11-x64/testReport/junit/(root)/test/parallel_test_http_server_request_timeouts_mixed_/
Failed on Ubuntu 22 - https://ci.nodejs.org/job/node-test-commit-linuxone/40388/
Failed on Ubuntu 22 - https://ci.nodejs.org/job/node-test-commit-linuxone/40388/
No, this one failed on rhel8-s390x.
Seen this several times on osx - https://ci.nodejs.org/job/node-test-commit-osx-arm/nodes=osx11/14659/ as an example