kiwix-js icon indicating copy to clipboard operation
kiwix-js copied to clipboard

Issues with Sauce Labs - testing

Open Jaifroid opened this issue 2 years ago • 8 comments

With almost every commit on almost any PR I make with Kiwix JS, I have to re-run the cross-browser end-to-end tests, because they fail first time. They will usually pass second time, occasionally third time. When I look, it's usually a socket reset that's the problem.

Is this likely to be an issue with Sauce Labs, or an issue with some race condition in Kiwix JS? In the past, with some PRs there has been a race condition that is clearly the fault of new code in Kiwix JS (though it's sometimes taken weeks to find it). However, most of the time the errors occur for no apparent reason.

Jaifroid avatar Jul 17 '21 10:07 Jaifroid

I saw that too with my latest push on a PR. Annoying

mossroy avatar Jul 18 '21 08:07 mossroy

Seems to have stabilized since https://github.com/kiwix/kiwix-js/commit/5270bcf1550056b66f5e31b7c898f3eeb4e377c3 . I wonder if the "reundant search cycles" were causing instability/timing issues... Or maybe just coincidence: something fixed at Travis. Let's wait and see.

Jaifroid avatar Jul 19 '21 08:07 Jaifroid

Hmm, of course the PR shouldn't have been failing if the PR itself fixed anything....

Jaifroid avatar Jul 19 '21 08:07 Jaifroid

I spoke too soon. It just fell over again on the Polyfill PR and I had to re-run.

Jaifroid avatar Jul 19 '21 09:07 Jaifroid

I have to restart tests almost every other time. I wonder if we need to start looking for another solution? I don't have experience, but there must be something that runs natively on GitHub? Maybe we could start to build up new tests with a newer ZIM, while having the old ones still running until the new ones are as feature-complete...

Jaifroid avatar Jan 31 '22 22:01 Jaifroid

I saw that too. We could try increasing the timeout, as suggested in https://github.com/nightwatchjs/nightwatch/issues/1936#issuecomment-754203727

mossroy avatar Feb 06 '22 19:02 mossroy

Maybe we could look into (slowly) trying out some new testing framework. Jest (Unit), possibly in combination with Puppeteer (UI), look like popular frameworks where a lot of help is available. See https://jestjs.io/ and https://jestjs.io/docs/puppeteer. Just based on quick search, no proper research. They could run alongside our existing Unit Tests (for non-regression), but we could eventually stop using Sauce Labs for UI Tests (these are the ones that fail 50% of the time due to socket resets, our Unit Tests are fine).

Jaifroid avatar Feb 07 '22 08:02 Jaifroid

https://www.freecodecamp.org/news/continuous-integration-with-github-actions-and-puppeteer/

Jaifroid avatar Feb 07 '22 08:02 Jaifroid

Can you post a recent failing build and possible copy some of its output onto the issue here for later reference? (given GH Actions logs are quite ephemeral)

The question would be what specific socket this is about. Eg from which exact software package and version, and to which host name.

Krinkle avatar Nov 04 '22 12:11 Krinkle

Thanks @Krinkle. One that was typical was this one:

https://github.com/kiwix/kiwix-js/actions/runs/3335388590/jobs/5519308638

The salient errors are pasted below. The errors seem to correspond to this one: https://docs.saucelabs.com/dev/error-messages/index.html#the-sauce-labs-virtual-machine-failed-to-start-the-browser-or-device, and it is suggested there that it is either an incompatible version of Selenium, or else a request for an incompatible platform/OS/browser combination. I tried many things in #895, all of them failing on recent versions of Firefox, which seemed really counterintuitive given that we can run Firefox 45 with no error... Then when I "gave up" and decided just to disable the latest Firefox on SauceLabs (because we run in headless on GitHub actions anyway),I ran into a second layer of errors. Will summarise those in next post.

27 10 2022 07:32:46.914:WARN [launcher]: firefox 104  on SauceLabs has not captured in 60000 ms, killing.
27 10 2022 07:32:46.916:ERROR [SaucelabsLauncher]: Could not quit the Saucelabs selenium connection. Failure message:
27 10 2022 07:32:46.916:ERROR [SaucelabsLauncher]: TypeError: Cannot read property 'deleteSession' of null
    at SaucelabsLauncher.<anonymous> (/home/runner/work/***-js/***-js/node_modules/karma-sauce-launcher/dist/launcher/launcher.js:139:26)
    at Generator.next (<anonymous>)
    at /home/runner/work/***-js/***-js/node_modules/karma-sauce-launcher/dist/launcher/launcher.js:8:71
    at new Promise (<anonymous>)
    at __awaiter (/home/runner/work/***-js/***-js/node_modules/karma-sauce-launcher/dist/launcher/launcher.js:4:12)
    at SaucelabsLauncher.on (/home/runner/work/***-js/***-js/node_modules/karma-sauce-launcher/dist/launcher/launcher.js:131:31)
    at SaucelabsLauncher.emit (events.js:198:13)
    at SaucelabsLauncher.EventEmitter.emit (domain.js:448:20)
    at SaucelabsLauncher.emitAsync (/home/runner/work/***-js/***-js/node_modules/karma/lib/events.js:52:10)
    at SaucelabsLauncher.BaseLauncher.kill (/home/runner/work/***-js/***-js/node_modules/karma/lib/launchers/base.js:61:27)
2022-10-27T07:33:48.0[94](https://github.com/kiwix/kiwix-js/actions/runs/3335388590/jobs/5519308638#step:5:95)Z ERROR webdriver: Request failed with status 500 due to Error: Infrastructure Error -- The Sauce VMs failed to start the browser or device.
For more info, please check https://docs.saucelabs.com/dev/error-messages
2022-10-27T07:33:48.0[97](https://github.com/kiwix/kiwix-js/actions/runs/3335388590/jobs/5519308638#step:5:98)Z ERROR webdriver: Error: Infrastructure Error -- The Sauce VMs failed to start the browser or device.
For more info, please check https://docs.saucelabs.com/dev/error-messages
    at Object.getErrorFromResponseBody (/home/runner/work/***-js/***-js/node_modules/webdriver/build/utils.js:94:12)
    at WebDriverRequest._request (/home/runner/work/***-js/***-js/node_modules/webdriver/build/request.js:134:31)
27 10 2022 07:33:48.097:ERROR [SaucelabsLauncher]: Error: Failed to create session.
    at process._tickCallback (internal/process/next_tick.js:68:7)
Infrastructure Error -- The Sauce VMs failed to start the browser or device.
For more info, please check https://docs.saucelabs.com/dev/error-messages
    at Object.startWebDriverSession (/home/runner/work/***-js/***-js/node_modules/webdriver/build/utils.js:34:15)
    at process._tickCallback (internal/process/next_tick.js:68:7)
2022-10-27T07:33:53.656Z ERROR webdriver: Request failed with status 500 due to Error: Infrastructure Error -- The Sauce VMs failed to start the browser or device.
For more info, please check https://docs.saucelabs.com/dev/error-messages
2022-10-27T07:33:53.656Z ERROR webdriver: Error: Infrastructure Error -- The Sauce VMs failed to start the browser or device.
For more info, please check https://docs.saucelabs.com/dev/error-messages
    at Object.getErrorFromResponseBody (/home/runner/work/***-js/***-js/node_modules/webdriver/build/utils.js:94:12)
    at WebDriverRequest._request (/home/runner/work/***-js/***-js/node_modules/webdriver/build/request.js:134:31)
    at process._tickCallback (internal/process/next_tick.js:68:7)
27 10 2022 07:33:53.656:ERROR [SaucelabsLauncher]: Error: Failed to create session.
Infrastructure Error -- The Sauce VMs failed to start the browser or device.
For more info, please check https://docs.saucelabs.com/dev/error-messages
    at Object.startWebDriverSession (/home/runner/work/***-js/***-js/node_modules/webdriver/build/utils.js:34:15)
    at process._tickCallback (internal/process/next_tick.js:68:7)
27 10 2022 07:33:53.656:ERROR [launcher]: firefox [104](https://github.com/kiwix/kiwix-js/actions/runs/3335388590/jobs/5519308638#step:5:105)  on SauceLabs failed 2 times (failure). Giving up.
TOTAL: [132](https://github.com/kiwix/kiwix-js/actions/runs/3335388590/jobs/5519308638#step:5:133) SUCCESS
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! @ test-unit-saucelabs: `karma start tests/karma.conf.saucelabs.js`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the @ test-unit-saucelabs script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /home/runner/.npm/_logs/2022-10-27T07_33_58_674Z-debug.log
Error: Process completed with exit code 1.

Jaifroid avatar Nov 04 '22 13:11 Jaifroid

Second layer of errors emerged with end-to-end tests after I disabled the latest Firefox. Seems to be Error [ERR_HTTP_HEADERS_SENT]: Cannot set headers after they are sent to the client. Example here: https://github.com/kiwix/kiwix-js/actions/runs/3392039979/jobs/5637806528, with whole log pasted below.

I suspect that a dependabot upgrade changed something about the configuration, but at this stage I no longer know what. What I do know is that we're running a very old version of nightwatch with lots of security issues. When I tried upgrading that, it made no difference to these errors.

Starting up http-server, serving .

http-server version: 14.1.1

http-server settings: 
CORS: disabled
Cache: 3600 seconds
Connection Timeout: 120 seconds
Directory Listings: visible
AutoIndex: visible
Serve GZIP Files: false
Serve Brotli Files: false
Default File Extension: none

Available on:
  http://127.0.0.1:8080
  http://10.1.0.220:8080
Hit CTRL-C to stop the server

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

[Fri Nov 04 2022 07:43:04 GMT+0000 (Coordinated Universal Time)]  "GET /" "curl/7.68.0"
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0_http_outgoing.js:470
    throw new ERR_HTTP_HEADERS_SENT('set');
    ^

Error [ERR_HTTP_HEADERS_SENT]: Cannot set headers after they are sent to the client
    at ServerResponse.setHeader (_http_outgoing.js:470:11)
    at module.exports.ResponseStream.(anonymous function) [as setHeader] (/home/runner/work/***-js/***-js/node_modules/union/lib/response-stream.js:100:34)
    at Object.exports.(anonymous function) (/home/runner/work/***-js/***-js/node_modules/http-server/lib/core/status-handlers.js:57:7)
    at Readable.stream.on (/home/runner/work/***-js/***-js/node_modules/http-server/lib/core/index.js:339:22)
    at Readable.emit (events.js:198:13)
    at emitErrorNT (internal/streams/destroy.js:91:8)
    at emitErrorAndCloseNT (internal/streams/destroy.js:59:3)
    at process._tickCallback (internal/process/next_tick.js:63:19)

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (52) Empty reply from server
   Timed out while waiting for element <#archiveFiles> to be present for 20000 milliseconds. - expected "visible" but got: "not found" (20385ms)
 chrome  [Nightwatch Runner] Test Suite 
       at Object.UI Tests (/home/runner/work/***-js/***-js/browser-tests/nightwatch_runner.js:27:14)
 chrome  ============================== 
 chrome  - Connecting to ondemand.saucelabs.com on port 80... 
 chrome  ℹ Connected to ondemand.saucelabs.com on port 80 (3286ms). 
 chrome  Using: chrome (107.0.5304.63) on Windows platform. 
 chrome  Results for: UI Tests 
 chrome  ✖ Timed out while waiting for element <#archiveFiles> to be present for 20000 milliseconds. - expected "visible" but got: "not found" (20[17](https://github.com/kiwix/kiwix-js/actions/runs/3335388590/jobs/5519308638#step:7:18)0ms) 
 chrome      at Object.UI Tests (/home/runner/work/***-js/***-js/browser-tests/nightwatch_runner.js:27:14) 
 chrome  ✖ chrome [Nightwatch Runner] UI Tests ([20](https://github.com/kiwix/kiwix-js/actions/runs/3335388590/jobs/5519308638#step:7:21).865s) 
 chrome  Timed out while waiting for element <#archiveFiles> to be present for 20000 milliseconds. - expected "visible" but got: "not found" (20170ms) 
 chrome  at Object.UI Tests (/home/runner/work/***-js/***-js/browser-tests/nightwatch_runner.js:27:14) 
 edge  [Nightwatch Runner] Test Suite 
 edge  ============================== 
 edge  - Connecting to ondemand.saucelabs.com on port 80... 
 edge  ℹ Connected to ondemand.saucelabs.com on port 80 (82[21](https://github.com/kiwix/kiwix-js/actions/runs/3335388590/jobs/5519308638#step:7:22)ms). 
 edge  Using: msedge (106.0.1370.47) on Windows platform. 
 edge  Results for: UI Tests 
 edge  ✖ Timed out while waiting for element <#archiveFiles> to be present for 20000 milliseconds. - expected "visible" but got: "not found" (20385ms) 
 edge      at Object.UI Tests (/home/runner/work/***-js/***-js/browser-tests/nightwatch_runner.js:27:14) 
 edge  ✖ edge [Nightwatch Runner] UI Tests (21.298s) 
 edge  Timed out while waiting for element <#archiveFiles> to be present for 20000 milliseconds. - expected "visible" but got: "not found" (20385ms) 
 edge  at Object.UI Tests (/home/runner/work/***-js/***-js/browser-tests/nightwatch_runner.js:27:14) 
_________________________________________________

TEST FAILURE:  2 assertions failed, 0 passed (30.[24](https://github.com/kiwix/kiwix-js/actions/runs/3335388590/jobs/5519308638#step:7:25)4s)

 ✖ nightwatch_runner
 – UI Tests (21.[29](https://github.com/kiwix/kiwix-js/actions/runs/3335388590/jobs/5519308638#step:7:30)8s)

Error: Process completed with exit code 5.

Jaifroid avatar Nov 04 '22 13:11 Jaifroid

Looking at our commit log, it's easy to see where the tests began to fail systematically: https://github.com/kiwix/kiwix-js/commits/master, specifically the last "good" commit was https://github.com/kiwix/kiwix-js/commit/9a6e6f7add641084860a5f12446b29c999475589. The next commit after that didn't touch the test configuration though, it seemed to fail on its own, making me suspect a temporary issue at SauceLabs.

Jaifroid avatar Nov 04 '22 14:11 Jaifroid

This is connected to #743. Superseded by #907.

Jaifroid avatar Jun 15 '23 15:06 Jaifroid