playwright
playwright copied to clipboard
feat: implement flag to fail flaky tests
Implements feature requested in https://github.com/microsoft/playwright/issues/30457
The test runner treats flaky tests as failures when the flag is enabled, but still reports flaky tests as flaky in the reporting interface. It feels like something worth discussing as this behaviour makes sense to me, but looked a bit odd to @BJSS-russell-pollock when I ran this past him.
@microsoft-github-policy-service agree company="BJSS"
Test results for "tests 1"
2 flaky
:warning: [playwright-test] › ui-mode-test-watch.spec.ts:145:5 › should watch all:warning: [webkit-library] › library/browsercontext-clearcookies.spec.ts:116:3 › should remove cookies by path
27252 passed, 671 skipped :heavy_check_mark::heavy_check_mark::heavy_check_mark:
Merge workflow run.
Talking point of the failing on flaky behaviour, is about the reporter not indicating that the suite failed because of the flaky test. The exit code is 1 but the reporter doesn't look like it.
To try this out, I set the the playwright.config to:
retries: process.env.CI ? 2 : 5,
failOnFlakyTests: true,
And then the test:
test('flaky test', async () => {
const randomFail = Math.round(Math.random());
expect(randomFail).toBe(0)
});
When running the test through the CLI - the retry would pass and then mark the test as flaky:
Hi there,
The intended behaviour from this PR, and my understanding of the issue matches yours - any changes in the code extra to that reflect my unfamiliarity with the codebase.
I think Russell's comment / the discussion around behaviour reflects the user / manual tester perspective where the change in exit code isn't immediately visible.
The intended behaviour from this PR, and my understanding of the issue matches yours - any changes in the code extra to that reflect my unfamiliarity with the codebase.
Sounds great! In this case, I think we should plumb the new CLI option similar to --pass-with-no-tests
. Search for passWithNoTests
across the codebase for the reference. And this is a place that determines whether everything was successful or something has failed, and it needs to be tweaked.
@BJSS-russell-pollock I think this should be clear since reporter shows the "Flaky" count as non-zero. I'd assume anyone using this option will know that flakiness means "failed test run", because the behavior was opt in. Let me know what you think.
@BJSS-russell-pollock I think this should be clear since reporter shows the "Flaky" count as non-zero. I'd assume anyone using this option will know that flakiness means "failed test run", because the behavior was opt in. Let me know what you think.
Absolutely makes sense.
Sounds great! In this case, I think we should plumb the new CLI option similar to
--pass-with-no-tests
. Search forpassWithNoTests
across the codebase for the reference. And this is a place that determines whether everything was successful or something has failed, and it needs to be tweaked.
Excellent - I'll have a look and update it accordingly.
Test results for "tests 1"
1 flaky
:warning: [webkit-page] › page/workers.spec.ts:243:3 › should support offline27321 passed, 661 skipped :heavy_check_mark::heavy_check_mark::heavy_check_mark:
Merge workflow run.
Test results for "tests 1"
693 passed :heavy_check_mark::heavy_check_mark::heavy_check_mark:
Merge workflow run.
Test results for "tests 1"
27328 passed, 661 skipped :heavy_check_mark::heavy_check_mark::heavy_check_mark:
Merge workflow run.
Test results for "tests 1"
15 passed :heavy_check_mark::heavy_check_mark::heavy_check_mark:
Merge workflow run.
Okay, reworked as suggested - much less additional code!
Test results for "tests 1"
3 failed :x: [playwright-test] › playwright.ct-react.spec.ts:253:5 › should pass "key" attribute from JSX in variable :x: [installation tests] › playwright-electron-should-work.spec.ts:18:5 › electron should work :x: [installation tests] › playwright-test-package-managers.spec.ts:54:5 › npm: uninstalling ct removes playwright bin
27325 passed, 661 skipped :heavy_check_mark::heavy_check_mark::heavy_check_mark:
Merge workflow run.
Test results for "tests 1"
1 failed :x: [playwright-test] › runner.spec.ts:118:5 › should ignore subprocess creation error because of SIGINT
2 flaky
:warning: [playwright-test] › ui-mode-test-watch.spec.ts:184:5 › should watch new file:warning: [webkit-library] › library/browsercontext-clearcookies.spec.ts:52:3 › should remove cookies by name
27330 passed, 662 skipped :heavy_check_mark::heavy_check_mark::heavy_check_mark:
Merge workflow run.
Test results for "tests 1"
2 flaky
:warning: [chromium-library] › library/trace-viewer.spec.ts:908:1 › should display waitForLoadState even if did not wait for it:warning: [playwright-test] › ui-mode-test-watch.spec.ts:96:5 › should batch watch updates
27331 passed, 662 skipped :heavy_check_mark::heavy_check_mark::heavy_check_mark:
Merge workflow run.
@dgozman This should be ok for re-review when you're ready - the failed tests in previous pipeline runs look like noise to me, but happy to dig if you think it's deeper
Test results for "tests 1"
1 failed :x: [playwright-test] › playwright.ct-react.spec.ts:253:5 › should pass "key" attribute from JSX in variable
2 flaky
:warning: [playwright-test] › ui-mode-test-screencast.spec.ts:21:5 › should show screenshots:warning: [playwright-test] › ui-mode-test-watch.spec.ts:96:5 › should batch watch updates
27346 passed, 662 skipped :heavy_check_mark::heavy_check_mark::heavy_check_mark:
Merge workflow run.