Feature request: loop test run until failure for watch mode
Clear and concise description of the problem
Sometimes we have a flaky test in the suite. We think we've made a change that could fix the flaky spec, but how do we verify this?
You need to run the test something like 100 times in a row. With the current watch mode, that would entail manually hitting the enter key 100 times.
Suggested solution
Add a "run until failure" mode to the watch usage screen:

Then you could just enable this mode, grab a :coffee:, and come back in a few minutes, refreshed, to see whether the flakiness is fixed or not.
Alternative
Manually hitting the enter key 100 times.
Additional context
No response
Validations
- [X] Follow our Code of Conduct
- [X] Read the Contributing Guidelines.
- [X] Read the docs.
- [X] Check that there isn't already an issue that requests the same feature to avoid creating a duplicate.
@sheremet-va if I were to open a PR for this, would it get considered or is this feature something vitest does not even want to include?
Feel free to add this feature 👍🏻
@sheremet-va I could work on this if it is still desired. To make sure I fully understand: basically, we want to implement "rerun all tests", but n times, right? So it could be a new option before the quit one, named "n", which, when pressed, prompts the user for a number: the number of times the tests will be rerun.
UPDATE: Should this apply to all tests or only to failed tests?
I would prefer to also have an option to keep rerunning until the first failure. Specifying a number of reruns is nice to have, but I often don't know how many reruns will be needed.
Hmm... so you are suggesting that pressing the "n" option keeps running the tests n times or until a failure? That would be the behavior of my suggested option, not a separate one. Are we on the same page, @capaj?
Almost on the same page. I guess I can just pass an arbitrarily high number, like 10000, to make sure it runs long enough.
I guess what you are looking for is a different option, say "i", that runs forever or until failure. But it has to be monitored, because there is a chance the tests will never fail and the run will keep consuming resources forever.
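That risk can be bounded with a cap on the number of reruns, so an always-green suite eventually stops on its own. A minimal shell sketch (the cap and the test command are placeholders, not anything Vitest provides):

```shell
#!/bin/bash
# Sketch: rerun a test command until it fails, but stop after a fixed cap
# so an always-green suite doesn't run forever. Usage:
#   rerun_until_failure <max_runs> <command...>
rerun_until_failure() {
  local max_runs=$1
  shift
  local i
  for i in $(seq 1 "$max_runs"); do
    if ! "$@"; then
      echo "Failed on run $i"
      return 1
    fi
  done
  echo "All $max_runs runs passed"
}
```

For example: `rerun_until_failure 100 npm run test-e2e`.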
There is a PR for "repeat" mode: https://github.com/vitest-dev/vitest/pull/2652
Maybe once it's merged you can build your solution on top of it.
@sheremet-va I would love that. I'll subscribe so I'm notified when it is merged; also, feel free to ping me to work on this once the repeat-mode PR is merged, because I might forget. Meanwhile, got anything else for me to work on?
I would also use this functionality for fixing flaky tests.
The Ginkgo test framework for Go has a --until-it-fails flag, which reruns the suite until a test fails.
I use this bash script on Ubuntu (WSL2) to run Cypress tests until failure; the same approach works for Vitest:
#!/bin/bash
# use this script to run tests until failure
while true; do
  # replace --spec with your desired spec
  npm run test-e2e -- --spec="cypress/e2e/auth/login.spec.js"
  # or, to run all tests, use:
  # npm run test-e2e
  if [ $? -ne 0 ]; then
    # a test failed: report it and stop
    echo "Error found!"
    exit 1
  fi
  # test didn't fail - rerun
done
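For Vitest the same idea works with `vitest run`, which executes the suite once without watch mode. A generic sketch (the test path in the usage comment is a placeholder):

```shell
#!/bin/bash
# Sketch: keep rerunning a command until it fails, counting the runs.
# For Vitest, pass e.g.: run_until_failure npx vitest run tests/flaky.test.ts
run_until_failure() {
  local runs=0
  while "$@"; do
    runs=$((runs + 1))
    echo "Run $runs passed, rerunning..."
  done
  echo "Failed on run $((runs + 1))"
}
```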
Is there any update on this one? This would be a really useful feature to have within Vitest and its UI, instead of relying on ad-hoc scripts. Thanks
I've been dealing a lot with flaky tests; here are some things to consider:
- To verify a test is fixed, I have a "run until failure" script, which reruns the test until it fails
- To verify whether flakiness is reduced or increased, I have a "failure rate" script, which continuously reruns the same test, tracking its failure rate
- To find the most flaky tests, I have a "failure rates" script, which continuously runs all tests and shows the failure rate for each
- It's also fairly important to keep logs for all runs, but especially failures, so they can be analysed later
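The last point can be sketched as a small wrapper that logs every run and sets failing logs aside; the directory names and log-file naming are my own placeholders, not part of the scripts in this thread:

```shell
#!/bin/bash
# Sketch: run a test command, write its output to a per-run log file, and
# keep a copy of failing logs in failures/ for later analysis.
run_and_log() {
  mkdir -p logs failures
  local log="logs/run-$(date +%Y%m%d-%H%M%S)-$$.log"
  if "$@" > "$log" 2>&1; then
    return 0
  fi
  cp "$log" failures/  # failing runs are the interesting ones to keep
  return 1
}
```

For example: `run_and_log npm run test-e2e || echo "see failures/ for the log"`.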
Here's my "failure rate" bash script:
#!/bin/bash
# use this script to figure out the failure rate of a test
# how many times the tests have been run
count=0
# how many of those runs failed
failures=0
while true; do
  # replace --spec with your desired spec
  # show output:
  # npm run test-e2e -- --spec="cypress/e2e/some.spec.js"
  # hide output:
  # npm run test-e2e -- --spec="cypress/e2e/some.spec.js" > /dev/null 2>&1
  # append output to log.txt:
  npm run test-e2e -- --spec="cypress/e2e/compliance/some.spec.js" >> cypress/logs/log.txt 2>&1
  # run multiple tests:
  # npm run test-e2e -- --spec="cypress/e2e/some1.spec.js" > /dev/null 2>&1 && \
  # npm run test-e2e -- --spec="cypress/e2e/some2.spec.js" > /dev/null 2>&1 && \
  # npm run test-e2e -- --spec="cypress/e2e/some3.spec.js" > /dev/null 2>&1 && \
  # npm run test-e2e -- --spec="cypress/e2e/some4.spec.js" > /dev/null 2>&1 && \
  # npm run test-e2e -- --spec="cypress/e2e/some5.spec.js" > /dev/null 2>&1 && \
  # npm run test-e2e -- --spec="cypress/e2e/some6.spec.js" > /dev/null 2>&1
  # or, to run all tests, use:
  # npm run test-e2e
  if [ $? -ne 0 ]; then
    # test failed, increment the failure count
    failures=$((failures + 1))
  fi
  # increment the total run count
  count=$((count + 1))
  # print the failure rate, rounded to the nearest percent
  echo "Failure rate: $failures/$count = $(awk "BEGIN { printf \"%.0f\", $failures * 100 / $count }")%"
done
And here's my "failure rates" CJS script; it's written for Cypress but could be adapted for Vitest:
const { getSpecs } = require('find-cypress-specs');
const { spawn } = require('child_process');
const fs = require('fs');

const specs = getSpecs();
const specStats = new Map(specs.map((spec) => [spec, []]));

fs.writeFileSync('cypress/logs/log.txt', `\nNew run, at ${new Date().toISOString()}\n`, { flag: 'a' });

const runSpec = (spec) => {
  return new Promise((resolve) => {
    const testProcess = spawn('npm', ['run', 'test-e2e', '--', `--spec=${spec}`]);
    // For debugging on Windows:
    // const testProcess = spawn('cmd', ['/c', 'npm run test-e2e -- --spec=' + spec]);
    testProcess.stdout.on('data', (message) => fs.writeFileSync('cypress/logs/log.txt', message, { flag: 'a' }));
    testProcess.on('exit', (code) => {
      if (code === 0) {
        specStats.get(spec).push('pass');
      } else {
        specStats.get(spec).push('fail');
      }
      resolve();
    });
  });
};

const renderResults = () => {
  console.log('Results:');
  const normalizeName = (name) => name.replace('cypress/e2e/', '').replace('.spec.js', '');
  const longestSpecName = Math.max(...specs.map((spec) => normalizeName(spec).length));
  // sort a copy by number of failures (sorting `specs` in place would
  // reorder the array while the run loop is still iterating over it)
  const sortedSpecs = [...specs].sort((a, b) => {
    const aFailures = specStats.get(a).filter((stat) => stat === 'fail').length;
    const bFailures = specStats.get(b).filter((stat) => stat === 'fail').length;
    return bFailures - aFailures;
  });
  for (const spec of sortedSpecs) {
    const stats = specStats.get(spec);
    const name = normalizeName(spec);
    const padding = ' '.repeat(longestSpecName - name.length);
    const passFail = stats.map((stat) => (stat === 'pass' ? '✅' : '❌')).join('');
    const failureRate = Math.round((stats.filter((stat) => stat === 'fail').length / (stats.length || 1)) * 100);
    console.log(`${name}:${padding} [${String(failureRate).padStart(3, ' ')}%] ${passFail}`);
  }
};

(async function runSpecs() {
  for (const spec of specs) {
    console.log(`Running spec: ${spec}...`);
    await runSpec(spec);
    // console.log(specStats); // uncomment to inspect the raw stats
    renderResults();
  }
  // restart to keep collecting stats until the process is stopped manually
  runSpecs();
})();
Hope this helps someone :)
Hey guys, this could be a great feature now that we have browser mode, for detecting flaky tests inside a Playwright environment :)