
Feature request: loop test run until failure for watch mode

Open capaj opened this issue 3 years ago • 11 comments

Clear and concise description of the problem

Sometimes we have a flaky test in the suite. We think we've made a change that could fix the flaky spec, but how do we verify this? You need to run the test something like 100x in a row. With the current watch mode, that would entail hitting the enter key manually 100x.

Suggested solution

add a "run until failure" mode to the watch usage screen (screenshot attached)

then you can just enable this mode, grab a :coffee:, and come back in a few minutes refreshed to see whether the flakiness is fixed or not

Alternative

manually hitting the enter key 100x

Additional context

No response

capaj avatar Sep 14 '22 12:09 capaj

@sheremet-va if I were to open a PR for this, would it get considered or is this feature something vitest does not even want to include?

capaj avatar Jan 05 '23 10:01 capaj

> @sheremet-va if I were to open a PR for this, would it get considered or is this feature something vitest does not even want to include?

Feel free to add this feature 👍🏻

sheremet-va avatar Jan 05 '23 16:01 sheremet-va

@sheremet-va I could work on this if it is still desired. To make sure I fully understand: basically, we want to implement "rerun all tests", but n times, right? So it could be a new option before the quit one, named "n", which when pressed prompts the user to enter a number; the tests will then be rerun that many times.

UPDATE: Should this be for all tests or failed tests only?

obadakhalili avatar Mar 06 '23 16:03 obadakhalili

I would also prefer an option to keep rerunning until the first failure. Specifying a number of reruns is nice to have, but I often don't know how many reruns will be needed.

capaj avatar Mar 06 '23 17:03 capaj

Hmm ... So you are suggesting that pressing the option "n" keeps running n times or until failure, whichever comes first? That should be the behavior of my suggested option, not a separate option. Are we on the same page @capaj?

obadakhalili avatar Mar 06 '23 19:03 obadakhalili

Almost on the same page. I guess I can just pass an arbitrarily high number, like 10000, to make sure it runs long enough.

capaj avatar Mar 06 '23 19:03 capaj

I guess what you are looking for is a different option, say "i", that runs forever or until failure. But it has to be monitored, because there is a chance the tests will never fail and the run would consume resources forever.
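A minimal sketch of the semantics being discussed (rerun up to n times, stopping at the first failure). The test run here is simulated with a counter so the loop is self-contained; in practice you would replace the simulated check with your real test command, e.g. `npx vitest run path/to/flaky.test.ts`:

```shell
max_runs=10   # the "n" the user would enter
fail_at=4     # the run on which our simulated test fails
run=0
status=passed
while [ "$run" -lt "$max_runs" ]; do
  run=$((run + 1))
  # simulated test: succeeds until run $fail_at, then fails
  if [ "$run" -eq "$fail_at" ]; then
    status=failed
    break
  fi
done
echo "Stopped after run $run of $max_runs ($status)."
```

Dropping the `-lt "$max_runs"` bound gives the "run forever until failure" variant, which is why monitoring it matters.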

obadakhalili avatar Mar 07 '23 03:03 obadakhalili

There is a PR for "repeat" mode: https://github.com/vitest-dev/vitest/pull/2652

Maybe once it's merged you can build your solution on top of it.

sheremet-va avatar Mar 07 '23 08:03 sheremet-va

@sheremet-va I would love that. I'll subscribe so I'm notified when it's merged, and feel free to ping me to work on this once the repeat mode PR lands, because I might forget. Meanwhile, got anything for me to work on?

obadakhalili avatar Mar 08 '23 18:03 obadakhalili

I would also use this functionality for fixing flaky tests.

The Ginkgo test framework for Golang has a flag called --until-it-fails which runs forever until a test fails.

jacknewberry avatar Mar 22 '23 13:03 jacknewberry

I use this bash script on Ubuntu WSL2 to run Cypress tests until failure; the same approach works for vitest:

#!/bin/bash

# use this script to run tests until failure

# start testing until failure
while true
do
  # replace --spec with your desired spec
  npm run test-e2e -- --spec="cypress/e2e/auth/login.spec.js"

  # or to run all tests, use:
  # npm run test-e2e

  if [ $? -ne 0 ]; then
    echo "Error found!"
    # test failed, exit process, stop testing
    exit 1
  fi
  # test didn't fail - rerun the test again
done

Maxim-Mazurok avatar Aug 29 '23 00:08 Maxim-Mazurok

Is there any update on this one? This would be a really useful feature to have within Vitest and the UI, instead of relying on some sort of script. Thanks

JORDAAAN1495 avatar Mar 17 '25 18:03 JORDAAAN1495

I've been dealing a lot with flaky tests, here are some things to consider:

  • To verify whether a test is fixed, I have a "run until failure" script, which reruns the test until it fails
  • To verify whether flakiness has been reduced or increased, I have a "failure rate" script, which continuously reruns the same test, tracking its failure rate
  • To find the most flaky tests, I have a "failure rates" script, which continuously runs all tests and shows the failure rate for each
  • It's also fairly important to keep logs for all runs, especially failures, so they can be analysed later on

Here's my "failure rate" bash script:

#!/bin/bash

# use this script to figure out failure rate of a test

# keep count of how many times the tests have been run
count=0

# keep count of failures
failures=0

# keep rerunning forever, tracking the failure rate
while true
do
    # replace --spec with your desired spec

    # show output:
    # npm run test-e2e -- --spec="cypress/e2e/some.spec.js"

    # hide output:
    # npm run test-e2e -- --spec="cypress/e2e/some.spec.js" > /dev/null 2>&1

    # append output to log.txt:
    npm run test-e2e -- --spec="cypress/e2e/compliance/some.spec.js" >> cypress/logs/log.txt 2>&1

    # run multiple tests:
    # npm run test-e2e -- --spec="cypress/e2e/some1.spec.js" > /dev/null 2>&1 && \
    # npm run test-e2e -- --spec="cypress/e2e/some2.spec.js" > /dev/null 2>&1 && \
    # npm run test-e2e -- --spec="cypress/e2e/some3.spec.js" > /dev/null 2>&1 && \
    # npm run test-e2e -- --spec="cypress/e2e/some4.spec.js" > /dev/null 2>&1 && \
    # npm run test-e2e -- --spec="cypress/e2e/some5.spec.js" > /dev/null 2>&1 && \
    # npm run test-e2e -- --spec="cypress/e2e/some6.spec.js" > /dev/null 2>&1

    # or to run all tests, use:
    # npm run test-e2e

    if [ $? -ne 0 ]; then
        # test failed, increment the failure count
        failures=$((failures+1))
    fi

    # increment the count
    count=$((count+1))

    # print failure rate
    echo "Failure rate: $failures/$count = $(echo "scale=2; $failures * 100 / $count" | bc | awk '{printf "%d\n", ($0 < 0 ? $0-0.5 : $0+0.5)}')%"
done

And here's my "failure rates" CJS script; it's for Cypress, but could be adapted for vitest:

const { getSpecs } = require('find-cypress-specs');
const { spawn } = require('child_process');
const fs = require('fs');

const specs = getSpecs();

const specStats = new Map(specs.map((spec) => [spec, []]));

fs.writeFileSync('cypress/logs/log.txt', `\nNew run, at ${new Date().toISOString()}\n`, { flag: 'a' });

const runSpec = (spec) => {
    return new Promise((resolve) => {
        const testProcess = spawn('npm', ['run', 'test-e2e', '--', `--spec=${spec}`]);
        // For debugging on Windows:
        // const testProcess = spawn('cmd', ['/c', 'npm run test-e2e -- --spec=' + spec]);

        testProcess.stdout.on('data', (message) => fs.writeFileSync('cypress/logs/log.txt', message, { flag: 'a' }));
        // also capture stderr, so failure output ends up in the log too
        testProcess.stderr.on('data', (message) => fs.writeFileSync('cypress/logs/log.txt', message, { flag: 'a' }));
        testProcess.on('exit', (code) => {
            if (code === 0) {
                specStats.get(spec).push('pass');
            } else {
                specStats.get(spec).push('fail');
            }
            resolve();
        });
    });
};

const renderResults = () => {
    console.log('Results:');
    const normalizeName = (name) => name.replace('cypress/e2e/', '').replace('.spec.js', '');
    const longestSpecName = Math.max(...specs.map((spec) => normalizeName(spec).length));
    // sort by number of failures
    const sortedSpecs = specs.sort((a, b) => {
        const aFailures = specStats.get(a).filter((stat) => stat === 'fail').length;
        const bFailures = specStats.get(b).filter((stat) => stat === 'fail').length;
        return bFailures - aFailures;
    });
    for (const spec of sortedSpecs) {
        const stats = specStats.get(spec);
        const name = normalizeName(spec);
        const padding = ' '.repeat(longestSpecName - name.length);
        const passFail = stats.map((stat) => (stat === 'pass' ? '✅' : '❌')).join('');
        const failureRate = Math.round((stats.filter((stat) => stat === 'fail').length / (stats.length || 1)) * 100);
        console.log(`${name}:${padding} [${String(failureRate).padStart(3, ' ')}%] ${passFail}`);
    }
};

(async function runSpecs() {
    for (const spec of specs) {
        console.log(`Running spec: ${spec}...`);
        await runSpec(spec);
        console.log(specStats);
        renderResults();
    }
    runSpecs(); // loop forever, accumulating stats across passes
})();

Hope this helps someone :)

Maxim-Mazurok avatar Mar 18 '25 04:03 Maxim-Mazurok

Hey guys, this could be a great feature now that we have browser mode, to detect flaky tests inside a Playwright env :)

shfx avatar Nov 20 '25 18:11 shfx