[Feature]: No option to keep all traces from a failing test

Open huehnerlady opened this issue 11 months ago • 1 comments

Version

1.50.0

Steps to reproduce

Create a Playwright config:

import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  // Look for test files in the "tests" directory, relative to this configuration file.
  testDir: 'e2e-tests',
  use: {
    // Base URL to use in actions like `await page.goto('/')`.
    baseURL: 'http://localhost:4200/',

    // Collect trace when retrying the failed test.
    // trace: 'on-first-retry',

    // Record a trace for every run and keep it only when the run fails. Enable this if there is a flaky test.
    // Tracing every run makes the whole suite slower, so it should only be enabled temporarily.
    trace: 'retain-on-failure',
  },
});

Inspect the trace property.

Expected behavior

I expect a setting that keeps all traces of a test as soon as one of its runs fails.

Actual behavior

I have only these options:

'off': Do not record trace.
'on': Record trace for each test.
'on-first-retry': Record trace only when retrying a test for the first time.
'on-all-retries': Record trace only when retrying a test.
'retain-on-failure': Record trace for each test. When test run passes, remove the recorded trace.
'retain-on-first-failure': Record trace for the first run of each test, but not for retries. When test run passes, remove the recorded trace.

This means you can keep ALL traces, or skip the first run and keep only the retries, or keep only the failing runs. There is no option that keeps the passing runs of a test only when another run of the same test failed.
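To make the gap concrete, here is a minimal sketch that models the six options as they are described in the list above. It is a simplified model written for illustration, not Playwright's implementation; the names `retainedAttempts`, `keepsTrace`, and `AttemptStatus` are hypothetical.

```typescript
// Simplified model of the trace modes as described in the list above.
// This is NOT Playwright's implementation; all names here are hypothetical.
type TraceMode =
  | 'off'
  | 'on'
  | 'on-first-retry'
  | 'on-all-retries'
  | 'retain-on-failure'
  | 'retain-on-first-failure';

type AttemptStatus = 'passed' | 'failed';

// Decides whether the trace of one attempt survives.
// index 0 is the first run, index 1 the first retry, and so on.
function keepsTrace(mode: TraceMode, status: AttemptStatus, index: number): boolean {
  switch (mode) {
    case 'off':
      return false;
    case 'on':
      return true;
    case 'on-first-retry':
      return index === 1;
    case 'on-all-retries':
      return index >= 1;
    case 'retain-on-failure':
      return status === 'failed';
    case 'retain-on-first-failure':
      return index === 0 && status === 'failed';
  }
}

// Returns the indices of the attempts whose trace is still present afterwards.
function retainedAttempts(mode: TraceMode, attempts: AttemptStatus[]): number[] {
  const kept: number[] = [];
  attempts.forEach((status, index) => {
    if (keepsTrace(mode, status, index)) kept.push(index);
  });
  return kept;
}

// A flaky test: the first run fails, the first retry passes.
const flaky: AttemptStatus[] = ['failed', 'passed'];
const modes: TraceMode[] = [
  'off',
  'on',
  'on-first-retry',
  'on-all-retries',
  'retain-on-failure',
  'retain-on-first-failure',
];
for (const mode of modes) {
  console.log(mode, retainedAttempts(mode, flaky));
}
```

In this model, only 'on' keeps both the failing run and the passing retry of the flaky test, and it does so for every other test as well.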

Additional context

Hi,

In our Playwright config we have the setting trace: 'on-first-retry'. This means we only trace the first retry, not the initial run, but it gives us the best balance between performance and error tracing. When we encounter a flaky test, we change it to trace: 'retain-on-failure'.

Our problem with this setting is that we seem to only keep the traces from the tries that fail. So when I have a flaky test and the first try fails but the second try passes, it only retains the first try, but not the second one. For debugging purposes we would like to have a setting that retains ALL traces of a test as soon as one run fails.
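The requested behavior can be written down as a small retention rule. The sketch below is only an illustration of that rule: the `Attempt` shape and the functions `tracesToKeep` and `tracesToDelete` are hypothetical and are not part of Playwright.

```typescript
// Hypothetical shape of one test attempt; not a Playwright type.
interface Attempt {
  retry: number; // 0 for the first run, 1 for the first retry, ...
  status: 'passed' | 'failed';
  tracePath: string; // where this attempt's trace was written
}

// Requested rule: as soon as one attempt failed, keep the traces of ALL attempts.
// If every attempt passed, keep nothing.
function tracesToKeep(attempts: Attempt[]): string[] {
  const anyFailed = attempts.some((attempt) => attempt.status === 'failed');
  return anyFailed ? attempts.map((attempt) => attempt.tracePath) : [];
}

// The complement: traces that could be removed after the test has finished.
function tracesToDelete(attempts: Attempt[]): string[] {
  const keep = new Set(tracesToKeep(attempts));
  return attempts
    .map((attempt) => attempt.tracePath)
    .filter((path) => !keep.has(path));
}

// Example: a flaky test (fails, then passes) and a test that passes right away.
const flakyTest: Attempt[] = [
  { retry: 0, status: 'failed', tracePath: 'flaky/trace.zip' },
  { retry: 1, status: 'passed', tracePath: 'flaky-retry1/trace.zip' },
];
const stableTest: Attempt[] = [
  { retry: 0, status: 'passed', tracePath: 'stable/trace.zip' },
];

console.log(tracesToKeep(flakyTest)); // both traces of the flaky test
console.log(tracesToDelete(stableTest)); // the single trace of the passing test
```

A rule like this could in principle be applied as a cleanup step after a run with trace: 'on', but that is an assumption and the wiring is not shown here. Such a cleanup would reduce what is stored, not the cost of recording a trace for every run.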

To me this looks like a bug, because it leaves a gap in the test config.

Environment

System:
    OS: macOS 15.3
    CPU: (12) arm64 Apple M2 Max
    Memory: 566.09 MB / 32.00 GB
  Binaries:
    Node: 20.14.0 - ~/.nvm/versions/node/v20.14.0/bin/node
    Yarn: 1.22.19 - /opt/homebrew/bin/yarn
    npm: 10.8.2 - ~/.nvm/versions/node/v20.14.0/bin/npm
  IDEs:
    VSCode: 1.96.4 - /usr/local/bin/code
  Languages:
    Bash: 3.2.57 - /bin/bash
  npmPackages:
    @playwright/test: ^1.50.0 => 1.50.0 
    playwright-ctrf-json-reporter: ^0.0.18 => 0.0.18

huehnerlady, Feb 03 '25 06:02

I was pretty surprised by this omission as well.

Alternatively, I would at least like a combination of "retain-on-first-failure" and "retain-on-retry-success" (the latter of which doesn't exist either!). So if a test is configured to retry, I can get the traces of initial failure and the success (if it happens). That way I can compare what differed between the failure and when it succeeded.

The OP request of just retaining all if there was a failure is simpler, but the above would cover my need as well.

nicolas-martin-dte, May 13 '25 20:05

Adding my vote for this one. Having a retain-all-flaky option to save both the failing and passing runs against the same infrastructure is extremely helpful when trying to improve the stability of our tests. For example, we've encountered tests that "should" have been independent but ended up having a hidden dependency, causing it to pass where it shouldn't have; having both runs to compare makes debugging much quicker.

We tried to retain all traces, but the majority of our tests pass just fine and the volume of data generated was too large to be manageable.

Dan-DeAraujo, Jul 28 '25 19:07

Voting for this one as well. A test that only sometimes fails is hard enough to debug. Without all of the traces from such a run, you lack the information needed to investigate.

patrickmullen-b2d, Oct 23 '25 21:10