[POC] Flaky test retries

Open smola opened this issue 11 months ago • 3 comments

What Does This Do

  • Use Develocity Test Retry (docs) to retry flaky tests in CI.
  • Only tests marked as @Flaky will be retried.
  • Failed runs remain visible in the logs and are preserved, which keeps intermediate runs observable in Datadog Test Optimization.
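
As a rough illustration of the setup described above, here is a minimal sketch in Gradle Kotlin DSL. It assumes the Develocity Gradle plugin is already applied (a version that exposes the `develocity.testRetry` extension on `Test` tasks). The retry counts and the `*Flaky` annotation pattern are illustrative assumptions, not the values used in this PR.

```kotlin
// build.gradle.kts (sketch, assuming the Develocity Gradle plugin is applied)
tasks.withType<Test>().configureEach {
    develocity.testRetry {
        // Assumed value: retry each failing test up to 3 times.
        maxRetries.set(3)
        // Assumed value: stop retrying once this many tests have failed in the
        // first round, since that likely signals a real breakage.
        maxFailures.set(10)
        // Do not fail the build when a test passes on retry.
        failOnPassedAfterRetry.set(false)
        filter {
            // Only retry test classes carrying an annotation whose name
            // matches this pattern (assumed to match the project's @Flaky).
            includeAnnotationClasses.add("*Flaky")
        }
    }
}
```

With `failOnPassedAfterRetry` set to false, a test that fails and then passes still has its failed attempt reported, which is what keeps the intermediate runs observable.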

Why?

This would bring us closer to CI pipelines always succeeding on the first run. Our past strategy of segregating flaky tests into a special CI job that never fails the pipeline has led to many tests regressing from flaky to broken.

While retrying can make some individual jobs slower, once we account for the time between pushing and manually retrying, the overall time spent in CI should be much lower.

Compared to the old @Retry approach, this plugin has the advantage of not hiding intermediate failed runs, so we should still be able to monitor flaky tests. In fact, given that flaky tests are always retried at least once, flaky test detection should be more accurate, both in CircleCI and in Datadog Test Optimization.

Caveats

This plugin has some limitations:

  • The @Flaky annotation only works at class level, not at method level (feature request).
  • It does not support advanced runtime conditions, such as filtering by JVM vendor or version, or limiting retrying to specific errors (feature request).

Contributor Checklist

Jira ticket: [PROJ-IDENT]

smola, Dec 18 '24 22:12