optimize: Introduce automated flaky test tracking like OpenSearch

Open OmCheeLin opened this issue 5 months ago • 16 comments

  • [x] I have registered the PR changes.

Ⅰ. Describe what this PR did

  1. Automatically trigger detect-flaky-test.yml after the "build" fails and the "Rerun build" succeeds.

  2. Download the test reports from the first and second builds:

    • First build report: run-1-surefire-reports-${{ matrix.java }}
    • Second build report: run-2-surefire-reports-${{ matrix.java }}
  3. Run the Python script parse_failed_tests.py:

    • Compare the test reports from the first and second builds.
    • Identify tests that failed in the first run but passed in the second (i.e., flaky tests).
    • Output the results as a JSON list and pass them to the next steps.
  4. If flaky tests are found, automatically create an issue listing the unstable test names (format: ClassName.testMethod).
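
The comparison in step 3 could look roughly like the sketch below. It is not the actual parse_failed_tests.py from this PR; it assumes the downloaded artifacts contain standard Surefire `TEST-*.xml` files, and the function names `collect_results` and `find_flaky` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def collect_results(report_dir):
    """Map 'ClassName.testMethod' to True if the test failed or errored.

    Assumes Surefire's usual layout: one TEST-*.xml file per test class,
    with <testcase> elements carrying optional <failure>/<error>/<skipped>.
    """
    results = {}
    for xml_file in Path(report_dir).glob("TEST-*.xml"):
        root = ET.parse(xml_file).getroot()
        for case in root.iter("testcase"):
            if case.find("skipped") is not None:
                continue  # skipped tests say nothing about flakiness
            name = f"{case.get('classname')}.{case.get('name')}"
            failed = (case.find("failure") is not None
                      or case.find("error") is not None)
            # If a test appears more than once, one failure marks it failed.
            results[name] = failed or results.get(name, False)
    return results


def find_flaky(first_dir, second_dir):
    """Return tests that failed in the first run and passed in the second."""
    first = collect_results(first_dir)
    second = collect_results(second_dir)
    return sorted(
        name for name, failed in first.items()
        if failed and second.get(name) is False
    )
```

Calling `find_flaky` on the two unpacked artifact directories and passing the result to `json.dumps` would give the JSON list mentioned in step 3. A test that is missing from the second report is deliberately not counted as flaky, since there is no evidence that it passed.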

Ⅱ. Does this pull request fix one issue?

fixes #7448

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

OmCheeLin avatar Jul 17 '25 04:07 OmCheeLin

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 60.65%. Comparing base (d78267a) to head (3eb316c). :warning: Report is 8 commits behind head on 2.x.

Additional details and impacted files
@@             Coverage Diff              @@
##                2.x    #7545      +/-   ##
============================================
+ Coverage     60.63%   60.65%   +0.01%     
  Complexity      658      658              
============================================
  Files          1308     1308              
  Lines         49446    49446              
  Branches       5811     5811              
============================================
+ Hits          29983    29992       +9     
+ Misses        16801    16796       -5     
+ Partials       2662     2658       -4     

see 5 files with indirect coverage changes

codecov[bot] avatar Jul 17 '25 04:07 codecov[bot]

@OmCheeLin

Could you please explain how it works? Also, it would be great if you could show an example using your own forked repository. I'll give you feedback after reviewing the example.

YongGoose avatar Jul 17 '25 06:07 YongGoose

> @OmCheeLin
>
> Could you please explain how it works? Also, it would be great if you could show an example using your own forked repository. I'll give you feedback after reviewing the example.

OK, I will do it later. This PR is temporarily closed.

OmCheeLin avatar Jul 17 '25 07:07 OmCheeLin

  1. To speed up the CI test process, I made a slight modification to build.yml so that it only runs tests under seata-common.
  2. I added a FlakyTest under seata-common, which will fail on the first build and succeed on the second.
  3. Note: I modified the workflow directly on the 2.x branch, because if it's changed on other branches, some workflow files won't use the latest version during actual CI runs.
  4. On the Actions page, after the build runs twice, detect-flaky-test will be triggered automatically.

@YongGoose This is my forked repo; see the 2.x branch: https://github.com/OmCheeLin/incubator-seata

OmCheeLin avatar Jul 18 '25 03:07 OmCheeLin

click here to see file changes @YongGoose

OmCheeLin avatar Jul 18 '25 03:07 OmCheeLin

@OmCheeLin

It would be great if we could see a bit more information in the issue.

  • https://github.com/opensearch-project/OpenSearch/issues/14308

YongGoose avatar Jul 18 '25 04:07 YongGoose

In a workflow, what types of runs are retried when they fail? Also, does the workflow automatically retry if it fails?

Additionally, I think it would be nice to have a label for the issue. Would you be able to suggest a name for the label? I’ll take care of creating it myself.

YongGoose avatar Jul 18 '25 04:07 YongGoose

Also, would it be possible to share this PR on DingTalk? I believe this feature could be very useful, so it would be great to get feedback from more developers.

YongGoose avatar Jul 18 '25 04:07 YongGoose

> @OmCheeLin
>
> It would be great if we could see a bit more information in the issue.

Given the flaky tests, how can we find which PRs they occurred in? By using a web crawler?

OmCheeLin avatar Jul 18 '25 05:07 OmCheeLin

> Given the flaky tests, how can we find which PRs they occurred in? By using a web crawler?

Instead of the PR, the URL of the action where the issue occurred would also be fine. Would you be able to check what kind of information can be retrieved when creating an issue through github actions?
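
As a rough idea of what is available: if the detection workflow is triggered by a `workflow_run` event (an assumption about how detect-flaky-test.yml is wired), the event payload that GitHub Actions writes to the file named by `GITHUB_EVENT_PATH` includes the triggering run's `html_url`, `head_sha`, and a `pull_requests` list. The sketch below builds an issue body from those fields; `build_issue_body` and `load_event` are hypothetical helper names, not part of this PR.

```python
import json
import os


def load_event():
    """Read the event payload that GitHub Actions stores on the runner."""
    path = os.environ.get("GITHUB_EVENT_PATH")
    if not path:
        return {}  # not running inside GitHub Actions
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def build_issue_body(flaky_tests, event):
    """Format the flaky test names plus links back to the triggering run."""
    run = event.get("workflow_run", {})
    lines = ["Flaky tests detected:", ""]
    lines += [f"- {name}" for name in flaky_tests]
    lines += [
        "",
        f"Run: {run.get('html_url', 'unknown')}",
        f"Commit: {run.get('head_sha', 'unknown')}",
    ]
    # Assumption to verify: this list may be empty for runs from forks.
    for pr in run.get("pull_requests", []):
        lines.append(f"PR: #{pr['number']}")
    return "\n".join(lines)
```

The string returned by `build_issue_body(flaky, load_event())` could then be handed to whichever step creates the issue, so no crawling would be needed to link back to the run.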

YongGoose avatar Jul 18 '25 10:07 YongGoose

> > Given the flaky tests, how can we find which PRs they occurred in? By using a web crawler?
>
> Instead of the PR, the URL of the action where the issue occurred would also be fine. Would you be able to check what kind of information can be retrieved when creating an issue through github actions?

It parses the surefire-reports.xml file and currently extracts only class names.

OmCheeLin avatar Jul 21 '25 01:07 OmCheeLin

@OmCheeLin

To start with, it would be great if we could just output the class names. We can consider upgrading the information provided through a separate PR later on.

For a smoother review process, it would also be helpful if you could clean up the code and resolve CI failures.

YongGoose avatar Jul 21 '25 12:07 YongGoose

@YongGoose cc

OmCheeLin avatar Jul 23 '25 05:07 OmCheeLin

I only changed changes.md, but the CI failed, even though the previous commit passed. Are there flaky tests?

OmCheeLin avatar Jul 23 '25 05:07 OmCheeLin

> I only changed changes.md, but the CI failed, even though the previous commit passed.
>
> Are there flaky tests?

I reran the tests. Let's see.

YongGoose avatar Jul 23 '25 08:07 YongGoose

@OmCheeLin

I’d appreciate it if you could create some sub-issues outlining the planned next steps after the PR gets merged.

YongGoose avatar Jul 31 '25 00:07 YongGoose