
Fail GitHub CI pipeline on test failure

Open WorldofJARcraft opened this pull request 10 months ago • 6 comments

On a local clone of the repository, I ran the `dv-riscv-arch-test.sh` script and hit a test failure for the `riscv-test-suite/rv64i_m/B/src/zext.h_64-01.S` test in Verilator. I first assumed something was set up incorrectly on my end, but then I checked the CI logs and found that the same test also fails in the latest execute-riscv-tests (dv-riscv-arch-test, veri-testharness) run. However, the test script always exits with code 0, which is why the pipeline succeeds anyway. This pull request makes the pipeline fail when tests error out.
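For context: GitHub Actions marks a job as failed as soon as a step exits with a non-zero status, so it is enough for the regression script to propagate the failure. Below is a minimal sketch of how a failure can get masked and one way to propagate it; the command names are placeholders, not the actual contents of `dv-riscv-arch-test.sh`.

```bash
# Sketch only -- run_all_tests and $CI_LOGDIR are placeholders.
# If the last command in the script always succeeds (e.g. copying logs),
# the script exits 0 even though a test failed earlier, and the CI job stays green:
run_all_tests                 # may return non-zero ...
cp results.log "$CI_LOGDIR"   # ... but this succeeds, so the script's exit code is 0

# One way to propagate the failure: capture the status and return it at the end.
run_all_tests
status=$?
cp results.log "$CI_LOGDIR"
exit "$status"
```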

WorldofJARcraft avatar Apr 09 '24 08:04 WorldofJARcraft

An issue for the underlying bug already exists: https://github.com/openhwgroup/cva6/issues/1758

WorldofJARcraft avatar Apr 09 '24 08:04 WorldofJARcraft

:heavy_check_mark: successful run, report available here.

github-actions[bot] avatar Apr 09 '24 09:04 github-actions[bot]

Hello @WorldofJARcraft

Thanks for this contribution

  1. On GitLab we generate a YAML report to build HTML pages with the status. On GitHub I don't know what we do. Cc @MarioOpenHWGroup
  2. Here you want to make the script fail so that the job fails. We have to check the impact on GitLab CI jobs too, to avoid a systematic 0/0 with no details when one test fails. A change in `.gitlab-ci.yml` could be needed. Cc @valentinThomazic
  3. Is it relevant to run all tests even if the first one failed ("no fail fast" mode)? What do you all think about `set -e` to stop after the first failure ("fail fast")? It would add a much smaller diff to this PR (only one line added in each file). See the sketch right after this list.
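For reference, the two options would look roughly like this in a regression script. This is a sketch only; `run_one_test` and the `TESTS` list are placeholders, not the actual contents of the `verif/regress` scripts.

```bash
#!/bin/bash
# Option A -- fail fast: one added line near the top of each script.
# The first failing command aborts the script with a non-zero status,
# which in turn fails the CI job.
#set -e

# Option B -- no fail fast: run every test, count the failures, and
# only turn them into a non-zero exit code at the very end.
failures=0
for t in "${TESTS[@]}"; do                            # TESTS is a placeholder list
    run_one_test "$t" || failures=$((failures + 1))   # run_one_test is a placeholder
done
echo "${failures} test(s) failed"
exit "$(( failures > 0 ))"                            # 0 if all passed, 1 otherwise
```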

cathales avatar Apr 09 '24 09:04 cathales

Thanks for your reply! I think `set -e` as you suggested would work. All that matters is that a non-zero exit code is returned to the CI when one or more tests fail. As for whether the remaining tests should still be executed after a test fails, I am not sure what the best course of action is, so I will leave that decision to the maintainers.

WorldofJARcraft avatar Apr 09 '24 10:04 WorldofJARcraft

Hello @WorldofJARcraft

After an internal discussion, we concluded that we need the no-fail-fast mode.

The `verif/regress/*.sh` scripts are meant to be run locally too, so it might be better not to change their behavior. We could have a script that collects the simulation results, just like we do on GitLab, except it would set the exit code instead of generating a YAML report. One way to do this would be to update the existing script so that it detects the Dashboard-related environment variables:

  • If they are provided (GitLab), generate a YAML report from these values (current behavior).
  • If they are not provided (GitHub), set the exit code based on the test results.

So two changes would be made (a rough sketch follows the list):

  1. Update the Python script as described above.
  2. Run this Python script after the tests in the GitHub CI.
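A rough sketch of what the GitHub side could look like, assuming the updated checker exits non-zero when a test failed. The checker's path and name are hypothetical, not an existing script in the repository.

```bash
# Hypothetical GitHub CI step -- the checker path below is a placeholder.
# Run the regression as before; the regress script itself keeps exiting 0.
bash verif/regress/dv-riscv-arch-test.sh

# Without the Dashboard environment variables set, the checker would skip the
# YAML report and instead exit non-zero if any simulation result is a failure,
# which fails the GitHub job.
python3 verif/sim/check_simulation_results.py
```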

Note that the simulation result format will change in the coming weeks, so it might be better to wait until @valentinThomazic is back to discuss this.

cathales avatar Apr 11 '24 07:04 cathales

👋 Hi there!

This pull request seems inactive. Need more help or have updates? Feel free to let us know. If there are no updates within the next few days, we'll go ahead and close this PR. 😊

github-actions[bot] avatar May 12 '24 01:05 github-actions[bot]