
Add CI for riscv-openocd

JanMatCodasip opened this pull request • 30 comments

This adds CI automation that runs riscv-tests/debug against a build of riscv-openocd.

The steps performed are (a rough command sketch follows the list):

  • Build OpenOCD
  • Build Spike
  • Download a pre-built RISC-V toolchain
  • Run riscv-tests/debug
  • Collect test results (a summary plus all logs for download and inspection)
  • Collect OpenOCD code coverage
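
As a rough sketch, the steps map to shell commands along these lines (the paths, the toolchain URL and the coverage handling here are assumptions, not the exact commands used by the workflow):

# Build OpenOCD with coverage instrumentation (assuming GCC's generic --coverage flag)
cd riscv-openocd && ./bootstrap && ./configure CFLAGS="--coverage" && make -j$(nproc) && cd ..

# Build Spike (riscv-isa-sim)
cd riscv-isa-sim && mkdir -p build && cd build && ../configure && make -j$(nproc) && cd ../..

# Download and unpack a pre-built RISC-V toolchain (placeholder URL)
wget -q https://example.com/riscv-toolchain.tar.gz && tar xzf riscv-toolchain.tar.gz

# Run riscv-tests/debug against a Spike target, with the freshly built openocd and
# spike on PATH (how the harness locates them is an assumption in this sketch)
cd riscv-tests/debug && ./gdbserver.py --gcc /path/to/toolchain/bin/riscv64-unknown-elf-gcc ./targets/RISC-V/spike64.py && cd ../..

# Collect OpenOCD code coverage (assuming lcov is available)
lcov --capture --directory riscv-openocd/src --output-file coverage.info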

JanMatCodasip avatar Jan 03 '21 17:01 JanMatCodasip

Based on an earlier discussion, I have put together this CI automation that builds OpenOCD, obtains other needed dependencies (Spike and the toolchain), then finally executes everything together via riscv-tests/debug and collects the results.

Feedback is welcome as always.

An example of how the CI works is here: https://github.com/JanMatCodasip/riscv-openocd/runs/1627776945

@timsifive - I would especially like to know what toolchain version is supposed to be used with current riscv-tests. I have been experimenting with pre-built RISC-V GCC by Embecosm, both "weekly" and "stable" builds, and encountered different errors on different toolchain versions. There were test failures and exceptions on all toolchain versions I tried.

  • On toolchains from Nov-29-2020 or older, a handful of multi-threaded tests throw an exception due to this GDB assert.
  • On newer toolchains, a lot of tests fail in the compilation/linking phase due to the selected extensions (-march=xyz). It seems that the "v" extension, if selected, is the one that plays havoc with the toolchain.
    • Results:
      • (4): Error: cannot find default versions of the ISA extension 'v'
      • (5): Mis-matched ISA version for 'm' extension. 2.0 vs 2.0, -march=: ISA string must begin with rv32 or rv64
      • (6): Error: cannot find default versions of the ISA extension 'v'

Thank you in advance for help with getting this automation to a passing state.

JanMatCodasip avatar Jan 03 '21 18:01 JanMatCodasip

On toolchains from Nov-29-2020 or older, a handful of multi-threaded tests throw an exception due to this GDB assert.

I assume that problem is fixed then, for our purposes at least.

On newer toolchains, a lot of tests fail in the compilation/linking phase due to the selected extensions (-march=xyz). It seems that the "v" extension, if selected, is the one that plays havoc with the toolchain.

Since 'v' hasn't been ratified yet, I think it makes sense to change the tests to detect whether 'v' is supported by the compiler and, if not, then not to use it. I'll make that change.
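
One way to auto-detect this, as a rough sketch (the compiler name and the -march strings here are placeholders; this is not necessarily how the riscv-tests change is implemented):

# Probe whether the compiler accepts the 'v' extension in -march; fall back if it doesn't.
if echo 'int main(void){return 0;}' | \
   riscv64-unknown-elf-gcc -march=rv64gcv -mabi=lp64d -x c -c -o /dev/null - >/dev/null 2>&1; then
    MARCH=rv64gcv   # toolchain understands the vector extension
else
    MARCH=rv64gc    # build the tests without 'v'
fi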

FWIW, for gdb I'm using 85f783647061e58968ecdc516137d8df9f2d5e16 from git://sourceware.org/git/binutils-gdb.git.

For the compiler I think I'm using 256a4108922f76403a63d6567501c479971d5575 from https://github.com/riscv/riscv-gnu-toolchain.git.

timsifive avatar Jan 05 '21 21:01 timsifive

Thank you for taking a moment to review this draft.

On toolchains from Nov-29-2020 or older, a handful of multi-threaded tests throw an exception due to this GDB assert.

I assume that problem is fixed then, for our purposes at least.

It seems that with newer toolchains, the gdb assert (bug report thread) is still there. The tests just fail much earlier, in the compilation phase, presumably due to the 'v' extension. I will take another look at this.

Since 'v' hasn't been ratified yet, I think it makes sense to change the tests to detect whether 'v' is supported by the compiler and, if not, then not to use it. I'll make that change.

Thanks, it would be good to have control over whether the vector tests run or not: either auto-detect it from the compiler's capabilities, or add a command-line switch for the vector tests, whichever is more convenient.

FWIW, for gdb I'm using 85f783647061e58968ecdc516137d8df9f2d5e16 from git://sourceware.org/git/binutils-gdb.git.

For the compiler I think I'm using 256a4108922f76403a63d6567501c479971d5575 from https://github.com/riscv/riscv-gnu-toolchain.git.

I'll look at these versions and compare them with those that I tried.

JanMatCodasip avatar Jan 06 '21 06:01 JanMatCodasip

You can speed up git clone by adding --depth=1. We don't need history in these repos.

Good point, I'll add that.
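
For reference, it amounts to adding the flag to each clone step, e.g. (repository URL shown only as an example):

git clone --depth=1 https://github.com/riscv/riscv-tests.git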

JanMatCodasip avatar Jan 06 '21 06:01 JanMatCodasip

Did something just change in the Embecosm binaries? I just downloaded them from https://buildbot.embecosm.com/job/riscv32-gcc-ubuntu1804/25/artifact/riscv32-embecosm-gcc-ubuntu1804-20201108.tar.gz, and I get no errors about the vector extension:

tnewsome@compy-linux:~/SiFive/riscv-tools/riscv-tests/debug$ ./gdbserver.py --gcc /home/tnewsome/SiFive/riscv32-embecosm-gcc-ubuntu1804-20201108/bin/riscv32-unknown-elf-gcc ./targets/RISC-V/spike32.py CheckMisa
Using $misa from hart definition: 0x4034112d
[CheckMisa] Starting > logs/20210106-103756-spike32-CheckMisa.log
[CheckMisa] pass in 1.45s
::::::::::::::::::::::::::::[ ran 1 tests in 1s ]:::::::::::::::::::::::::::::
1 tests returned pass

timsifive avatar Jan 06 '21 18:01 timsifive

Did something just change in the Embecosm binaries? I just downloaded them from https://buildbot.embecosm.com/job/riscv32-gcc-ubuntu1804/25/artifact/riscv32-embecosm-gcc-ubuntu1804-20201108.tar.gz, and I get no errors about the vector extension

On builds 20201129 and older, I have not encountered the issues with the vector extension. In those versions, I only saw that gdb assert, which now appears to be close to being fixed. I'll have a moment to re-test the patch mentioned in that thread in a day or two.

The compiler errors due to the V extension only started to appear on newer builds, for example 20201220, 20210103 or 10.2.0-r4 (Nov 8). The links lead to the CI runs from which the logs with the compiler errors can be downloaded (under the "artifacts" link).

JanMatCodasip avatar Jan 07 '21 06:01 JanMatCodasip

Test change that handles compilers that don't support V just merged.

timsifive avatar Jan 08 '21 21:01 timsifive

Test change that handles compilers that don't support V just merged.

Thank you. I have just re-run the CI with toolchain build 20210103. The results have improved considerably - no failed tests anymore and 24 exceptions. These exceptions all appear to have the same cause - the known gdb assert mentioned earlier: internal-error: int finish_step_over(execution_control_state*): Assertion 'ecs->event_thread->control.trap_expected' failed.

I will be able to return to this later this week (Thu/Fri) and move it forward. I'll re-test the existing GDB patch for the assert and will also add the suggestions from this thread (e.g. clone depth=1).

JanMatCodasip avatar Jan 11 '21 05:01 JanMatCodasip

I have re-run the CI with the latest GDB code (HEAD), which is supposed to fix the GDB control.trap_expected assert. However, it seems to trigger a different assert now, so the issue appears not to be fully resolved yet.

I will take a closer look at this and will update the original gdb bug thread. I'll post a further update here when I have more information.

JanMatCodasip avatar Jan 15 '21 06:01 JanMatCodasip

After applying the latest patch for the asserts to GDB, I am no longer getting any assertions.

Results (executed locally, not via Github Actions):

INFO:root: +================+
INFO:root: |  Failed tests  |
INFO:root: +================+
INFO:root:
INFO:root:20210118-105208-spike32_2-StepTest
INFO:root:20210118-105948-spike64_2-InstantChangePc
INFO:root:20210118-110102-spike64_2-StepTest
INFO:root:
INFO:root: +==============================+
INFO:root: |  Tests ended with exception  |
INFO:root: +==============================+
INFO:root:
INFO:root:(none)
INFO:root:
INFO:root: +===========+
INFO:root: |  Summary  |
INFO:root: +===========+
INFO:root:
INFO:root:Target                    # tests    Pass       Not_appl.  Fail       Exception 
INFO:root:-----                     -----      -----      -----      -----      -----     
INFO:root:spike32                   60         49         11         0          0         
INFO:root:spike32_2                 120        104        15         1          0         
INFO:root:spike64                   60         52         8          0          0         
INFO:root:spike64_2                 120        103        15         2          0         
INFO:root:-----                     -----      -----      -----      -----      -----     
INFO:root:All targets:              360        308        49         3          0         
INFO:root:-----                     -----      -----      -----      -----      -----     

failed_cases_20210118.zip

So the GDB assertion issue is hopefully close to a resolution.

JanMatCodasip avatar Jan 18 '21 10:01 JanMatCodasip

That's good progress. Once there's some set of gdb source I can just check out and build, I can take a look at these remaining failures. It looks like the patch you linked hasn't been merged anywhere yet.

timsifive avatar Jan 19 '21 19:01 timsifive

The patch for the GDB assertion issue has been merged, and the patched GDB can now be built manually from the binutils-gdb/master.

The binary version of the toolchain with this patch will be available next week (via buildbot.embecosm.com) and once it is there, I will update this pull request.

When running riscv-tests/debug locally with the updated GDB, these are the results I got:

  • one failed test (20210226-091153-spike32_2-InstantChangePc)
  • one GDB assert (20210226-091221-spike32_2-ProgramSwWatchpoint)
  • everything else passes
INFO:root: +================+
INFO:root: |  Failed tests  |
INFO:root: +================+
INFO:root:
INFO:root:20210226-091153-spike32_2-InstantChangePc
INFO:root:
INFO:root: +==============================+
INFO:root: |  Tests ended with exception  |
INFO:root: +==============================+
INFO:root:
INFO:root:20210226-091221-spike32_2-ProgramSwWatchpoint
INFO:root:
INFO:root: +===========+
INFO:root: |  Summary  |
INFO:root: +===========+
INFO:root:
INFO:root:Target                    # tests    Pass       Not_appl.  Fail       Exception 
INFO:root:-----                     -----      -----      -----      -----      -----     
INFO:root:spike32                   60         49         11         0          0         
INFO:root:spike32_2                 120        103        15         1          1         
INFO:root:spike64                   60         52         8          0          0         
INFO:root:spike64_2                 120        105        15         0          0         
INFO:root:-----                     -----      -----      -----      -----      -----     
INFO:root:All targets:              360        309        49         1          1         
INFO:root:-----                     -----      -----      -----      -----      -----   

Logs from the tests: openocd-ci-logs-2021-02-26.tar.gz

I haven't yet analyzed these two failing cases, but I will take a closer look at the last remaining GDB assertion.

JanMatCodasip avatar Feb 26 '21 09:02 JanMatCodasip

The failure looks to be related to how gdb/OpenOCD are interpreting the result of a stepi. I assume OpenOCD performs it correctly, but whatever it reports does not make gdb realize that all harts have halted at the end of the step. (Log shows gdb thinks SIGTRAP happened.)

timsifive avatar Feb 27 '21 01:02 timsifive

It is becoming clear that the top-of-tree builds of GDB from buildbot.embecosm.com are at this time not a good stable point to test OpenOCD against. The assert issue for software watchpoints is still present and there are at least 7 more tests that fail intermittently on the build from 20210309 (spike32_2-DebugTurbostep, spike32_2-InstantChangePc, spike32_2-StepTest, spike64_2-DebugTurbostep, spike64_2-InstantChangePc, spike64_2-StepTest, spike64_2-Sv48Test). Even if these are resolved, there is no guarantee of stability on those top-of-tree builds.

For that reason, I am switching the toolchain to xPack GCC build v10.1.0-1.1 (https://github.com/xpack-dev-tools/riscv-none-embed-gcc-xpack/releases), which corresponds to Freedom Tools release v2020.08.0. On this toolchain, I am getting intermittent failures only on VectorTest, in approximately 50% of the test runs. It seems related to which thread gets selected in the tests.

In the future, I would like to switch to a stable, release-grade toolchain from upstream, once such a build exists.

That said, it still makes sense to me to test OpenOCD against the top-of-tree (unstable) toolchain build once in a while to learn about possible issues early, but not as part of OpenOCD CI.
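
For reference, fetching that xPack release in a CI step looks roughly like this (the asset and directory names are assumptions - check the release page for the exact names):

# Download and unpack xPack GCC v10.1.0-1.1 (archive name assumed; verify on the releases page)
wget -q https://github.com/xpack-dev-tools/riscv-none-embed-gcc-xpack/releases/download/v10.1.0-1.1/xpack-riscv-none-embed-gcc-10.1.0-1.1-linux-x64.tar.gz
tar xzf xpack-riscv-none-embed-gcc-10.1.0-1.1-linux-x64.tar.gz
# The tools use the riscv-none-embed- prefix, e.g. riscv-none-embed-gcc
# (the extracted directory name is also an assumption)
export PATH=$PWD/xpack-riscv-none-embed-gcc-10.1.0-1.1/bin:$PATH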

JanMatCodasip avatar Mar 15 '21 20:03 JanMatCodasip

In the future, I would like to switch to a stable, release-grade toolchain from upstream, once such a build exists.

Would you consider using a crosstool-NG build of the tools from upstream repositories?



TommyMurphyTM1234 avatar Mar 15 '21 22:03 TommyMurphyTM1234

Would you consider using a crosstool-NG build of the tools from upstream repositories?

That is a possibility; however, that would make the per-commit checks run too long. For that reason, I prefer a binary build of the toolchain (xPack, buildbot.embecosm.com or similar), as long as it is stable and works reliably with bare-metal RISC-V targets via gdbserver (OpenOCD in this case).

JanMatCodasip avatar Mar 16 '21 07:03 JanMatCodasip

I mean build the tools once (or whenever there are "significant" changes to the upstream repos - gcc, gdb/binutils, newlib etc.) and then use those toolchain binaries. Not doing a full CT-NG build of the tools every time.


TommyMurphyTM1234 avatar Mar 16 '21 09:03 TommyMurphyTM1234

I mean build the tools once (or whenever there are "significant" changes to the upstream repos - gcc, gdb/binutils, newlib etc.) and then use those toolchain binaries. Not doing a full CT-NG build of the tools every time.

That is possible, as long as

  • the build is hosted/cached somewhere
  • the build is reproducible (so that contributors can build it locally to test against, if needed)
  • someone volunteers the time to do that

I still expect (and hope) the needs of the OpenOCD testing can be satisfied by a reputable third party toolchain build.

JanMatCodasip avatar Mar 16 '21 09:03 JanMatCodasip

Ok - no worries, just a suggestion. As you say, any reputable third-party toolchain that is regularly maintained and updated should do, and I know that Liviu's xPack Project version fits the bill.



TommyMurphyTM1234 avatar Mar 16 '21 10:03 TommyMurphyTM1234

@timsifive - I thought about picking up this dormant pull request but it seems that many of the tests now fail due to timeouts. An example can be seen here: https://github.com/JanMatCodasip/riscv-openocd/runs/2727722407#step:8:6917

The timeouts occur both on my local machine and in the GitHub CI and seem to be completely non-deterministic.

Common symptoms are:

  • OpenOCD-to-GDB timeout warnings: keep_alive() was not invoked in the 1000 ms timelimit...
  • Pexpect timeout in the test suite itself: pexpect.exceptions.TIMEOUT: Timeout exceeded.

Tim, is this something you are able to reproduce locally, too? What changed in OpenOCD or Spike that made the timeouts start to appear? Thanks.

JanMatCodasip avatar Jun 02 '21 14:06 JanMatCodasip

Yes, spike-multi gets a bunch of timeouts. I am looking at that exact problem right now.

timsifive avatar Jun 02 '21 21:06 timsifive

Between https://github.com/riscv/riscv-openocd/pull/616 and https://github.com/riscv/riscv-tests/pull/339 the timeout problem should be resolved.

timsifive avatar Jun 03 '21 22:06 timsifive

Between #616 and riscv/riscv-tests#339 the timeout problem should be resolved.

Thanks. Both these changes appear fine. The tests now pass within the GitHub CI: https://github.com/JanMatCodasip/riscv-openocd/runs/2771597698

I still get some failures locally (different from the timeouts) but cannot draw any conclusions yet. The failures may be due to local setup problems, which I still need to look into.

Another concern is intermittent failures. As an experiment, I'm now running the whole test set repeatedly to try to pinpoint which tests fail intermittently.
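
The repetition itself is a simple loop along these lines (a local sketch, not the actual CI script; it relies on the logs directory that the test harness writes):

# Run the whole riscv-tests/debug suite 10 times and keep each cycle's logs,
# so intermittent failures can be compared across cycles.
for i in $(seq 1 10); do
    make -C riscv-tests/debug all || echo "cycle $i had failures"
    cp -r riscv-tests/debug/logs logs-cycle-$i
done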

JanMatCodasip avatar Jun 08 '21 10:06 JanMatCodasip

On my machine only Sv48Test fails intermittently.

timsifive avatar Jun 08 '21 18:06 timsifive

I'm now running the whole test set repeatedly to try to pinpoint which tests fail intermittently.

Yesterday's CI run timed out before reaching the 10 cycles; however, there are already some intermittent failures:

  • Failures: spike64_2-Sv48Test, spike64_2-Sv39Test
  • Exceptions: spike64_2-DownloadTest (2x)

Today's CI run has an extended timeout, so all 10 cycles should complete.

JanMatCodasip avatar Jun 09 '21 10:06 JanMatCodasip

Today's CI run has an extended timeout, so all 10 cycles should complete.

So it seems I cannot specify a timeout longer than 6 hours for one CI step. Despite that, 9 full test cycles were executed, with two tests failing intermittently:

  • spike64_2-Sv48Test - failed three times
  • spike64_2-Sv39Test - failed once

I've just started one last round of multiple CI runs. The aim is to stress-test it more and have a higher likelihood of catching other intermittent failures, if there are any.

JanMatCodasip avatar Jun 10 '21 06:06 JanMatCodasip

The CI runs have finished now. There were 30 full test cycles executed in total (riscv-tests/debug, make all), split across 6 CI runs: 1, 2, 3, 4, 5, 6.

Intermittent exceptions:

  • spike64_2-DownloadTest (6 out of 30 runs)

Intermittent fails:

  • spike64_2-Sv39Test (6 out of 30 runs)
  • spike64_2-Sv48Test (8 out of 30 runs)

The rest passed or were "not applicable".

@timsifive - Tim, if you have a moment to look at the failed cases, that would bring this testing automation closer to being ready. Thank you.

JanMatCodasip avatar Jun 10 '21 09:06 JanMatCodasip

Looking at Sv48 logs, when it fails it's because we're running the test on thread 2 (hart 1), and OpenOCD reports to gdb that thread 1 (hart 0) spontaneously halts. Presumably this happens because the harts are in the same halt group, and there's a race between the harts halting and poll() discovering that for each hart.

Ideally we'd fix this in OpenOCD by having it not change the "current hart" (which is a meaningless concept in hardware but we have to maintain because gdb thinks it's debugging a single-threaded OS with multiple threads).

timsifive avatar Jun 11 '21 22:06 timsifive

@timsifive - Hi, I thought about reviving this merge request for the CI. After rebasing and re-running the tests on the current code, I got multiple failures on the Sv## tests - more failures than in June (which is the last time I tried this).

Results: https://github.com/JanMatCodasip/riscv-openocd/runs/4114461836?check_suite_focus=true#step:9:409

How stable are the tests these days? Are these known issues?

Thanks.

JanMatCodasip avatar Nov 05 '21 08:11 JanMatCodasip

Another report with accumulated results from 5x test runs: https://github.com/JanMatCodasip/riscv-openocd/runs/4114445337?check_suite_focus=true#step:9:1885

The pattern of failures/exceptions is the same as above.

JanMatCodasip avatar Nov 05 '21 09:11 JanMatCodasip