
Add in Github's CI a 64-bit configuration of the CVA6 using the HPDcache and restore WB cache test

cfuguet opened this pull request 1 year ago • 22 comments

This PR adds to the GitHub CI a test with the cv64a6_imafdc_sv39_hpdcache configuration.

This configuration uses the 64-bit variant of the CVA6 with "deep" structures in the data cache:

  • Up to 8 inflight loads to the cache (and thus up to 8 miss requests to the memory)
  • Up to 8 inflight write requests to the memory

It also restores the testing of the WB cache in the CI.

FIXME: In addition, this PR comments out some coverpoints in the AXI agent because of illegal bins that are not compatible with the HPDcache. These coverpoints expect the cache to use a very small subset of IDs in AXI transactions; however, the HPDcache can use a larger subset because it can issue multiple inflight requests to the memory (both reads and writes).

cfuguet avatar May 15 '24 14:05 cfuguet

Yes, having the HPDcache in a 64-bit config is a must.

Each time we add a new config to be tested in smoke-tests.sh, it adds roughly 5 minutes of execution time. We do not want that. Can you reuse the cv64a6_imafdc_sv39 config, I mean replace the WT cache with the HPDcache in this config?

JeanRochCoulon avatar May 15 '24 14:05 JeanRochCoulon

> Yes, having the HPDcache in a 64-bit config is a must.
>
> Each time we add a new config to be tested in smoke-tests.sh, it adds roughly 5 minutes of execution time. We do not want that. Can you reuse the cv64a6_imafdc_sv39 config, I mean replace the WT cache with the HPDcache in this config?

Yes, of course I can. If no one will be upset because they expect something else, I can do it :)

cfuguet avatar May 15 '24 14:05 cfuguet

Could you post a message in MM to ask whether any organization would be impacted? But I think nobody uses it for critical purposes.

JeanRochCoulon avatar May 15 '24 14:05 JeanRochCoulon

:heavy_check_mark: successful run, report available here.

github-actions[bot] avatar May 15 '24 15:05 github-actions[bot]

I posted the question in MM, and I also asked it directly during the CVA6 meeting. @Jbalkind says that he knows some groups using the cv64a6_imafdc_sv39 config for their CVA6-based FPGA platforms.

Hence, the more politically correct solution would be to add an additional configuration (which is what this PR does). Cons: it adds latency to the smoke tests, as you said.

Another, less politically correct, solution would be to add the new configuration and modify the smoke tests to only test the new one. Pros: the original configuration is kept as it is. Cons: the original config is no longer tested in the Thales CI.

Finally, another politically correct but more technical solution would be to split the smoke tests into two separate scripts: one for the 32-bit configurations and another for the 64-bit configurations. In that case, the two scripts could run in parallel in the Thales CI, reducing the overall latency. A sketch of this split is shown below.
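
A minimal sketch of that split, assuming hypothetical script names smoke-tests-32.sh and smoke-tests-64.sh (not the actual files in the repository) and a CI runner that allows shell background jobs:

```bash
#!/bin/bash
# Hypothetical driver: run the 32-bit and 64-bit smoke-test scripts in
# parallel and fail if either of them fails.
set -u

./smoke-tests-32.sh & pid32=$!
./smoke-tests-64.sh & pid64=$!

status=0
wait "$pid32" || status=1
wait "$pid64" || status=1
exit $status
```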

@JeanRochCoulon, @Jbalkind, comments ?

cfuguet avatar May 15 '24 15:05 cfuguet

I would argue for parallelising the run. I don't see any downside to doing so. Compute is basically free.

Jbalkind avatar May 15 '24 16:05 Jbalkind

Of course, another solution would be to modify the existing configuration and add a disclaimer to inform people that new releases of the CVA6 use the HPDcache for the default 64-bit configuration (cv64a6_imafdc_sv39).

Honestly, I prefer this solution, but I can understand that some folks would then need to modify the configuration locally if they want to keep using the WT cache.

cfuguet avatar May 15 '24 16:05 cfuguet

Unless we have agreement that wtdcache is being deprecated, I'm not sure that I like this option a whole lot. You have five years of users who have been using wtdcache by default that you're potentially making a change for.

Jbalkind avatar May 15 '24 16:05 Jbalkind

Yes, I can understand your point, but at the same time, the HPDcache actually adds functionality and performance transparently, with roughly the same area.

cfuguet avatar May 15 '24 16:05 cfuguet

But well, it is a slightly biased view... :)

cfuguet avatar May 15 '24 16:05 cfuguet

In the Thales CI, we will not test two 64-bit configurations.

To avoid impacting the users, the best approach (as proposed previously) could be to create a new 64-bit config with the HPDcache (64bits_hpdcache) and replace the first 64-bit config in the smoke tests with the 64bits_hpdcache config.

JeanRochCoulon avatar May 15 '24 17:05 JeanRochCoulon

> In the Thales CI, we will not test two 64-bit configurations.
>
> To avoid impacting the users, the best approach (as proposed previously) could be to create a new 64-bit config with the HPDcache (64bits_hpdcache) and replace the first 64-bit config in the smoke tests with the 64bits_hpdcache config.

Hypothetically, if we choose this solution:

  • Shall we change the default configuration of the top Makefile from cv64a6_imafdc_sv39 to cv64a6_imafdc_sv39_hpdcache? (see the sketch after this list)
  • Shall we change the configuration in the GitHub CI from cv64a6_imafdc_sv39 to cv64a6_imafdc_sv39_hpdcache?
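
Regarding the first question, a minimal sketch of how the two defaults would be exercised from the command line, assuming the top Makefile exposes the configuration through a `target` variable and a `verilate` rule (assumptions about the build interface, for illustration only):

```bash
# Hypothetical invocations; the variable and rule names are assumptions.
# With the proposed new default, this target would be implicit:
make verilate target=cv64a6_imafdc_sv39_hpdcache
# Users depending on the WT-cache config would pass it explicitly:
make verilate target=cv64a6_imafdc_sv39
```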

cfuguet avatar May 16 '24 07:05 cfuguet

I have no skin in the game, but I would argue to at least keep the WT cache in the Thales CI until the GitHub CI is fixed (see #2050). I understand that you do not want to test a config that is not useful to you in your CI. However, we have had the WB cache in the GitHub CI for years (somehow it is no longer there now, but IMHO the WB cache config should be tested in some CI). Don't get me wrong, the Thales CI is amazing, but we rely exclusively on it for PRs, and if you want to remove configurations from it, I would strongly suggest sanity-checking them at least in the GitHub CI.

In my personal opinion, removing tests/configurations from the CI is nonsensical. The only valid reason to remove a config from the CI is to deprecate it, which is currently not the case.

Moschn avatar May 16 '24 09:05 Moschn

Yes, I agree with your point.

I wonder whether one of the current (technical) issues is that we run the same scripts (smoke tests) in both the GitHub CI and the Thales CI. In the Thales CI there are additional tests, and even the smoke tests are run multiple times because they run in different test environments (Verilator testbench and UVM testbench using VCS).

Maybe we should have two different test scripts: one for the GitHub CI that runs some tests with the 3 different supported caches (because, as @Moschn suggests, there is no official deprecation of any of them), and another for the Thales CI with the configurations relevant to Thales.

cfuguet avatar May 16 '24 10:05 cfuguet

Do you want me to propose a script running some tests on the 3 caches that we can use in the GitHub CI?

cfuguet avatar May 16 '24 13:05 cfuguet

For information, we plan to test (only) the 2 Thales configs in the Thales CI: one 32-bit and one 64-bit, both with the HPDcache. FPGA and ASIC synthesis will be done. As a consequence, smoke-tests.sh will only test these two configurations. That is why testing the three caches in the GitHub CI is useful. Feel free.

JeanRochCoulon avatar May 16 '24 13:05 JeanRochCoulon

> For information, we plan to test (only) the 2 Thales configs in the Thales CI: one 32-bit and one 64-bit, both with the HPDcache. FPGA and ASIC synthesis will be done. As a consequence, smoke-tests.sh will only test these two configurations. That is why testing the three caches in the GitHub CI is useful. Feel free.

Ok, great! Thank you @JeanRochCoulon for the information (the HPDcache is very happy, by the way).

In parallel, I will propose a script for the GitHub CI testing all caches with a limited number of tests to keep the latency low.

cfuguet avatar May 16 '24 13:05 cfuguet

I pushed a first proposal for the GitHub CI. It runs the following configurations:

  • cv64a6_imafdc_sv39_hpdcache
  • cv64a6_imafdc_sv39_wb
  • cv64a6_imafdc_sv39
  • cv32a65x

However, the number of tests is limited. The selected tests cover arithmetic, memory, and branch/jump instructions, plus a complete but very simple application (hello_world). In addition, the riscv-arch-test suite is executed with the cv64a6_imafdc_sv39_hpdcache and cv32a65x configurations. For the 64-bit configurations, I selected the unit tests that enable the MMU.

The tests for a given configuration are run sequentially, but all configurations run in parallel to reduce the overall latency (a sketch of this structure follows the test lists below).

For 64-bit configurations:

  • riscv-arch-test (only in the cv64a6_imafdc_sv39_hpdcache configuration)
  • rv64ui-v-add
  • rv64ui-v-ld
  • rv64ui-v-sd
  • rv64ui-v-beq
  • rv64ui-v-jal
  • hello_world.c

For 32-bit configurations:

  • riscv-arch-test
  • rv32ui-p-add
  • rv32ui-p-lw
  • rv32ui-p-sw
  • rv32ui-p-beq
  • rv32ui-p-jal
  • hello_world.c
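
A minimal sketch of that run structure, expressed as a hypothetical shell driver (in the actual CI the per-configuration parallelism comes from separate jobs; the run_tests.sh wrapper and its flags are assumptions for illustration):

```bash
#!/bin/bash
# Each configuration runs its tests sequentially inside a function,
# and the configurations themselves are launched in parallel.
run_config() {
  local config=$1; shift
  local test
  for test in "$@"; do
    # Run one test for this configuration (wrapper name is an assumption).
    ./run_tests.sh --target "$config" --test "$test" || return 1
  done
}

run_config cv64a6_imafdc_sv39_hpdcache rv64ui-v-add rv64ui-v-ld rv64ui-v-sd &
run_config cv32a65x rv32ui-p-add rv32ui-p-lw rv32ui-p-sw &

# Wait for every configuration and report failure if any of them failed.
status=0
for pid in $(jobs -p); do
  wait "$pid" || status=1
done
exit $status
```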

What do you think @JeanRochCoulon, @Moschn ?

cfuguet avatar May 16 '24 16:05 cfuguet

:exclamation: There is a problem regarding the Spike version. All tests are passing, but the tests are not actually being run because of a Spike version check in the cva6.py script. I see the same issue in other currently active pull requests (so it is not related to the modifications in this PR).

@JeanRochCoulon do you have an idea about this issue?

I think it is because the "cached" installation of Spike is old with respect to the one in the core-v-verif repository. I do not know how to flush the cache of the GitHub VMs.

cfuguet avatar May 16 '24 17:05 cfuguet

Ok, I found out how to flush the Spike cache (a sketch of one way to do it is below). I reran the jobs; let's see.
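
For reference, a minimal sketch of one way to drop a stale cache entry from the GitHub Actions cache, assuming the GitHub CLI (gh 2.32 or later) is available; the cache key below is hypothetical:

```bash
# List the caches stored for the repository to find the stale entry...
gh cache list --repo openhwgroup/cva6
# ...then delete it by key (this key name is hypothetical).
gh cache delete spike-install-cache --repo openhwgroup/cva6
```

Bumping the cache key in the workflow file achieves the same effect on the next run.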

cfuguet avatar May 16 '24 18:05 cfuguet

Well, this is working now. It took a total of 32 minutes to run, which is the latency of the riscv-arch-test suite. The latency of all the other tests is completely hidden by the latter (because they all run in parallel).

This means that we can still add some unit tests for the different configurations in order to balance the latencies a little.

Let me know what you think.

:exclamation: By the way, I noticed something that we need to be careful about. When we make multiple calls to the cva6.py script in a given script, we need to check the exit status each time and, at the end of the script, exit with an error if any of the calls failed. Otherwise, only the status of the last command in the script is returned. This means that, if you rely on the return status of the script to determine in the CI whether the tests failed, you may overlook some errors. This is what was happening with the Spike version error: the CI was passing but the tests were not actually run. Hence, I added the return-status check to the test scripts (see the sketch below). Of course, another solution is to add an additional step and parse all logs; I think that is the strategy you use in the Thales CI.
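
A minimal sketch of that pattern, assuming illustrative cva6.py arguments (the flags shown are not necessarily the script's real interface):

```bash
#!/bin/bash
# Accumulate the exit status of every cva6.py invocation so that an early
# failure is not masked by later commands succeeding.
status=0

python3 cva6.py --testlist rv64ui-v-add || status=1  # flags are illustrative
python3 cva6.py --testlist rv64ui-v-ld  || status=1
python3 cva6.py --testlist hello_world  || status=1

# Without this final exit, the script would return only the status of the
# last command, hiding earlier failures from the CI.
exit $status
```

An alternative is `set -e` at the top of the script, which aborts on the first failing command instead of running the remaining tests.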

cfuguet avatar May 17 '24 06:05 cfuguet

:heavy_check_mark: successful run, report available here.

github-actions[bot] avatar May 17 '24 12:05 github-actions[bot]

Hello @cfuguet, @AEzzejjari conditioned the assertions and coverage in the AXI agent, so you no longer need to comment them out in your PR. Please rebase it to give me the opportunity to merge it.

JeanRochCoulon avatar May 17 '24 21:05 JeanRochCoulon

> Hello @cfuguet, @AEzzejjari conditioned the assertions and coverage in the AXI agent, so you no longer need to comment them out in your PR. Please rebase it to give me the opportunity to merge it.

Ok! It is done now. Thanks!

cfuguet avatar May 18 '24 14:05 cfuguet

:heavy_check_mark: successful run, report available here.

github-actions[bot] avatar May 18 '24 14:05 github-actions[bot]

I approve this PR. But I would be pleased to get feedback from @MarioOpenHWGroup, who recently set up the GitHub Actions CI, before merging. Mario, are you ok with it?

JeanRochCoulon avatar May 18 '24 19:05 JeanRochCoulon

Sorry, I was offline the past few days. Thanks @cfuguet for the effort!

Moschn avatar May 22 '24 06:05 Moschn