Slow Dhrystone on FPGA

Open manox opened this issue 4 years ago • 8 comments

I have tested systems generated with Chipyard on an FPGA (VCU118). With Rocket and BOOM I get plausible results for both Dhrystone and Coremark. With CVA6, however, the Dhrystone results are relatively poor (~0.7 DMIPS/MHz) and depend on the L2 cache, which is not the case with Rocket/BOOM. Coremark is okay (> 2 Coremark/MHz).
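
For anyone comparing numbers, the two metrics above are usually derived as in the sketch below. The 1757 Dhrystones/s divisor is the conventional VAX 11/780 reference score that defines 1 DMIPS. The clock frequency and raw scores are hypothetical placeholders chosen for illustration, not measurements from this setup.

```python
# Conventional reference: a VAX 11/780 scores 1757 Dhrystones/s, defined as 1 DMIPS.
VAX_11_780_DHRYSTONES_PER_SEC = 1757


def dmips_per_mhz(dhrystones_per_sec, clock_hz):
    """Normalize a raw Dhrystone score to DMIPS per MHz of core clock."""
    dmips = dhrystones_per_sec / VAX_11_780_DHRYSTONES_PER_SEC
    return dmips / (clock_hz / 1e6)


def coremark_per_mhz(iterations_per_sec, clock_hz):
    """Normalize a raw Coremark score (iterations/s) to Coremark per MHz."""
    return iterations_per_sec / (clock_hz / 1e6)


if __name__ == "__main__":
    clock_hz = 50_000_000  # hypothetical FPGA core clock, not taken from this issue
    # Hypothetical raw scores, picked only to illustrate the arithmetic.
    print(f"{dmips_per_mhz(61_495, clock_hz):.2f} DMIPS/MHz")
    print(f"{coremark_per_mhz(100, clock_hz):.2f} Coremark/MHz")
```

Both figures are only comparable across cores when the clock frequency used for normalization is the actual core clock during the run and the same compiler flags are used.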

I'd be grateful for any suggestions, thanks.

manox avatar Jan 14 '21 10:01 manox

Hey @manox,

I tried the same thing and got similar results. The cache dependency hints at a memory interface problem.

RaphaelKlink avatar Mar 15 '21 07:03 RaphaelKlink

Hi @RaphaelKlink, I know, it's me, Mark. ;)

manox avatar Mar 15 '21 10:03 manox

Hi, I tried to run Coremark on an FPGA (Genesys 2), but I hit an illegal instruction error when running it. How did you solve this problem?

Wcm926 avatar Apr 15 '21 06:04 Wcm926

We did not use the Ariane SoC but the Chipyard and OpenPiton frameworks. Both of them are able to run Coremark and Dhrystone bare-metal on the FPGA. As mentioned in this issue, the Chipyard system shows unusually low performance values.

RaphaelKlink avatar Apr 20 '21 08:04 RaphaelKlink

Hi, did you test systems generated with Chipyard on an FPGA (VCU118), and was the Coremark result around 2 Coremark/MHz?

Wcm926 avatar Apr 26 '21 10:04 Wcm926

@manox I'd be willing to corroborate your scores, but parameters retrieved by TestHarness aren't set when running the Config as

class Cva6VCU118Config extends Config(
  new WithVCU118Tweaks ++
  new chipyard.CVA6Config)

I assume you changed ExtMem in the CVA6 Chisel module implementation. What other changes did you make when generating the core?

michael-etzkorn avatar Aug 23 '21 16:08 michael-etzkorn

This is just a guess, but I believe Chipyard uses the write-through L1 cache variant of CVA6 (https://github.com/ucb-bar/cva6-wrapper/blob/139741a584d7e3c0446db592b5d99529bd6cf9fa/src/main/resources/vsrc/Makefile#L132). That would explain why the performance depends on the L2 cache.
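
To illustrate the reasoning, here is a toy model (not CVA6's actual cache) that counts how many store transactions reach the next memory level under each write policy. The cache geometry, the direct-mapped organization, and the store trace are all assumptions made up for the sketch. The trace mimics a loop that keeps rewriting a small buffer, which is the kind of store-heavy pattern that plausibly makes Dhrystone more sensitive to this than Coremark.

```python
def next_level_writes(store_addrs, policy, num_lines=64, line_bytes=16):
    """Count write transactions that reach the next memory level (e.g. an L2).

    Toy direct-mapped L1 model; geometry values are illustrative assumptions.
    """
    if policy not in ("write-through", "write-back"):
        raise ValueError(f"unknown policy: {policy}")

    tags = [None] * num_lines    # block currently held by each cache line
    dirty = [False] * num_lines  # only used by the write-back policy
    writes = 0

    for addr in store_addrs:
        block = addr // line_bytes
        idx = block % num_lines
        if policy == "write-through":
            # Every store is forwarded to the next level, hit or miss.
            writes += 1
            continue
        # Write-back with write-allocate: stores stay in L1 until eviction.
        if tags[idx] != block:
            if dirty[idx]:
                writes += 1  # evicted dirty line is written back
            tags[idx] = block
        dirty[idx] = True

    if policy == "write-back":
        writes += sum(dirty)  # flush remaining dirty lines at the end
    return writes


if __name__ == "__main__":
    # Hypothetical trace: 8000 word stores cycling over a 32-byte buffer.
    trace = [0x1000 + 4 * (i % 8) for i in range(8000)]
    for policy in ("write-through", "write-back"):
        print(policy, next_level_writes(trace, policy))
```

In this model the write-through cache sends every store to the next level, so the L2 (or whatever sits behind the L1) sets the store bandwidth, while the write-back cache only emits the final dirty lines. Whether this is really the cause here would need to be confirmed against the actual CVA6 configuration.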

Moschn avatar Aug 24 '21 09:08 Moschn

I used a relatively old version of Chipyard in which the FPGA flow did not yet exist, and brought the design up on the FPGA in my own Vivado project. I therefore cannot say anything about the config you mention, sorry.

manox avatar Aug 31 '21 11:08 manox

Hi @manox, @Moschn, @michael-etzkorn, @Wcm926 and @RaphaelKlink, thanks for your interest in CVA6. This issue has not been updated in ~1.5 years, so I will assume it is resolved and will close this issue. There is a related issue (#1035) which you can track. If you are still having trouble, please feel free to open another.

MikeOpenHWGroup avatar Feb 17 '23 17:02 MikeOpenHWGroup