OpenPiton configuration of CVA6 cannot boot Linux on UltraScale+ FPGA
The cv64a6_imacfdc_sv39_openpiton_config_pkg.sv configuration of the CVA6 (also known as the OpenPiton configuration) cannot boot Linux. Specifically, the core hangs at the user mode boundary while booting Linux. A hang at the user mode boundary points to MMU behaviour, but this has not been validated.
The configuration used for booting Linux in the CI flow is cv32a6_imac_sv32_config_pkg.sv, also known as the 10xEngineers configuration.
The differences between these configurations are listed in the tables below:
Lines 1..47
The following are all scalar localparams:
| Lines | cv32a6_imac_sv32_config_pkg.sv | cv64a6_imacfdc_sv39_openpiton_config_pkg.sv |
|---|---|---|
| 13c13 | localparam CVA6ConfigXlen = 32; | localparam CVA6ConfigXlen = 64; |
| 15c15 | localparam CVA6ConfigRVF = 0; | localparam CVA6ConfigRVF = 1; |
| 26 | localparam CVA6ConfigBExtEn = 0; | |
| 27 | localparam CVA6ConfigHExtEn = 0; | localparam CVA6ConfigHExtEn = 0; |
| 28 | localparam CVA6ConfigVExtEn = 0; | localparam CVA6ConfigVExtEn = 1; |
| 41c42 | localparam CVA6ConfigDcacheByteSize = 32768; | localparam CVA6ConfigDcacheByteSize = 16384; |
| 42c43 | localparam CVA6ConfigDcacheSetAssoc = 8; | localparam CVA6ConfigDcacheSetAssoc = 4; |
| 45c46 | localparam CVA6ConfigDcacheIdWidth = 3; | localparam CVA6ConfigDcacheIdWidth = 1; |
| 46c47 | localparam CVA6ConfigMemTidWidth = 4; | localparam CVA6ConfigMemTidWidth = 2; |
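
A table like the scalar one above can be regenerated whenever either package changes. The following is only a minimal sketch: the helper names `scalar_localparams` and `diff_localparams` are hypothetical, the inputs are inline excerpts copied from the table rather than the real files, and the regular expression assumes the simple `localparam NAME = VALUE;` form, so typed localparams and `cva6_cfg` struct members are not covered.

```python
import re

# Matches untyped scalar declarations such as "localparam CVA6ConfigXlen = 32;".
# Typed localparams (e.g. config_pkg::cache_type_t) are intentionally skipped.
LOCALPARAM_RE = re.compile(r"^\s*localparam\s+(\w+)\s*=\s*([^;]+);", re.MULTILINE)


def scalar_localparams(text):
    """Return {name: value} for every untyped scalar localparam in text."""
    return {name: value.strip() for name, value in LOCALPARAM_RE.findall(text)}


def diff_localparams(left_text, right_text):
    """Return sorted (name, left, right) rows where the two sides differ.

    A parameter missing on one side is reported as None for that side.
    """
    left = scalar_localparams(left_text)
    right = scalar_localparams(right_text)
    rows = []
    for name in sorted(set(left) | set(right)):
        if left.get(name) != right.get(name):
            rows.append((name, left.get(name), right.get(name)))
    return rows


if __name__ == "__main__":
    # Excerpts copied from the table above, not the full package files.
    cv32 = """
      localparam CVA6ConfigXlen = 32;
      localparam CVA6ConfigRVF = 0;
      localparam CVA6ConfigDcacheByteSize = 32768;
    """
    openpiton = """
      localparam CVA6ConfigXlen = 64;
      localparam CVA6ConfigRVF = 1;
      localparam CVA6ConfigDcacheByteSize = 16384;
    """
    for name, a, b in diff_localparams(cv32, openpiton):
        print(f"| {name} | {a} | {b} |")
```

To run it against the real packages, read each file's contents into a string and pass both strings to `diff_localparams`; comparing by parameter name rather than by line number keeps the output stable when the two files drift apart in layout.
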
Lines 66..67
Localparams of type cache_type_t:
| Lines | cv32a6_imac_sv32_config_pkg.sv | cv64a6_imacfdc_sv39_openpiton_config_pkg.sv |
|---|---|---|
| 66c67 | localparam config_pkg::cache_type_t CVA6ConfigDcacheType = config_pkg::HPDCACHE; | localparam config_pkg::cache_type_t CVA6ConfigDcacheType = config_pkg::WT; |
Lines 72..155
The following values are all members of localparam config_pkg::cva6_user_cfg_t cva6_cfg:
| Lines | cv32a6_imac_sv32_config_pkg.sv | cv64a6_imacfdc_sv39_openpiton_config_pkg.sv |
|---|---|---|
| 74c75 | VLEN: unsigned'(32), | VLEN: unsigned'(64), |
| 92c93 | RVB: bit'(1), | RVB: bit'(CVA6ConfigBExtEn), |
| 93c94 | ZKN: bit'(1), | ZKN: bit'(0), |
| 122c123 | NOCType: config_pkg::NOC_TYPE_AXI4_ATOP, | NOCType: config_pkg::NOC_TYPE_L15_BIG_ENDIAN, |
| 123c124 | NrNonIdempotentRules: unsigned'(1), | NrNonIdempotentRules: unsigned'(2), |
| 124c125 | NonIdempotentAddrBase: 1024'({64'b0}), | NonIdempotentAddrBase: 1024'({64'b0, 64'b0}), |
| 125c126 | NonIdempotentLength: 1024'({64'h8000_0000}), | NonIdempotentLength: 1024'({64'b0, 64'b0}), |
| 146c147 | InstrTlbEntries: int'(2), | InstrTlbEntries: int'(16), |
| 147c148 | DataTlbEntries: int'(2), | DataTlbEntries: int'(16), |
| 148c149 | UseSharedTlb: bit'(1), | UseSharedTlb: bit'(0), |
To be clear, we have only observed this issue on UltraScale+ FPGAs, and we have actually had a user who ran on a vcu118 or similar without the issue, so I think there is a specific timing component to it.
Today's CVA6 meeting: @Jbalkind: This works on Genesys 2 but not on UltraScale+ boards, with the same source code for CVA6 and the OpenPiton mesh. When @Jbalkind mentions timing, he is thinking of performance differences at the memory controller that would result in different behaviour of the CPU. @JeanRochCoulon mentions that the MMU was often in the critical path on the Genesys 2 board and recommends reviewing Vivado's logs. @fatimasaleem: 10xEngineers will deep dive into the issue. Thanks :-) Nicolas T also experienced a similar Linux booting issue when enabling dual-issue. No clue whether it's the same bug.
Was there any update on this?
I tried a more recent version of the core in the last month or so. It had a new issue, which I shared as #3115, but even with a fix for that it still hangs at the user mode boundary. I also got another patch or two from others (e.g. the Capabilities Ltd folks have a patch for an unaligned ifetch issue) but didn't see further forward progress.
I'm hoping that switching to the HPDC config (#3134) will help, but I have tried similar changes previously and not seen them solve it. I think there's still some weird edge case in the core at the user mode boundary. Personally, as a professor, I don't have the time to dig in and debug with an ILA etc., but I would happily help anyone else who can dedicate the time.
We do have people who can help. I'll get back to you.
Just to provide an update: we are currently integrating your fix from #3115. At the same time, we had to move the Linux physical address space. We are working on reaching a stable Linux output, and then I guess our HW team will provide patches (I am not sure about timelines, though).
Oh, that's great! It would be good to also be in touch with you by email so I can discuss the details. Feel free to give me a nudge.