Some SV48 bins are missing coverage (possibly a bug in wallyTracer)
I have developed SV48 tests, but some bins related to faults on instruction fetch don't show coverage. For example, sv48_reserved_rwx_pte_S_mode.S tests for faults caused by reserved RWX encodings at levels 0-3, but produces the following coverage:
Cross PTE_res_rwx_s_i_exec
bin <leaflvl_noexec_s,kilo,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,kilo,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,giga,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,mega,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_noexec_s,tera,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,tera,sv48,ins_page_fault,set>
bin <leaflvl_noexec_s,giga,sv48,*,*> 0 1 1 ZERO
bin <leaflvl_noexec_s,mega,sv48,*,*> 0 1 1 ZERO
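For reference, the condition this test targets can be sketched in a few lines of Python. This is only an illustrative model, not code from the test suite. It assumes the PTE flag layout from the RISC-V privileged specification (V, R, W, X in bits 0 to 3) and assumes that `exec`/`noexec` in the bin names tracks the X bit; the helper names are made up.

```python
# PTE flag bits per the RISC-V privileged specification.
PTE_V, PTE_R, PTE_W, PTE_X = 1 << 0, 1 << 1, 1 << 2, 1 << 3

# Sv48 leaf level -> page-size name as it appears in the coverage bins.
SV48_LEVEL_NAMES = {3: "tera", 2: "giga", 1: "mega", 0: "kilo"}


def is_reserved_rwx(pte: int) -> bool:
    """W=1 with R=0 is a reserved encoding (XWR = 010 or 110)."""
    return bool(pte & PTE_W) and not (pte & PTE_R)


def expected_bins():
    """Hypothetical enumeration of the bins the test should hit.

    Assumption: 'exec'/'noexec' in the bin name reflects the X bit.
    """
    bins = []
    for level in (3, 2, 1, 0):
        for x_bit in (0, PTE_X):
            pte = PTE_V | PTE_W | x_bit  # writable but not readable
            assert is_reserved_rwx(pte)
            kind = "leaflvl_exec_s" if x_bit else "leaflvl_noexec_s"
            bins.append((kind, SV48_LEVEL_NAMES[level]))
    return bins


print(len(expected_bins()))  # 8 bins: two reserved encodings at four levels
```

The eight combinations line up with the eight bins listed in the report above, so every one of them should be reachable by the test.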
The same permissions are checked at all levels, but the giga and mega pages lack coverage. The test executes in the order tera, giga, mega, kilo. If I change the order to giga, tera, mega, kilo, the coverage becomes:
Cross PTE_res_rwx_s_i_exec
bin <leaflvl_noexec_s,kilo,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,kilo,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_noexec_s,giga,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,giga,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,mega,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,tera,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_noexec_s,mega,*,*,*> 0 1 2 ZERO
bin <leaflvl_noexec_s,tera,*,*,*> 0 1 1 ZERO
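To make the order dependence easier to see, the two reports can be compared mechanically. The snippet below is a rough sketch with hypothetical helper names; the bin lines are copied from the two reports in this comment, and the parser only understands this particular text layout.

```python
import re

TERA_FIRST = """
bin <leaflvl_noexec_s,kilo,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,kilo,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,giga,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,mega,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_noexec_s,tera,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,tera,sv48,ins_page_fault,set>
bin <leaflvl_noexec_s,giga,sv48,*,*> 0 1 1 ZERO
bin <leaflvl_noexec_s,mega,sv48,*,*> 0 1 1 ZERO
"""

GIGA_FIRST = """
bin <leaflvl_noexec_s,kilo,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,kilo,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_noexec_s,giga,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,giga,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,mega,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_exec_s,tera,sv48,ins_page_fault,set> 1 - Covered
bin <leaflvl_noexec_s,mega,*,*,*> 0 1 2 ZERO
bin <leaflvl_noexec_s,tera,*,*,*> 0 1 1 ZERO
"""


def zero_bins(report: str) -> set:
    """Return the (permission, page size) pairs reported as ZERO."""
    missing = set()
    for line in report.splitlines():
        match = re.match(r"\s*bin <([^>]*)>\s*(.*)", line)
        if match and "ZERO" in match.group(2):
            fields = match.group(1).split(",")
            missing.add((fields[0], fields[1]))
    return missing


print(sorted(zero_bins(TERA_FIRST)))
print(sorted(zero_bins(GIGA_FIRST)))
```

In both orderings the missing bins are the `leaflvl_noexec_s` cases for the second and third levels executed, while the first and last levels are covered. That points at something stateful between consecutive fetch faults rather than at the page size itself.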
The same sort of odd behavior occurred while we were working on SV32, and Huda Sajjad fixed it (it was an issue in wallyTracer.sv). Some other coverpoints show similar behavior. For example, sv48_canonical_S_mode.S generates the following report:
Cross sv48_canonical_exec_s
bin <leaflvl_s,giga,not_zero_and_not_all_ones,sv48,ins_page_fault,set> Covered
bin <*,kilo,*,*,*,*> 0 1 1 ZERO
bin <*,mega,*,*,*,*> 0 1 1 ZERO
bin <*,tera,*,*,*,*> 0 1 1 ZERO
All these tests are on my fork: https://github.com/Zain2050/riscv-arch-test/tree/sv48/riscv-test-suite/rv64i_m/vm_sv48/src
The tests work fine on Sail. I also used lockstepverbose and $display to observe the signals. Lockstepverbose shows that an instruction fetch exception occurred, but the signals for the missing bins are not correct. For example, the missing bins show a physical address of 235, which is not even in our address range. This led me to think that the tracer isn't propagating the signals correctly.
@rosethompson I looked at your recommended changes in WallyTracer. The only one that worked was changing SelHPTW to ~GatedStallW, for the memory stage only. It fixed the issue for sv48, but the coverage dropped for sv32. This seemed very strange, so I looked at the waveform to see what is actually going on. It turns out there's an extra TLB miss for the missing bins.
The miss is supposed to be on fetch only (JALR), but we're getting another miss prior to it.
After that we get some stalls and then the correct TLB miss for JALR.
I don't know why we're getting two misses. lockstepverbose shows only one miss and only one TLB entry being created. For SV48, we get correct PTE, VA and PA values in the second page table walk, while for SV32 we get correct values in the first walk and wrong values in the second. That's why the tracer works for only one of RV32 or RV64 at a time. The current tracer grabs the values from the first page table walk (which works for sv32), while the new changes grab the values from the second walk (which works for sv48).
We could either add conditionals in the tracer to handle rv32 and rv64 differently, or use another solution I came up with. We can do this for the memory stage:
flopenrc #(P.XLEN) IVAdrWReg (clk, reset, 1'b0, SelHPTW | (FlushM & ~FlushW), IVAdrM, IVAdrW);
In this scenario we get a few cycles in which the previous values overlap with the correct Mcause value, and afterwards they change to the values from the new page table walk. As a result, it works for both rv32 and rv64.
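To spell out what the changed enable does, here is a small behavioral model in Python. It is a sketch under assumptions, not the actual RTL: it assumes `flopenrc` takes its ports in the order (clk, reset, clear, en, d, q), that reset has priority, and that the register only changes when `en` is high. The signal trace is invented purely for illustration and does not come from a real waveform.

```python
class FlopEnRC:
    """Assumed flopenrc behavior: reset wins, then en gates clear/load."""

    def __init__(self):
        self.q = 0

    def tick(self, reset: bool, clear: bool, en: bool, d: int) -> int:
        if reset:
            self.q = 0
        elif en:
            self.q = 0 if clear else d
        return self.q


def capture_enable(sel_hptw: bool, flush_m: bool, flush_w: bool) -> bool:
    """Proposed enable: SelHPTW | (FlushM & ~FlushW)."""
    return sel_hptw or (flush_m and not flush_w)


# Made-up trace: (SelHPTW, FlushM, FlushW, IVAdrM)
trace = [
    (True,  False, False, 0x1000),  # page table walk active: capture
    (False, False, False, 0x2000),  # nothing enabled: hold
    (False, True,  False, 0x3000),  # M flushed, W not flushed: capture
    (False, True,  True,  0x4000),  # both flushed: hold
]

ivadr_w_reg = FlopEnRC()
ivadr_w = [
    ivadr_w_reg.tick(False, False, capture_enable(s, fm, fw), d)
    for (s, fm, fw, d) in trace
]
print([hex(v) for v in ivadr_w])  # ['0x1000', '0x1000', '0x3000', '0x3000']
```

With only `SelHPTW` as the enable, the third cycle would hold 0x1000. The extra `FlushM & ~FlushW` term is what lets the register pick up a new value while the memory stage is being flushed.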
@rosethompson What do you say about this?
@Zain2050 can you post the .elf that causes the two TLB misses? You'll probably have to gzip it first.
@Zain2050 Can you include the elf so I can reproduce the bug? I bet I can debug this really fast if I have the elf.
Fixed with PR #1497