riscof icon indicating copy to clipboard operation
riscof copied to clipboard

rv64i_m ebreak.S - DUT and Ref run different test programs

Open Peter-Herrmann opened this issue 1 year ago • 1 comments

I am running an RV64IMZicsr_Zifencei DUT against SAIL and I am struggling to understand why the elf's generated for the Ref and DUT differ on this single instruction:

00000000800008fc <chk_Mtval>:
    800008fc:   00f57513            andi    a0,a0,15        # a0 contains mcause (0x3 for ebreak)
    80000900:   00000437            lui	    s0,0x0          # s0 = 0
    80000904:   00040413            mv      s0,s0           # - 
    80000908:   00b41413            slli    s0,s0,0xb       # - 
    8000090c:   00040413            mv      s0,s0           # - 
    80000910:   00b41413            slli    s0,s0,0xb       # - 
    80000914:   02c40413            addi    s0,s0,44        # s0 = 0x2c
    80000918:   00a41413            slli    s0,s0,0xa       # s0 = 0x2c << 0xa = 0xB000
    8000091c:   0fb40413            addi    s0,s0,251       # s0 = 0xB0FB       <- This instruction's immediate is different between DUT (0x0FB) and REF (0x0F3)
    80000920:   00a45433            srl     s0,s0,a0        # s0 = 0xB0FB >> 0x3 = 0x161F (DUT) or 0x161E (Ref)
    80000924:   03f41413            slli    s0,s0,0x3f      # s0 = 0x161F << 0x3f = 0x8000000000000000(DUT) or 0x0 (Ref)
    80000928:   04045c63            bgez    s0,80000980 <sv_Mtval>

This section of code checks mcause against a mask, 0xB0FB or 0xB0F3.

  • If the mask bit associated with the given mcause value is set (i.e 3rd bit for mcause = 3), then it proceeds to check if mtval holds the PC for the exception thrown, writing the offset from the beginning of the test to the signature file.
  • If the same bit is clear, then the test assumes that mtval contains zero and writes that to the signature.

This behavior seems to correspond to the mtval_update config in the riscv-config, but changing this value has no effect.

Essentially what I am running into is that the Ref test executable seems to always assume that mtval=0 after EBREAK (without checking), while the DUT executable expects mtval=PC after EBREAK.

If I modify the DUT's elf so that this mask is the same as the instruction in the Ref's elf, the test passes, otherwise the test fails. Likewise, If I modify the Ref's elf file so that the mask matches the DUT, the test passes.

  • If both tests have 8000091c: 0fb40413, the signatures match
  • If both tests have 8000091c: 0f340413, the signatures match
  • If the tests have different values here (specifically if bit 3 is different) the signatures will not match

It does not seem correct to me that the DUT and Ref could possibly have different expectations during the test like this.

My isa.yaml:

hart_ids: [0]
hart0:
  ISA: RV64IMZicsr_Zifencei
  physical_addr_sz: 56
  User_Spec_Version: '2.3'
  supported_xlen: [64]
  hw_data_misaligned_support: false

and my platform.yaml:

mtime:
  implemented: false
mtimecmp:
  implemented: false
nmi:
  label: nmi_vector
reset:
  label: reset_vector

And here is the source for my project if needed: rv-application-rough-RV64IMZicsr_Zifencei.zip

Peter-Herrmann avatar May 09 '24 05:05 Peter-Herrmann

@Peter-Herrmann Thanks for your detailed explanation, it's helping me debug my own issues immensely. The mtval mask gets set here by the SET_REL_TVAL_MSK macro expansion.

I looked inside your zip; it looks like your SAIL plugin is a copy of the one provided by the neorv32 RISCOF plugin. In that plugin as well as the neorv32 plugin proper are the following lines:

      # Override default exception relocation list - remove EBREAK exception since NEORV32 clears MTVAL
      # when encountering this type of exception (permitted by RISC-V priv. spec.)
      print("<plugin-neorv32> overriding default SET_REL_TVAL_MSK macro (removing BREAKPOINT exception)")
      neorv32_override  = ' \"-DSET_REL_TVAL_MSK=(('
      neorv32_override += '(1<<CAUSE_MISALIGNED_FETCH) | '
      neorv32_override += '(1<<CAUSE_FETCH_ACCESS)     | '
#     neorv32_override += '(1<<CAUSE_BREAKPOINT)       | '
      neorv32_override += '(1<<CAUSE_MISALIGNED_LOAD)  | '
      neorv32_override += '(1<<CAUSE_LOAD_ACCESS)      | '
      neorv32_override += '(1<<CAUSE_MISALIGNED_STORE) | '
      neorv32_override += '(1<<CAUSE_STORE_ACCESS)     | '
      neorv32_override += '(1<<CAUSE_FETCH_PAGE_FAULT) | '
      neorv32_override += '(1<<CAUSE_LOAD_PAGE_FAULT)  | '
      neorv32_override += '(1<<CAUSE_STORE_PAGE_FAULT)   '
      neorv32_override += ') & 0xFFFFFFFF)\" '
      self.compile_cmd += neorv32_override

In your zip file, only the SAIL plugin has these lines; they're absent from your Lucid64 plugin. This results in different masks, and I'm guessing is the reason your DUT and ref diverge.

With that said, I would've figured that mtval_update would set this mask for you when generating compiler command lines, but it doesn't appear to do anything.

cc: @stnolting: Did you ever get feedback from anyone working on RISCOF, the arch-tests, or the config parser on whether adding define overrides is the correct way to change the mtval mask checked in the tests? Or did you run into the same exact problem as me with mtval mismatch, forcefully solve the problem with a #define override, and continue on your day :)? I'm going to steal your approach absent any other feedback and hope for the best :D.

cr1901 avatar Jan 15 '25 14:01 cr1901

Thanks @cr1901 - this explains my issue. Closing the issue, since it has been addressed, but there does appear to be something wrong with the mtval_update funcitonality for riscv-config.

Peter-Herrmann avatar Oct 23 '25 21:10 Peter-Herrmann