sail-riscv icon indicating copy to clipboard operation
sail-riscv copied to clipboard

Create generalized interface for repetitive load/store instructions

Open nadime15 opened this issue 9 months ago • 9 comments

Problem

Many current instructions and probably many future ones are based on load and store operations. I have noticed that the same code is repeated over and over again. Some examples are in Zcmp (#730), Zcmt (#757), Zilsd/Zcl (#765) and in extensions that are already implemented and lead to:

  • Redundant code
  • Higher maintenance burden
  • Increased risk of inconsistencies when making changes
  • High number of lines of code (In case code ends up in the specification)

Proposed solution

Create a generalized interface or template mechanism that can handle the common aspects of load/store operations.

This abstraction would accept common arguments, like AccessType, Aquire, Release etc.

Expected benefits

  • Reduced code duplication
  • Improved maintainability
  • Easier implementation of new load/store variants
  • More consistent behavior across related instructions
  • Better testability of core load/store logic

As we also plan to integrate Sail Code into the specification we would also reduce the number of lines for all affected instructions (The generalized interface will be provided as part of the appendix)

Implementation approach

I thought of something like I did in Zcmt (#757) but only in better (return error cause + Retired Illegal, in case you want to take further action)

type target_address = bits(xlen)

union FJT_Result = {
  FJT_Success : target_address,
  FJT_Failure : unit
}

function fetch_jump_table(table_address : bits(xlen)) -> FJT_Result = {
  /* Fetching jump table address needs execute permission */
  match ext_data_get_addr_from_bits(table_address, Execute(), xlen_bytes) {
    Ext_DataAddr_Error(e)  => { ext_handle_data_check_error(e); FJT_Failure() },
    Ext_DataAddr_OK(vaddr) => {
      if   check_misaligned(vaddr, size_bytes(xlen_bytes))
      then { handle_mem_exception(vaddr, E_Load_Addr_Align()); FJT_Failure() }
      else match translateAddr(vaddr, Execute()) {
        TR_Failure(e, _) => { handle_mem_exception(vaddr, e); FJT_Failure() },
        TR_Address(paddr, _) =>
          match mem_read(Execute(), paddr, xlen_bytes, false, false, false) {
            Ok(result) => { FJT_Success(result) },
            Err(e)     => { handle_mem_exception(vaddr, e); FJT_Failure() },
          }
      }
    }
  }
}

I would like to discuss the idea of generalizing load and store operations to reduce code redundancy, as many instructions essentially perform the same function. Before I start working on this, I would like to confirm whether this approach makes sense and if there is agreement.

nadime15 avatar Mar 04 '25 18:03 nadime15

Yeah I've thought about this before. Especially wrt. unaligned loads/stores.

In theory it's not too hard and we have implemented a version of this in our fork. The showstopper is the call to announce the effective address for stores. As I understand it it has to happen before the store value is read, but Sail doesn't have first class functions so you can't pass a lambda function or whatever into your generic store to get it to read the register at the right point.

It only matters for the concurrency model which we aren't using so I don't really know too much about it.

But yeah I agree it's annoying and it makes the Sail code look more complicated than it is.

Timmmm avatar Mar 04 '25 18:03 Timmmm

I did this in https://github.com/riscv/sail-riscv/pull/467, creating generic read/write functions that handled misaligned accesses better. In that PR I just ignored the effective address thing.

You can use scattered functions to work around the lack of lambdas if you want, but it does make the code a little bit more complex. The general pattern would be:

enum clause E = A

function clause foo() = {
  ...
  bar(A)
}

function clause baz(A, x) = ...

then in the file with bar it calls

function bar(A) = {
  let x = ...;
  baz(A, x)
  ...
}

this is effectively the same as passing λx. baz(A, x) as a first class function.

Alasdair avatar Mar 04 '25 19:03 Alasdair

Ah clever! So... can we ignore the effective address thing? I don't really care about it personally, but presumably some people do? How much does it matter?

Timmmm avatar Mar 04 '25 20:03 Timmmm

Right now nobody is really using it. We could work around it by having the effective address announce take an explicit set of register dependencies if needed, so I don't think it should get in the way of writing the spec in the cleanest way.

Alasdair avatar Mar 04 '25 21:03 Alasdair

Okay, this looks good! I’m just wondering if it might make sense to generalize LOAD/STORE even further by passing the effective addresses directly. The reason I ask is that Zcmt, for example, reads directly from instruction memory without storing the address in a register. You can’t use LOAD directly, nor can I use the current implementation of LOAD because it relies on ext_data_get_addr.

Another point is that, while I'm not sure how much it matters right now, the imm value for STORE and LOAD is limited to a maximum of 12 bits. I understand this is how the base load and store are defined, but it means that all instructions calling these operations must respect this limit as well. This could be fine, but I wonder if, at some point, future instructions might be defined to allow larger offsets.

We could define some wrapper functions that use our current LOAD/STORE functions and call something like LOAD_RAW (I am pretty sure there is a better name 😄) and define a layered approach.

nadime15 avatar Mar 05 '25 17:03 nadime15

generalize LOAD/STORE even further by passing the effective addresses directly

I like this.

I just updated my pr of zilsd/zclsd (all act tests now passed). Taking zilsd_ld as an example, it will perform two loads internally. Between the loads, the rs1 register may be modified if rs1 == rd . I had to create a temp var to save rs1_val(the base address), and create a corresponding load function that directly passes the address.

function clause execute ZILSD_LD(imm, rs1, rd) = {
  if rd != zreg then {
    let base_val = X(rs1);
    let _ = load_imm(imm, base_val, rd, WORD);
    load_imm(imm+4, base_val, rd+1, WORD)
  } else {
    RETIRE_SUCCESS
  }
}

trdthg avatar Apr 08 '25 12:04 trdthg

That would break the current sail-cheri-riscv.

jrtc27 avatar Apr 08 '25 13:04 jrtc27

Yeah I think you want to do something like this (with a hypothetical read() function that handled CHERI etc.

match read(rs1, imm, WORD) {
  Ok(val_0) => {
    match read(rs1, imm + 4, WORD) {
      Ok(val_1) => {
        X(rd) = val_0;
        X(rd+1) = val_1;
...

Or if the read() interface were augmented to support a maximum memory operation size (probably a good idea) you could do a single call:

match read(rs1, imm, WORD, /* max memory op size = */ WORD) {
   Ok(val) => {
     X(rd) = val[31..0];
     X(rd+1) = val[63..32];
...

I think we probably do want such a function because handling pointer masking, CHERI, triggers, etc. is quite tedious if you have to do it everywhere that accesses memory, and it's the kind of detail that you'd want to hide e.g. for including code in the spec.

Timmmm avatar Apr 08 '25 13:04 Timmmm

The debug module allows memory accesses with the exact same view and permissions as normal loads/stores on the selected hart. With the current interface this wont work, since we still pass register + offset as a function argument (it also allows direct memory access without vaddr->paddr).

I would like to split vmem_read and vmem_write, and wrap them around a new vmem_write_* function that just takes the effective address. But ext_data_get_addr() is an issue: in my use case, if I pass a direct address, I would skip the checks it performs. Since debug module memory accesses are supposed to behave as if they came from a halted hart, I have to take those checks into account.

So, two questions:

  1. Any objections if I break vmem_read and vmem_write apart and wrap them around a vmem_* function that allows direct access? Or should we just pass the effective address everywhere?

  2. Its not the first time we had to work around ext_data_get_addr(). I would be in favor of finally removing it. If the CHERI spec ever makes it into this model, we can revisit this part again if necessary.

nadime15 avatar Sep 29 '25 15:09 nadime15