sail-riscv
sail-riscv copied to clipboard
Create generalized interface for repetitive load/store instructions
Problem
Many current instructions and probably many future ones are based on load and store operations. I have noticed that the same code is repeated over and over again. Some examples are in Zcmp (#730), Zcmt (#757), Zilsd/Zcl (#765) and in extensions that are already implemented and lead to:
- Redundant code
- Higher maintenance burden
- Increased risk of inconsistencies when making changes
- High number of lines of code (In case code ends up in the specification)
Proposed solution
Create a generalized interface or template mechanism that can handle the common aspects of load/store operations.
This abstraction would accept common arguments, like AccessType, Aquire, Release etc.
Expected benefits
- Reduced code duplication
- Improved maintainability
- Easier implementation of new load/store variants
- More consistent behavior across related instructions
- Better testability of core load/store logic
As we also plan to integrate Sail Code into the specification we would also reduce the number of lines for all affected instructions (The generalized interface will be provided as part of the appendix)
Implementation approach
I thought of something like I did in Zcmt (#757) but only in better (return error cause + Retired Illegal, in case you want to take further action)
type target_address = bits(xlen)
union FJT_Result = {
FJT_Success : target_address,
FJT_Failure : unit
}
function fetch_jump_table(table_address : bits(xlen)) -> FJT_Result = {
/* Fetching jump table address needs execute permission */
match ext_data_get_addr_from_bits(table_address, Execute(), xlen_bytes) {
Ext_DataAddr_Error(e) => { ext_handle_data_check_error(e); FJT_Failure() },
Ext_DataAddr_OK(vaddr) => {
if check_misaligned(vaddr, size_bytes(xlen_bytes))
then { handle_mem_exception(vaddr, E_Load_Addr_Align()); FJT_Failure() }
else match translateAddr(vaddr, Execute()) {
TR_Failure(e, _) => { handle_mem_exception(vaddr, e); FJT_Failure() },
TR_Address(paddr, _) =>
match mem_read(Execute(), paddr, xlen_bytes, false, false, false) {
Ok(result) => { FJT_Success(result) },
Err(e) => { handle_mem_exception(vaddr, e); FJT_Failure() },
}
}
}
}
}
I would like to discuss the idea of generalizing load and store operations to reduce code redundancy, as many instructions essentially perform the same function. Before I start working on this, I would like to confirm whether this approach makes sense and if there is agreement.
Yeah I've thought about this before. Especially wrt. unaligned loads/stores.
In theory it's not too hard and we have implemented a version of this in our fork. The showstopper is the call to announce the effective address for stores. As I understand it it has to happen before the store value is read, but Sail doesn't have first class functions so you can't pass a lambda function or whatever into your generic store to get it to read the register at the right point.
It only matters for the concurrency model which we aren't using so I don't really know too much about it.
But yeah I agree it's annoying and it makes the Sail code look more complicated than it is.
I did this in https://github.com/riscv/sail-riscv/pull/467, creating generic read/write functions that handled misaligned accesses better. In that PR I just ignored the effective address thing.
You can use scattered functions to work around the lack of lambdas if you want, but it does make the code a little bit more complex. The general pattern would be:
enum clause E = A
function clause foo() = {
...
bar(A)
}
function clause baz(A, x) = ...
then in the file with bar it calls
function bar(A) = {
let x = ...;
baz(A, x)
...
}
this is effectively the same as passing λx. baz(A, x) as a first class function.
Ah clever! So... can we ignore the effective address thing? I don't really care about it personally, but presumably some people do? How much does it matter?
Right now nobody is really using it. We could work around it by having the effective address announce take an explicit set of register dependencies if needed, so I don't think it should get in the way of writing the spec in the cleanest way.
Okay, this looks good! I’m just wondering if it might make sense to generalize LOAD/STORE even further by passing the effective addresses directly. The reason I ask is that Zcmt, for example, reads directly from instruction memory without storing the address in a register. You can’t use LOAD directly, nor can I use the current implementation of LOAD because it relies on ext_data_get_addr.
Another point is that, while I'm not sure how much it matters right now, the imm value for STORE and LOAD is limited to a maximum of 12 bits. I understand this is how the base load and store are defined, but it means that all instructions calling these operations must respect this limit as well. This could be fine, but I wonder if, at some point, future instructions might be defined to allow larger offsets.
We could define some wrapper functions that use our current LOAD/STORE functions and call something like LOAD_RAW (I am pretty sure there is a better name 😄) and define a layered approach.
generalize LOAD/STORE even further by passing the effective addresses directly
I like this.
I just updated my pr of zilsd/zclsd (all act tests now passed). Taking zilsd_ld as an example, it will perform two loads internally. Between the loads, the rs1 register may be modified if rs1 == rd . I had to create a temp var to save rs1_val(the base address), and create a corresponding load function that directly passes the address.
function clause execute ZILSD_LD(imm, rs1, rd) = {
if rd != zreg then {
let base_val = X(rs1);
let _ = load_imm(imm, base_val, rd, WORD);
load_imm(imm+4, base_val, rd+1, WORD)
} else {
RETIRE_SUCCESS
}
}
That would break the current sail-cheri-riscv.
Yeah I think you want to do something like this (with a hypothetical read() function that handled CHERI etc.
match read(rs1, imm, WORD) {
Ok(val_0) => {
match read(rs1, imm + 4, WORD) {
Ok(val_1) => {
X(rd) = val_0;
X(rd+1) = val_1;
...
Or if the read() interface were augmented to support a maximum memory operation size (probably a good idea) you could do a single call:
match read(rs1, imm, WORD, /* max memory op size = */ WORD) {
Ok(val) => {
X(rd) = val[31..0];
X(rd+1) = val[63..32];
...
I think we probably do want such a function because handling pointer masking, CHERI, triggers, etc. is quite tedious if you have to do it everywhere that accesses memory, and it's the kind of detail that you'd want to hide e.g. for including code in the spec.
The debug module allows memory accesses with the exact same view and permissions as normal loads/stores on the selected hart. With the current interface this wont work, since we still pass register + offset as a function argument (it also allows direct memory access without vaddr->paddr).
I would like to split vmem_read and vmem_write, and wrap them around a new vmem_write_* function that just takes the effective address. But ext_data_get_addr() is an issue: in my use case, if I pass a direct address, I would skip the checks it performs. Since debug module memory accesses are supposed to behave as if they came from a halted hart, I have to take those checks into account.
So, two questions:
-
Any objections if I break
vmem_readandvmem_writeapart and wrap them around avmem_*function that allows direct access? Or should we just pass theeffective addresseverywhere? -
Its not the first time we had to work around
ext_data_get_addr(). I would be in favor of finally removing it. If the CHERI spec ever makes it into this model, we can revisit this part again if necessary.