riscv-v-spec

precise vector traps

Open Zissi-Lei opened this issue 3 years ago • 1 comments

Hi, on page 30 of the spec, I see "For implementations with precise vector traps, exceptions on indexed-unordered stores must also be precise." How do indexed-unordered stores ensure precise traps? Thanks for your time.

Zissi-Lei avatar Dec 20 '21 14:12 Zissi-Lei

Well, in the case where the instruction doesn't trigger any exceptions, the values may be stored in an arbitrary order.

However, if an exception is triggered, then the rules in "Exception Handling" and "Precise vector traps" come into play, and in particular:

the vstart CSR contains the element index on which the trap was taken

and:

  1. any operations within the trapping vector instruction affecting result elements preceding the index in the vstart CSR have committed their results

  2. no operations within the trapping vector instruction affecting elements at or following the vstart CSR have altered architectural state except if restarting and completing the affected vector instruction will nevertheless produce the correct final state.

So at the very least, the implementation must wait for the writes of the earlier elements to complete before actually taking the trap.
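As a rough illustration of those two rules, here is a toy Python model of an indexed store that takes a precise trap. Everything in it (`ToyMemory`, `StoreFault`, `indexed_store_precise`) is a hypothetical name invented for this sketch, and writing elements strictly in index order is just the simplest way to satisfy the rules, not something the spec requires of real hardware.

```python
from dataclasses import dataclass, field


class StoreFault(Exception):
    """Hypothetical stand-in for a synchronous store exception."""


@dataclass
class ToyMemory:
    # Stores to addresses in `unwritable` fault; all other addresses
    # behave like plain (idempotent) memory.
    unwritable: set = field(default_factory=set)
    cells: dict = field(default_factory=dict)

    def store(self, addr, value):
        if addr in self.unwritable:
            raise StoreFault(addr)
        self.cells[addr] = value


def indexed_store_precise(mem, base, offsets, values, vstart=0):
    """Store values[i] to base + offsets[i] for every i >= vstart.

    Returns the new vstart: 0 if the instruction completed, otherwise
    the index of the element that trapped.
    """
    for i in range(vstart, len(values)):
        try:
            mem.store(base + offsets[i], values[i])
        except StoreFault:
            # Elements before i have committed; element i and all later
            # elements have not touched memory.
            return i
    return 0


mem = ToyMemory(unwritable={0x1008})
offsets = [0x10, 0x00, 0x08, 0x18]
values = [1, 2, 3, 4]

# Element 2 targets 0x1008, which faults, so the trap reports vstart = 2.
trap_vstart = indexed_store_precise(mem, 0x1000, offsets, values)

# The handler "fixes" the page, then the instruction restarts at vstart.
mem.unwritable.clear()
final_vstart = indexed_store_precise(mem, 0x1000, offsets, values, trap_vstart)
```

After the restart, memory holds all four values, which is the "correct final state" that rule 2 refers to.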

Furthermore,

In idempotent memory regions, vector store instructions may have updated elements in memory past the element causing a synchronous trap. Non-idempotent memory regions must not have been updated for indices equal to or greater than the element that caused a synchronous trap during a vector store instruction.

(Idempotent memory is just ordinary memory where you can write values and read them back, whereas accesses to non-idempotent memory, such as memory-mapped device registers, have side effects.)

Which means that, if the implementation supports paging and non-idempotent memory, it can't just start unordered virtual writes willy-nilly; before actually writing out a given element, it needs to know one of two things:

  • None of the prior-element writes will fault
  • This write is to idempotent memory

Both of these determinations tend to require page-table lookups: the former to make sure the target pages are writable (and some fiddly stuff), the latter to verify that the target address refers to idempotent memory.
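The two conditions can be written down as a small predicate. This is only a sketch of the decision, assuming the implementation keeps per-element bookkeeping; the names `may_issue_write`, `known_no_fault` and `is_idempotent` are made up for illustration and do not come from the spec.

```python
def may_issue_write(i, known_no_fault, is_idempotent):
    """Return True if element i's write may safely be issued now.

    known_no_fault[j] is True once element j's translation and permission
    checks are known to succeed (hypothetical bookkeeping).
    is_idempotent[j] is True if element j targets idempotent memory.
    """
    # Writes to idempotent memory may land past a faulting element.
    if is_idempotent[i]:
        return True
    # A non-idempotent write must wait until no earlier element can trap.
    return all(known_no_fault[:i])


# Element 1 has not been checked yet; element 2 targets a device register.
known_no_fault = [True, False, False]
is_idempotent = [True, True, False]

early = may_issue_write(1, known_no_fault, is_idempotent)    # ordinary memory
blocked = may_issue_write(2, known_no_fault, is_idempotent)  # must wait
```

Once element 1 is known not to fault, the same call for element 2 returns True.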

Now, none of this is actually specific to the "indexed-unordered" mode: all four modes are subject to the above rules, and the "unit-stride" and "strided" modes are equally unordered. ("indexed-ordered" naturally has to issue ordered writes in any case.)

What is specific to "indexed-unordered" is that the virtual addresses don't form a simple linear sequence, so implementers might be tempted to try issuing writes in (for example) address order to improve locality of reference, without stopping to consider what could go wrong.
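To make the hazard concrete, here is a toy model of a naive implementation that issues writes in ascending address order and stops at the first fault. The function names and the address values are invented for this sketch; it only shows how the precise-trap rules can be broken, not how any real implementation behaves.

```python
def naive_address_order_store(addrs, faulting):
    """Issue element writes in ascending address order.

    Returns (vstart, written): the index of the faulting element (or None
    if nothing faulted) and the element indices written before stopping.
    """
    written = []
    for i in sorted(range(len(addrs)), key=lambda k: addrs[k]):
        if addrs[i] in faulting:
            return i, written
        written.append(i)
    return None, written


def precise_trap_violations(vstart, written, addrs, non_idempotent):
    """Check the trap state against the two precise-trap rules."""
    # Elements before vstart must have committed before the trap.
    missing = [i for i in range(vstart) if i not in written]
    # Non-idempotent memory must not be updated at or after vstart.
    illegal = [i for i in written
               if i >= vstart and addrs[i] in non_idempotent]
    return missing, illegal


# Element 1 faults; element 2 targets a (hypothetical) device register
# that happens to sit at the lowest address.
addrs = [0x3000, 0x2000, 0x1000]
vstart, written = naive_address_order_store(addrs, faulting={0x2000})
missing, illegal = precise_trap_violations(
    vstart, written, addrs, non_idempotent={0x1000})
```

Here the device register at element 2 is written before the fault at element 1 is discovered, and element 0 has not been written at all when the trap is taken, so both rules are violated.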

My conclusion is that the sentence you quote,

For implementations with precise vector traps, exceptions on indexed-unordered stores must also be precise.

is intended to make it absolutely clear that the rules apply even in this case, and to remind implementers to be careful here.

This is worth spelling out, since it would seem perfectly reasonable for the spec to have allowed "indexed-unordered" stores to have, in fact, done non-idempotent writes beyond vstart when taking an exception; any cases where that would be problematic could presumably just use "indexed-ordered" stores instead, with perhaps a slight performance hit. (Since "unit-stride" and "strided" modes have no ordered equivalent, the same argument does not apply to them.)

SamB avatar Mar 28 '22 18:03 SamB