Artificially lowering VLEN

Open SamB opened this issue 3 years ago • 4 comments

It seems like it would be useful for privileged code to be able to ask a hart to use an effective VLEN that is lower than the actual length of its vector registers.

Just off the top of my head, this could be useful for:

  • Allowing a thread/VM to be migrated between harts with different innate VLEN
  • Controlling the space required for the vector context
  • Testing that code actually works with a variety of VLEN values (without simulator slowdown/recompiling your CPU)

Does this make sense?

SamB avatar Feb 22 '22 17:02 SamB

I agree this makes sense to allow at least as a possibility on some implementations, even if it is not required to support it -- though that doesn't seem to be hard.

Section 3.6 says vlenb is a read-only register. It could perhaps be made WARL, or a new CSR could be added. The primary effect would be to limit the vl returned by vsetvl{i}, but the effect (if any) on the whole-register load/store instructions would need to be considered.

brucehoult avatar Dec 17 '22 11:12 brucehoult

It’s a reasonable feature request, but it turns out to be microarchitecturally more challenging than it might seem because of LMUL. Artificially reducing VLMAX and calling it a day doesn’t do the right thing: you’d be using the wrong bits in the regfile (e.g. the second half of v0 instead of the first half of v1) for the latter part of the vector. This error becomes architecturally visible if you then access v1 under a smaller LMUL. It’s possible to do VLEN trimming correctly, of course—it’s just more painful than tweaking the behavior of vsetvl.
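
The LMUL problem can be made concrete with a small Python sketch. It assumes the simple layout in which element i of a register group sits at consecutive SEW-bit slots, filling one register before moving to the next; the function name and the example parameters (hardware VLEN=256, trimmed VLEN=128, SEW=32, LMUL=2, group based at v0) are illustrative choices, not anything from the spec.

```python
def element_location(i, sew_bits, base_reg, layout_vlen):
    """Return (register number, bit offset) of element i in a register group,
    assuming each register holds layout_vlen // sew_bits elements."""
    per_reg = layout_vlen // sew_bits
    return base_reg + i // per_reg, (i % per_reg) * sew_bits

HW_VLEN, TRIMMED_VLEN, SEW = 256, 128, 32
VL = 2 * TRIMMED_VLEN // SEW  # VLMAX for LMUL=2 at the trimmed VLEN -> 8

# Naive trimming: only VLMAX shrinks, the regfile is still laid out by HW_VLEN.
naive = [element_location(i, SEW, 0, HW_VLEN) for i in range(VL)]
# Correct trimming: the layout behaves as if VLEN really were TRIMMED_VLEN.
correct = [element_location(i, SEW, 0, TRIMMED_VLEN) for i in range(VL)]

for i in range(VL):
    print(i, "naive: v%d bit %d" % naive[i], "correct: v%d bit %d" % correct[i])
```

Under these assumptions, elements 4 to 7 land in the upper half of v0 in the naive scheme but in the lower half of v1 in the correct one, so a later LMUL=1 read of v1 would see different data.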

The trimming CSR should be a privileged CSR, since the context size isn’t actually reduced if userspace is allowed to expand it. Might need both [V]S and H versions to support VM migration across different VLENs, if that feature is deemed important.

aswaterman avatar Dec 17 '22 20:12 aswaterman

Note that SVE has this feature [1]. They don't have LMUL of course.

Even if it had a performance impact (when used) it could still be better / faster / more compatible than using an emulator for testing.

[1] "Privileged Exception levels can use the LEN fields of the scalable vector control registers ZCR_EL1, ZCR_EL2, and ZCR_EL3 to constrain the vector length at that Exception level and at less privileged Exception levels".

Here LEN encodes the vector length in multiples of 128 bits (the length is (LEN+1)×128 bits). Implementations can choose whether or not to support non-powers of two! Ugh. But the feature of reducing the vector length seems to be mandated.

brucehoult avatar Dec 18 '22 23:12 brucehoult

Fortunately, it's not really necessary that such a facility be even conditionally mandated by any of the other vector extensions; it basically needs to be done at a higher privilege level to be effective anyway.

Changing VLEN in an existing thread would be fraught with peril in any case, since both the current register contents and any saved register contents would only really make sense with (a) the same VLEN or (b) a larger VLEN and extreme care, I guess? It looks like it's really important to keep in mind the implications for context save/restore when designing this feature (since the kernel/VMM/hypervisor would either be running with a possibly-different VLEN than the client code, or would have to change VLEN on the fly). Should make for some interesting NOTEs.

SamB avatar Feb 07 '23 17:02 SamB