riscv-v-spec
allow op-result and either source as agnostic values - Revisit agnosticism 2
@knightsifive opened #650 to allow zero as a tail-fill value.
The discussion noted that not all values are created equal as tail-agnostic values.
Specifically, bleed-through of data from another concurrent process is not allowed. [Consider multiple harts sharing the same vector unit with unsafe renaming of register segments.] The current plan of 1s fill or destination register undisturbed guards against this happenstance. There are other valid candidates, but there has to be a limited set of them, or else the bleed-through cannot be detected.
There are specific candidates that can in some circumstances reduce the microarchitecture complexity.
- the operation result. Typically, when VLEN exceeds a hypothetical "SLEN", a section slice will process n elements in parallel, generating results for the whole segment. When tu is stipulated and the microarchitecture does register renaming, the tail elements in the section may need to be post-populated with data from the originally named destination register. When ta is specified, retrieval of the original values is not required, as 1s can be substituted. However, generating the 1s costs something, and gating them is a further cost. If the results of the operation on the source fields were allowed, the tail fill of the segment could be more efficient.
- the source vector(s) value. As above, providing a value already at hand may be less expensive than generating a new one. The ALU for the segment slice may already have a bypass to port in the data for tail undisturbed. The source vector (or one of the source vectors) could be used for that value.
- the source scalar value. Typically the scalar value is splatted across the lines/columns of the segment slice. It may be inexpensive to shove that value into the tail rather than 1s.
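To make the three candidates concrete, here is a minimal Python sketch of a vector-scalar add with a selectable tail fill. It is only an illustrative model, not anything from the spec: the `TailFill` enum, the `vadd_vx` helper, and the 8-bit element width are all assumptions made for the example.

```python
from enum import Enum


class TailFill(Enum):
    """Hypothetical names for the fill choices discussed above."""
    UNDISTURBED = "undisturbed"  # tu: keep the old destination element
    ONES = "ones"                # ta as currently specified: all 1s
    OP_RESULT = "op_result"      # candidate: result of the operation
    SRC_VECTOR = "src_vector"    # candidate: source vector element
    SRC_SCALAR = "src_scalar"    # candidate: splatted source scalar


def vadd_vx(vd, vs2, rs1, vl, fill, sew=8):
    """Model vd[i] = vs2[i] + rs1 for i < vl; fill the tail per `fill`."""
    mask = (1 << sew) - 1
    out = []
    for i in range(len(vd)):
        result = (vs2[i] + rs1) & mask
        if i < vl:
            out.append(result)        # body element: always the result
        elif fill is TailFill.UNDISTURBED:
            out.append(vd[i])
        elif fill is TailFill.ONES:
            out.append(mask)
        elif fill is TailFill.OP_RESULT:
            out.append(result)        # the slice computed it anyway
        elif fill is TailFill.SRC_VECTOR:
            out.append(vs2[i])
        else:                         # TailFill.SRC_SCALAR
            out.append(rs1 & mask)
    return out


old_vd = [9, 9, 9, 9]
vs2 = [1, 2, 3, 4]
for fill in TailFill:
    print(fill.name, vadd_vx(old_vd, vs2, 10, vl=2, fill=fill))
```

In every case the first `vl` elements are identical; only the tail differs, which is the whole extent of what "agnostic" leaves open.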
The other risk that was identified in #650 was malicious side-channel information communicated in the pattern of fill values. This is possible under the current spec: any combination of valid fill values [currently 1s and undisturbed values] is allowed. To reduce the effectiveness of this potentially malicious channel, we could add a requirement that once a constant is chosen as fill [currently only 1s allowed], that constant must be used for the remainder of the fill. If more fill values than undisturbed and 1s are allowed, minor variations of this requirement can balance performance against exposure.
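The proposed "once 1s, always 1s" rule could be checked along the following lines. This is a sketch under my own assumptions (8-bit elements, the hypothetical name `tail_fill_is_consistent`, and the reading that a tail element whose old value was already all 1s is ambiguous and does not by itself commit the implementation to the constant).

```python
def tail_fill_is_consistent(old_vd, new_vd, vl, sew=8):
    """Return True if the tail of new_vd uses only undisturbed values
    or 1s, and never goes back to undisturbed after a 1s fill."""
    ones = (1 << sew) - 1
    seen_ones_fill = False
    for old, new in zip(old_vd[vl:], new_vd[vl:]):
        is_ones = new == ones
        is_undisturbed = new == old
        if not (is_ones or is_undisturbed):
            return False           # not a currently valid fill value
        if seen_ones_fill and not is_ones:
            return False           # switched away from the chosen constant
        if is_ones and not is_undisturbed:
            seen_ones_fill = True  # unambiguous 1s fill: constant is chosen
    return True


old = [5, 6, 7, 8, 9]
print(tail_fill_is_consistent(old, [0, 6, 7, 255, 255], vl=1))  # True
print(tail_fill_is_consistent(old, [0, 255, 7, 255, 255], vl=1))  # False
```

The second call fails because the fill alternates between 1s and undisturbed, which is exactly the kind of pattern that could carry information.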
Adding a new alternative for agnostic carries a documentation/verification/compatibility cost. Forcing a 1 is very cheap.
In case 1, generating the result will cost more than filling with a constant value in renamed architectures that have temporally long vectors, where the fill is done with renaming and no functional-unit occupancy. For the case of a renamed architecture with temporally short vectors (or, more generally, where functional-unit occupancy is independent of vl), using the unmasked result could be very slightly beneficial, but the saving is only an OR gate on each bit of the data path, which can be folded into mux structures at very low cost; the mux control is needed anyway to support tu. However, note that for some implementations of floating-point, generating the correct result value might cause pipeline hiccups on subnormals or other special values.
Case 2 will generally cost additional logic - I struggle to see how this would ever be a better implementation choice than just OR-ing in the 1. If the source is not being read to copy into destination, then you need to somehow drive that input of the mux.
Case 3 needs a mux that wasn't there before in general.
As for a possible security covert channel in allowing different pattern fills, implementations can ensure that the pattern of fill values is not influenced by prior activity in a different context. That a thread-migration interrupt was taken is generally knowable from other timing sources, so a change in fill pattern does not really communicate new information.
W.r.t. #650, adding explicit tail-zero is a possible extension, but it should be supported explicitly, not incidentally, if sufficiently useful.
Oops: I had not been auto logged into Github. So I wondered why the closed issue could not be reopened by me. So I expected that I couldn't just make the clarifying comment, and so thought I had to request it be opened.
However, it does provide an opportunity to consider the etiquette relevant to GitHub postings and its relevance to our "inclusive policy", which I would not have considered otherwise.
I appreciate Krste's diligence in addressing open issues.
However, I believe this issue is prematurely closed.
I have further comment that may be instructive to those trolling these archives for implementation or other direction.
Is it not also customary to allow the person opening an issue opportunity to voluntarily close it?
[good etiquette. If nothing else it gives an impression of (due) consideration to the person opening the issue.
It certainly gives the person opening the issue opportunity to clarify if they think the issue was not fully understood.
I am grateful that Krste considered my concern sufficiently clear and complete enough that no additional clarification is necessary (even possible).
On prior occasions he has wondered what I was talking about, so this implies some progress on my part. (I did try to make it clear and complete).
But still I think there are some nuances that are lost in my original verbiage.]
On 2021-03-12 3:20 a.m., Krste Asanovic wrote:
Closed #651 https://github.com/riscv/riscv-v-spec/issues/651.
I was indeed perhaps too hasty in closing this issue, so reopening for comment.
I believe there is a value in allowing masks to always write their whole result register ignoring vector length value, vl, as they are always tail-agnostic and this would eliminate the need for any control of write back values based on vector length in some implementations.
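As a rough illustration of the mask case, the sketch below compares a mask-producing compare that honors vl with one that writes the computed result for every element position. The function name `vmseq_vv` is borrowed from the instruction mnemonic, but the `write_whole_register` flag and the list-of-bits representation are assumptions made purely for this model.

```python
def vmseq_vv(vs2, vs1, vl, write_whole_register):
    """Model a mask-producing equality compare, one mask bit per element.

    With write_whole_register=False the tail bits are filled with 1s,
    which requires vl-based control of the write-back. With True, the
    computed compare result is written for every element, so no vl
    control is needed on the write-back path.
    """
    computed = [int(a == b) for a, b in zip(vs2, vs1)]
    if write_whole_register:
        return computed
    return computed[:vl] + [1] * (len(computed) - vl)


vs2 = [1, 2, 3, 4]
vs1 = [1, 0, 3, 0]
print(vmseq_vv(vs2, vs1, vl=2, write_whole_register=False))  # [1, 0, 1, 1]
print(vmseq_vv(vs2, vs1, vl=2, write_whole_register=True))   # [1, 0, 1, 0]
```

Both outputs agree on the first `vl` bits; since mask tails are always agnostic, software cannot rely on the rest either way.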
For data results where tail-undisturbed must be supported, there is little cost to PoR - the length control logic is anyway required and some mux implementations will support this without additional gates. We would like to reduce implementation diversity where possible, even though tail-agnostic already opens up the issue of implementation-specific behavior.
My primary observation is that, in addition to undisturbed, there are 3 other readily accessible values that are deterministic and verifiable yet still qualify as "agnostic":
the result of the operation and the two input values.
I believe we should consider each of these for more than just operational efficiency.
Part of that consideration is that the plurality of possible values raises the specter of side-channel risk.
We should at least mention the risk, which is already present, so that implementors will consider it in their design.
I suggested that specific criteria be met to avoid the likelihood and to raise awareness [a specific requirement is harder to avoid than an obscure warning].
Namely, once 1s are established as the fill, they remain the fill.
But regarding operational efficiency: It is not now known what an implementation may find efficient.
We can envision specific implementations to determine plausibility.
There is no mandate to use each of these 5 values for fill; thus, if an implementor deems a given option a poor fit for their implementation, they won't use it.
The verification is write once use everywhere.
The incremental cost is low when only 5 are checked:
- undisturbed [destination as source],
- either input value [scalar or vector],
- result of operation on the inputs, and
- 1s.
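A "write once, use everywhere" check of this kind might look like the sketch below. It is a hypothetical model, not an existing verification tool: `check_tail`, its argument layout, and the 8-bit element width are assumptions, and a scalar source is represented as an already-splatted list.

```python
import operator


def check_tail(old_vd, new_vd, vs2, src1, op, vl, sew=8):
    """Accept a tail element only if it equals one of the 5 allowed values:
    undisturbed, either input, the operation result, or all 1s."""
    mask = (1 << sew) - 1
    for i in range(vl, len(new_vd)):
        allowed = {
            old_vd[i],                     # undisturbed
            vs2[i],                        # vector input
            src1[i],                       # second input (scalar splat or vector)
            op(vs2[i], src1[i]) & mask,    # result of the operation
            mask,                          # 1s
        }
        if new_vd[i] not in allowed:
            return False
    return True


old_vd = [9, 9, 9]
vs2 = [1, 2, 3]
src1 = [10, 10, 10]  # a scalar 10, splatted
print(check_tail(old_vd, [11, 9, 255], vs2, src1, operator.add, vl=1))  # True
print(check_tail(old_vd, [11, 0, 9], vs2, src1, operator.add, vl=1))    # False
```

Because the allowed set is small and computed from values the reference model already has, any other tail value (such as the 0 in the second call) is flagged as potential bleed-through.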
Computed mask results added as option in https://github.com/riscv/riscv-v-spec/commit/fd4565e18d882d3ad2df8592be81034bafac1b71
If there's sufficient interest in expanding the set for data, this could be explored as compatible option post v1.0, so labeling this as post 1.0.