riscv-p-spec
Ternary instructions must die?
At the crypto meeting there was a question about encodings (how to check whether an instruction is ternary).
If I understood correctly, ternary instructions are no longer welcome, and those that are not yet ratified probably never will be.
CMIX, from bitmanip 0.9.3 Zbt (and the current Zpn), is quite good for SHA's Ch and Maj operations.
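For readers who have not seen the mapping, here is a minimal sketch of how Ch and Maj can each be expressed as one CMIX. It assumes the draft Bitmanip semantics of CMIX (each result bit comes from rs1 where rs2 is 1, otherwise from rs3). The helper names `cmix32`, `sha_ch`, `sha_maj`, and `check_ch_maj` are made up for illustration, and `cmix32` is a plain C model, not an intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of CMIX (assumed draft semantics): for each bit,
   take rs1 where rs2 is 1 and rs3 where rs2 is 0. */
static uint32_t cmix32(uint32_t rs1, uint32_t rs2, uint32_t rs3) {
    return (rs1 & rs2) | (rs3 & ~rs2);
}

/* Ch(x, y, z): x selects between y and z, which is CMIX directly. */
static uint32_t sha_ch(uint32_t x, uint32_t y, uint32_t z) {
    return cmix32(y, x, z);
}

/* Maj(x, y, z): where x and y agree the majority is y,
   otherwise z breaks the tie. Costs one extra xor. */
static uint32_t sha_maj(uint32_t x, uint32_t y, uint32_t z) {
    return cmix32(z, x ^ y, y);
}

/* Textbook SHA-2 definitions, used only for comparison. */
static uint32_t ch_ref(uint32_t x, uint32_t y, uint32_t z) {
    return (x & y) ^ (~x & z);
}

static uint32_t maj_ref(uint32_t x, uint32_t y, uint32_t z) {
    return (x & y) ^ (x & z) ^ (y & z);
}

/* These three constants cover all eight input combinations per bit. */
static void check_ch_maj(void) {
    const uint32_t x = 0xF0F0F0F0u, y = 0xCCCCCCCCu, z = 0xAAAAAAAAu;
    assert(sha_ch(x, y, z) == ch_ref(x, y, z));
    assert(sha_maj(x, y, z) == maj_ref(x, y, z));
}
```

Because every operation is bitwise, checking one input that exercises all eight (x, y, z) bit combinations is enough to show the identities hold for any word.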
Can we say that CMIX and other ternaries most likely will not be ratified?
Perhaps @aswaterman knows something about it?
I also hope that @ben-marshall will help me tag the right people.
Whoa there.
I'm not familiar with the P spec, and lots of your questions relate to Bitmanip after a discussion of history in a crypto meeting. A better place for this discussion might be the isa-dev mailing list, given that it cuts across the entire architecture.
While in general I think RISC-V tries to avoid ternary ops if possible, they may be ratified in the future by some groups, be they P, a future B, or something else. In any case, the Crypto TG's decision not to include any ternary instructions in its ratification packages has no bearing on whether ternary instructions can or will ever exist in the architecture. Likewise, just because the first set of Bitmanip extensions didn't include ternary ops doesn't mean they won't one day ratify them.
About the encodings - if better encodings can be found for extensions than what's already proposed (including crypto), then that's great, but it's above my pay grade as to whether they can be changed at this late stage.
What Ben said.
Straightforward implementation of ternary ops adds substantial cost to some implementations, and so the bar is very high. My guess is they won’t be mandatory in RVA profiles for this reason, but that doesn’t mean they won’t ever be ratified. Their opcode space usage is a secondary concern (each R4 instruction takes up 1/4th of a minor opcode), but not a showstopping problem.
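To make the opcode-space remark concrete, here is a rough sketch of the standard R4-type field layout from the base ISA. The type and function names (`r4_fields`, `r4_decode`, `r4_encode`) are hypothetical. The code only splits and joins fields; it does not decide whether a given word really is a three-source instruction, since that depends on the opcode assignments under discussion here.

```c
#include <assert.h>
#include <stdint.h>

/* R4-type fields, with their bit positions in the 32-bit word. */
typedef struct {
    uint32_t opcode; /* bits  6:0  major opcode             */
    uint32_t rd;     /* bits 11:7  destination register     */
    uint32_t funct3; /* bits 14:12 minor opcode             */
    uint32_t rs1;    /* bits 19:15 first source register    */
    uint32_t rs2;    /* bits 24:20 second source register   */
    uint32_t funct2; /* bits 26:25 all that is left to pick */
    uint32_t rs3;    /* bits 31:27 third source register    */
} r4_fields;

static r4_fields r4_decode(uint32_t insn) {
    r4_fields f;
    f.opcode = insn & 0x7Fu;
    f.rd     = (insn >> 7) & 0x1Fu;
    f.funct3 = (insn >> 12) & 0x7u;
    f.rs1    = (insn >> 15) & 0x1Fu;
    f.rs2    = (insn >> 20) & 0x1Fu;
    f.funct2 = (insn >> 25) & 0x3u;
    f.rs3    = (insn >> 27) & 0x1Fu;
    return f;
}

static uint32_t r4_encode(r4_fields f) {
    return (f.rs3 << 27) | (f.funct2 << 25) | (f.rs2 << 20) |
           (f.rs1 << 15) | (f.funct3 << 12) | (f.rd << 7) | f.opcode;
}

/* rs3 consumes the five bits that R-type uses for funct7[6:2], leaving a
   2-bit funct2: only four R4 instructions fit under one
   (opcode, funct3) pair, hence "1/4th of a minor opcode" each. */
enum { R4_SLOTS_PER_MINOR_OPCODE = 1 << 2 };
```

By comparison, an R-type instruction under the same (opcode, funct3) pair has a 7-bit funct7 to choose from, which is why R4 encodings are relatively expensive.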
In the meantime, not having CMIX is OK! It only replaces 3 instructions (xor, and, xor). If doing many of these on a superscalar, the 3-instruction sequence is likely to be almost as fast, since multiple functional units will be capable of executing the simple ALU ops, whereas CMIX will likely not be available on multiple functional units because of the cost of the 3rd wakeup and regfile ports and bypass paths.
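A small sketch of that xor, and, xor sequence, written as C so the identity is easy to check. It assumes CMIX computes `(rs1 & rs2) | (rs3 & ~rs2)` as in the Bitmanip draft; the function names are hypothetical and the assembly in the comments is only indicative.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed CMIX semantics: bits of rs1 where rs2 is 1, else bits of rs3. */
static uint32_t cmix_ref(uint32_t rs1, uint32_t rs2, uint32_t rs3) {
    return (rs1 & rs2) | (rs3 & ~rs2);
}

/* The same result using only two-source ALU operations. */
static uint32_t cmix_3op(uint32_t rs1, uint32_t rs2, uint32_t rs3) {
    uint32_t t = rs1 ^ rs3; /* xor t,  rs1, rs3 : bits where the inputs differ   */
    t &= rs2;               /* and t,  t,   rs2 : keep them only where rs2 is 1  */
    return t ^ rs3;         /* xor rd, t,   rs3 : flip rs3 to rs1 at those bits  */
}

/* One input triple that covers all eight bit combinations. */
static void check_cmix_3op(void) {
    const uint32_t a = 0xF0F0F0F0u, s = 0xCCCCCCCCu, b = 0xAAAAAAAAu;
    assert(cmix_3op(a, s, b) == cmix_ref(a, s, b));
}
```

The sequence needs one temporary register and forms a dependent chain of three operations, so the cost is latency rather than throughput on a wide machine.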
CMIX is here for predicated execution without condition flags (as is the case in ARM). That matters especially because there are already plenty of 3R instructions, and a typical P-enabled core will be an in-order, single-issue pipeline.
There is one proposal related to Zbt encodings: riscv/riscv-bitmanip/issues/112
Another (temporary) solution for Zbt is to move it entirely into OP-P