riscv-crypto
Any ideas on a hardware implementation of the vgmul instruction?
As specified in the adocs, the vgmul instruction requires a very complex implementation:
    for (int bit = 0; bit < 128; bit++) {
      if bit_to_bool(Y[bit])
        Z ^= H
      bool reduce = bit_to_bool(H[127]);
      H = H << 1; // left shift H by 1
      if (reduce)
        H ^= 0x87; // Reduce using x^7 + x^2 + x^1 + 1 polynomial
    }
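For reference, here is a minimal software sketch of the same loop in C, which may be handy as a golden model when checking an RTL design. The `u128_t` struct and the `gf128_mul_bitserial` name are my own placeholders, not from the spec, and any byte or bit reordering the spec applies around the loop is deliberately left out.

```c
#include <stdint.h>

/* Hypothetical 128-bit container (not from the spec):
 * hi holds bits 127..64, lo holds bits 63..0. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128_t;

/* Bit-serial GF(2^128) multiply that mirrors the quoted loop only.
 * Returns Z = Y * H reduced by x^128 + x^7 + x^2 + x + 1. */
static u128_t gf128_mul_bitserial(u128_t y, u128_t h)
{
    u128_t z = {0, 0};

    for (int bit = 0; bit < 128; bit++) {
        /* Select Y[bit] from the correct 64-bit half. */
        uint64_t y_bit = (bit < 64) ? ((y.lo >> bit) & 1u)
                                    : ((y.hi >> (bit - 64)) & 1u);
        if (y_bit) {
            z.hi ^= h.hi;   /* Z ^= H */
            z.lo ^= h.lo;
        }

        /* Remember H[127] before it is shifted out. */
        int reduce = (int)(h.hi >> 63);

        /* H = H << 1 across the two halves. */
        h.hi = (h.hi << 1) | (h.lo >> 63);
        h.lo <<= 1;

        if (reduce)
            h.lo ^= 0x87u;  /* fold x^128 back in as x^7 + x^2 + x + 1 */
    }
    return z;
}
```

With Y = 1 the function returns H unchanged, and with Y = 2 and only H[127] set it returns 0x87, which matches a single shift-and-reduce step of the loop.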
If we pipeline this loop directly, with one iteration per stage, there would be around 128 stages, which is not efficient at all.
Any suggestions on how to implement it? Thanks.