zp7 Optimize carry bit computation for parallel prefix XOR

Open yanjiew1 opened this issue 6 months ago • 0 comments

Updated the formula for computing carry bits from (input & (result << 1)) to (input & ~result).

The new form maps onto the ANDN instruction (BMI1), so the carry computation takes a single instruction instead of a shift followed by an AND.
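
To illustrate why the two formulas agree, here is a minimal sketch. It assumes that result is the inclusive prefix XOR of input, meaning bit i of result is the XOR of input bits 0..i. Under that assumption, wherever input has a set bit, result at that position is the complement of the result bit just below it, so (result << 1) and ~result select the same bits of input. The helper names (prefix_xor, carry_old, carry_new, self_check) are hypothetical and not taken from the zp7 sources, and the shift-based prefix XOR merely stands in for whatever the library actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive prefix XOR: bit i of the return value is the XOR of
 * bits 0..i of x. Portable shift-and-XOR version, used here only
 * as a stand-in for illustration. */
static uint64_t prefix_xor(uint64_t x)
{
    x ^= x << 1;
    x ^= x << 2;
    x ^= x << 4;
    x ^= x << 8;
    x ^= x << 16;
    x ^= x << 32;
    return x;
}

/* Old formula: one shift plus one AND. */
static uint64_t carry_old(uint64_t input, uint64_t result)
{
    return input & (result << 1);
}

/* New formula: a single AND-NOT. With BMI1 enabled (for example
 * -mbmi on GCC or Clang), a compiler can emit this as one ANDN. */
static uint64_t carry_new(uint64_t input, uint64_t result)
{
    return input & ~result;
}

/* Returns 1 when both formulas give the same carry bits for input. */
static int carries_agree(uint64_t input)
{
    uint64_t result = prefix_xor(input);
    return carry_old(input, result) == carry_new(input, result);
}

/* Spot-check a few inputs, including the all-zero and all-one masks. */
static void self_check(void)
{
    assert(carries_agree(0x0ULL));
    assert(carries_agree(0xBULL));
    assert(carries_agree(0x8000000000000001ULL));
    assert(carries_agree(~0ULL));
}
```

The identity only holds when result really is the prefix XOR of the same input. For arbitrary, unrelated values the two expressions differ.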

Aug 14 '24 05:08 yanjiew1