
Optimized `Lt128` operator for RVV

Open • lsrcz opened this issue 10 months ago • 1 comment

This pull request adds an optimized implementation of the `Lt128` operator for RVV targets. The new implementation was generated by a program synthesizer.
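For reviewers who want a reference point, here is a minimal scalar sketch of what `Lt128` is expected to compute, assuming the semantics in Highway's documentation: each adjacent pair of 64-bit lanes forms one unsigned 128-bit integer (low half in the even lane), and both mask lanes of a pair get the same result. The name `Lt128Reference` and the `std::vector` interface are hypothetical and only for illustration; this is not the synthesized RVV code and not part of Highway.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of Lt128 (illustration only, not Highway code).
// Assumes lanes 2k (low 64 bits) and 2k+1 (high 64 bits) form one unsigned
// 128-bit integer, and that both mask lanes of a pair carry the same result.
std::vector<bool> Lt128Reference(const std::vector<uint64_t>& a,
                                 const std::vector<uint64_t>& b) {
  std::vector<bool> mask(a.size(), false);
  for (size_t i = 0; i + 1 < a.size() && i + 1 < b.size(); i += 2) {
    const uint64_t a_lo = a[i], a_hi = a[i + 1];
    const uint64_t b_lo = b[i], b_hi = b[i + 1];
    // The high halves decide unless they are equal; then the low halves do.
    const bool lt = (a_hi < b_hi) || (a_hi == b_hi && a_lo < b_lo);
    mask[i] = lt;
    mask[i + 1] = lt;
  }
  return mask;
}
```

A model like this could serve as an oracle when checking the vectorized output lane by lane.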

The main computations use LMUL = 1/8, which is usually more efficient than register groups (LMUL > 1) and can outperform full vector registers (LMUL = 1) on some microarchitectures.

lsrcz • Apr 11 '24 03:04

The compilation result: https://lt128.godbolt.org/z/xEK6v4f6f.

lsrcz • Apr 11 '24 03:04