selectFrom and rearrange (vectorized lookup tables) are going to be slow

Open lemire opened this issue 1 year ago • 2 comments

When parsing strings with SIMD instructions, vectorized table lookups such as vpshufb (x64) or vtbl (NEON) are critically important. They are very cheap (they often run in 1 cycle) and very powerful. And indeed, the simdjson-java library makes extensive use of rearrange and selectFrom (part of the Java Vector API). At a glance, it may appear that rearrange and selectFrom are just thin wrappers around the fast underlying instructions (e.g., vpshufb or vtbl). But they are not: they compile to a long sequence of instructions. This is unlike C#/.NET, where Ssse3.Shuffle or AdvSimd.Arm64.VectorTableLookup, for example, give you direct access to those instructions.
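To make the operation concrete, here is a minimal scalar sketch of what a single 16-byte vpshufb computes: each output byte is fetched from a 16-entry table using the low 4 bits of the corresponding index byte, or set to zero when the index byte has its high bit set. The class and method names (`ShuffleModel`, `shuffle16`, `lowNibblesAsHex`) are hypothetical and are not part of simdjson-java or the Java Vector API. This is only a model of the semantics, not how either library implements it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class ShuffleModel {

    // Scalar model of a 16-byte vpshufb:
    // out[i] = 0 if the high bit of idx[i] is set, else table[idx[i] & 0x0F].
    static byte[] shuffle16(byte[] table, byte[] idx) {
        if (table.length != 16 || idx.length != 16) {
            throw new IllegalArgumentException("both arrays must hold 16 bytes");
        }
        byte[] out = new byte[16];
        for (int i = 0; i < 16; i++) {
            out[i] = (idx[i] & 0x80) != 0 ? 0 : table[idx[i] & 0x0F];
        }
        return out;
    }

    // Example lookup: map the low nibble of each input byte to its hex digit.
    static String lowNibblesAsHex(byte[] input) {
        byte[] table = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);
        byte[] idx = new byte[16];
        for (int i = 0; i < 16; i++) {
            idx[i] = (byte) (input[i] & 0x0F);
        }
        return new String(shuffle16(table, idx), StandardCharsets.US_ASCII);
    }
}
```

On hardware, the whole loop in `shuffle16` is one instruction. The concern raised here is that the Java Vector API operations expressing this kind of lookup do not reduce to that one instruction.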

Thus, I am afraid that in its current state, the Java Vector API might be "performance challenged". It is simply too high level.

I have reported the design issue. See https://mail.openjdk.org/pipermail/panama-dev/2024-June/020476.html

lemire, Jun 18 '24 21:06

@lemire thank you for your review and for sharing this information here and on the panama mailing list. I hope the conversation is fruitful. In particular, John Rose's answer gives me hope for an API. Should we take up the matter again on the mailing list?

arouel, Oct 21 '24 19:10

@arouel I definitely think that it is worth following up... especially as new Java releases come around.

lemire, Oct 21 '24 19:10