jiter icon indicating copy to clipboard operation
jiter copied to clipboard

add simd string quote seeking

Open davidhewitt opened this issue 1 year ago • 4 comments

davidhewitt avatar Dec 05 '23 15:12 davidhewitt

Codecov Report

Merging #52 (84087c9) into main (b6a645c) will decrease coverage by 4.73%. Report is 1 commits behind head on main. The diff coverage is 62.72%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #52      +/-   ##
==========================================
- Coverage   95.12%   90.40%   -4.73%     
==========================================
  Files           8        8              
  Lines        1067     1198     +131     
==========================================
+ Hits         1015     1083      +68     
- Misses         52      115      +63     
Files Coverage Δ
src/string_decoder.rs 71.20% <62.72%> (-21.24%) :arrow_down:

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 71df64a...84087c9. Read the comment docs.

codecov[bot] avatar Dec 05 '23 20:12 codecov[bot]

CodSpeed Performance Report

Merging #52 will degrade performances by 10.04%

Comparing dh/simd-string (84087c9) with main (71df64a)

Summary

❌ 1 regressions ✅ 41 untouched benchmarks

:warning: Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark main dh/simd-string Change
string_array_jiter_iter 44.9 µs 49.9 µs -10.04%

codspeed-hq[bot] avatar Dec 05 '23 20:12 codspeed-hq[bot]

I could reproduce locally that SSE simd was slower than whatever the compiler was generating. On aarch64 the simd intrinsics are much newer and I did reproduce a speedup of some 25%; this may also be that the aarch64-apple-darwin target is not as optimised as the x86_64-linux-gnu one which makes it easier to beat the compiler.

We could look at AVX simd but that can't run under Rosetta on macOS, so I'll have to look next week. It's unclear if GitHub Actions supports AVX.

davidhewitt avatar Dec 05 '23 21:12 davidhewitt

I've implemented AVX simd for x86_64.

It's not clear-cut; for short strings the non-simd version wins. On aarch64 (neon) on my M1 the breakeven is two-character strings, for x86_64 on my desktop the breakeven is four-character strings.

The address sanitizer doesn't like the buffer overflow. This isn't a huge surprise, in practice it shouldn't be a problem but it is a potential source of danger after refactoring, so it should probably be reworked.

Overall what I've got here is a bit of a mess. Given it's not a guaranteed perf win, especially in pydantic where we might expect a lot of short strings for enum values (?), I'm not particularly in love with this branch. I'll leave it here as draft, to be decided later if we refactor or bin it.

davidhewitt avatar Dec 12 '23 09:12 davidhewitt

closing as this was superceded in aarch64 with #65 and x86 will need to start gain.

samuelcolvin avatar Aug 08 '24 16:08 samuelcolvin