jiter
jiter copied to clipboard
add simd string quote seeking
Codecov Report
Merging #52 (84087c9) into main (b6a645c) will decrease coverage by
4.73%. Report is 1 commits behind head on main. The diff coverage is62.72%.
Additional details and impacted files
@@ Coverage Diff @@
## main #52 +/- ##
==========================================
- Coverage 95.12% 90.40% -4.73%
==========================================
Files 8 8
Lines 1067 1198 +131
==========================================
+ Hits 1015 1083 +68
- Misses 52 115 +63
| Files | Coverage Δ | |
|---|---|---|
| src/string_decoder.rs | 71.20% <62.72%> (-21.24%) |
:arrow_down: |
Continue to review full report in Codecov by Sentry.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update 71df64a...84087c9. Read the comment docs.
CodSpeed Performance Report
Merging #52 will degrade performances by 10.04%
Comparing dh/simd-string (84087c9) with main (71df64a)
Summary
❌ 1 regressions
✅ 41 untouched benchmarks
:warning: Please fix the performance issues or acknowledge them on CodSpeed.
Benchmarks breakdown
| Benchmark | main |
dh/simd-string |
Change | |
|---|---|---|---|---|
| ❌ | string_array_jiter_iter |
44.9 µs | 49.9 µs | -10.04% |
I could reproduce locally that SSE simd was slower than whatever the compiler was generating. On aarch64 the simd intrinsics are much newer and I did reproduce a speedup of some 25%; this may also be that the aarch64-apple-darwin target is not as optimised as the x86_64-linux-gnu one which makes it easier to beat the compiler.
We could look at AVX simd but that can't run under Rosetta on macOS, so I'll have to look next week. It's unclear if GitHub Actions supports AVX.
I've implemented AVX simd for x86_64.
It's not clear-cut; for short strings the non-simd version wins. On aarch64 (neon) on my M1 the breakeven is two-character strings, for x86_64 on my desktop the breakeven is four-character strings.
The address sanitizer doesn't like the buffer overflow. This isn't a huge surprise, in practice it shouldn't be a problem but it is a potential source of danger after refactoring, so it should probably be reworked.
Overall what I've got here is a bit of a mess. Given it's not a guaranteed perf win, especially in pydantic where we might expect a lot of short strings for enum values (?), I'm not particularly in love with this branch. I'll leave it here as draft, to be decided later if we refactor or bin it.
closing as this was superceded in aarch64 with #65 and x86 will need to start gain.