lance
chore: simplify dot implementation to use auto-vectorization
This change makes the auto-vectorized version of dot(f32) as fast as the manually written SIMD version.
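For readers unfamiliar with the technique, here is a minimal sketch of a dot product shaped so the compiler can auto-vectorize it. It is not the code from this PR: the function name `dot_f32` and the chunk width `LANES = 16` are assumptions chosen for illustration. The idea is to use fixed-size chunks with one independent accumulator per lane, so the inner loop has a known trip count and no dependency between lanes.

```rust
/// Illustrative only: `dot_f32` and `LANES` are hypothetical names,
/// not taken from the lance code base.
const LANES: usize = 16; // assumed chunk width, not a tuned value

pub fn dot_f32(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "inputs must have the same length");

    let a_chunks = a.chunks_exact(LANES);
    let b_chunks = b.chunks_exact(LANES);

    // Elements that do not fill a whole chunk are handled with a scalar loop.
    let tail: f32 = a_chunks
        .remainder()
        .iter()
        .zip(b_chunks.remainder())
        .map(|(x, y)| x * y)
        .sum();

    // One accumulator per lane. The fixed-size inner loop is what gives
    // the compiler a chance to emit SIMD multiply-add instructions.
    let mut acc = [0.0f32; LANES];
    for (ca, cb) in a_chunks.zip(b_chunks) {
        for i in 0..LANES {
            acc[i] += ca[i] * cb[i];
        }
    }

    acc.iter().sum::<f32>() + tail
}
```

Whether this actually vectorizes depends on the target features enabled at build time, which is why the benchmark commands set `-C target-cpu=native`. Note that splitting the sum across lanes changes the order of floating-point additions, so results can differ slightly from a naive sequential loop.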
Run the benchmarks with:
export RUSTFLAGS="-C target-cpu=native"
git checkout main
cargo bench --bench dot -- --save-baseline dot_main f32
git checkout lei/simplify_dot
cargo bench --bench dot -- --baseline dot_main f32
On MacBook M2 Max
Dot(f32, auto-vectorization)
time: [88.812 ms 89.654 ms 90.306 ms]
change: [-2.5819% -1.6876% -0.6964%] (p = 0.01 < 0.10)
Change within noise threshold.
AMD 5900X
Dot(f32, auto-vectorization)
time: [172.50 ms 176.41 ms 179.41 ms]
change: [-2.3545% +0.6133% +3.5448%] (p = 0.69 > 0.10)
No change in performance detected.
Intel Sapphire Rapids
Dot(f32, auto-vectorization)
time: [331.36 ms 331.62 ms 331.93 ms]
change: [-2.3160% -1.1226% -0.3451%] (p = 0.04 < 0.10)
Change within noise threshold.
Graviton3
Benchmarking Dot(f32, auto-vectorization): Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 8.8s or enable flat sampling.
Dot(f32, auto-vectorization)
time: [160.62 ms 160.70 ms 160.76 ms]
change: [-1.1157% -0.6868% -0.2951%] (p = 0.00 < 0.10)
Change within noise threshold.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) low mild
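The verdict lines in the results above ("Change within noise threshold.", "No change in performance detected.") can be read as a two-step decision. The sketch below reflects my understanding of how Criterion reports a comparison and is not its actual source code. The type and function names are hypothetical. The significance level of 0.10 is taken from the output above; the noise threshold of 1% is an assumption (Criterion's documented default), since the output does not show it.

```rust
/// Hypothetical names for illustration; not Criterion's API.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    NoChange,    // "No change in performance detected."
    WithinNoise, // "Change within noise threshold."
    Improved,
    Regressed,
}

/// `ci_lower` and `ci_upper` are the bounds of the relative change as
/// fractions (for example, -2.5819% is -0.025819).
pub fn classify(
    p_value: f64,
    significance: f64,
    ci_lower: f64,
    ci_upper: f64,
    noise: f64,
) -> Verdict {
    // Step 1: if the difference is not statistically significant, stop.
    if p_value >= significance {
        return Verdict::NoChange;
    }
    // Step 2: the whole confidence interval must lie outside the noise band
    // before the change is reported as an improvement or a regression.
    if ci_lower < -noise && ci_upper < -noise {
        Verdict::Improved
    } else if ci_lower > noise && ci_upper > noise {
        Verdict::Regressed
    } else {
        Verdict::WithinNoise
    }
}
```

Under these assumptions, the M2 Max result (`p = 0.01`, interval `-2.5819%` to `-0.6964%`) is significant but its upper bound sits inside the 1% band, so it is classified as within noise. The AMD 5900X result (`p = 0.69`) fails the first step and is reported as no change.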
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 79.73%. Comparing base (3edfa50) to head (62228e5).
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2645      +/-   ##
==========================================
- Coverage   79.81%   79.73%   -0.08%
==========================================
  Files         224      224
  Lines       65871    65827      -44
  Branches    65871    65827      -44
==========================================
- Hits        52572    52489      -83
- Misses      10225    10256      +31
- Partials     3074     3082       +8
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 79.73% <100.00%> (-0.08%) | :arrow_down: |