swift-noise
Replacing tuples with SIMD - DON'T MERGE - seems to be ~40+% SLOWER
Since we were talking about this, I took the time to set it up - but after all the conversions, it turns out that it only HURT performance (based on the benchmark comparison).
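For anyone skimming, here is a minimal sketch of the kind of conversion being compared. It is not code from this branch: `TupleVec3`, `dotTuple`, and `dotSIMD` are hypothetical names, and the use of `Double` is an assumption. It only illustrates the same dot product written against a labeled tuple and against the standard library's `SIMD3`.

```swift
// Hypothetical illustration only; none of these names come from swift-noise.

// Tuple-based representation (the "before" style).
typealias TupleVec3 = (x: Double, y: Double, z: Double)

func dotTuple(_ a: TupleVec3, _ b: TupleVec3) -> Double {
    // Three scalar multiplies and two scalar adds.
    a.x * b.x + a.y * b.y + a.z * b.z
}

// SIMD-based representation (the "after" style).
func dotSIMD(_ a: SIMD3<Double>, _ b: SIMD3<Double>) -> Double {
    // Element-wise multiply, then a horizontal sum across lanes.
    (a * b).sum()
}

let tupleResult = dotTuple((x: 1, y: 2, z: 3), (x: 4, y: 5, z: 6))
let simdResult = dotSIMD(SIMD3(1, 2, 3), SIMD3(4, 5, 6))
print(tupleResult, simdResult) // 32.0 32.0
```

Both forms compute the same value; whether the SIMD form is faster depends on how the compiler lowers it in context, which is what the benchmarks below measure.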
Output of `swift package benchmark baseline compare bdb4ef08 --format markdown`:
Comparing results between 'bdb4ef08' and 'Current_run'
Host 'Sparrow.local' with 8 'arm64' processors with 16 GB memory, running:
Darwin Kernel Version 23.5.0: Wed May 1 20:16:51 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T8103
ExternalBenchmarks
cell2d metrics
Time (wall clock): results within specified thresholds, fold down for details.
| Time (wall clock) (ns) * | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 458 | 583 | 583 | 584 | 625 | 625 | 38292 | 1048576 |
| Current_run | 708 | 792 | 792 | 833 | 834 | 875 | 54416 | 923477 |
| Δ | 250 | 209 | 209 | 249 | 209 | 250 | 16124 | -125099 |
| Improvement % | -55 | -36 | -36 | -43 | -33 | -40 | -42 | -125099 |
Throughput (# / s): results within specified thresholds, fold down for details.
| Throughput (# / s) (K) | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 2183 | 1716 | 1716 | 1713 | 1601 | 1601 | 26 | 1048576 |
| Current_run | 1412 | 1264 | 1264 | 1201 | 1199 | 1144 | 18 | 923477 |
| Δ | -771 | -452 | -452 | -512 | -402 | -457 | -8 | -125099 |
| Improvement % | -35 | -26 | -26 | -30 | -25 | -29 | -31 | -125099 |
cell3d metrics
Time (wall clock): results within specified thresholds, fold down for details.
| Time (wall clock) (ns) * | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 458 | 542 | 583 | 583 | 584 | 625 | 32667 | 1048576 |
| Current_run | 708 | 792 | 792 | 833 | 834 | 875 | 51083 | 922510 |
| Δ | 250 | 250 | 209 | 250 | 250 | 250 | 18416 | -126066 |
| Improvement % | -55 | -46 | -36 | -43 | -43 | -40 | -56 | -126066 |
Throughput (# / s): results within specified thresholds, fold down for details.
| Throughput (# / s) (K) | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 2183 | 1845 | 1716 | 1716 | 1713 | 1601 | 31 | 1048576 |
| Current_run | 1412 | 1264 | 1264 | 1201 | 1199 | 1144 | 20 | 922510 |
| Δ | -771 | -581 | -452 | -515 | -514 | -457 | -11 | -126066 |
| Improvement % | -35 | -31 | -26 | -30 | -30 | -29 | -35 | -126066 |
cell_tiling3d metrics
Time (wall clock): results within specified thresholds, fold down for details.
| Time (wall clock) (ns) * | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 458 | 583 | 583 | 584 | 625 | 666 | 66667 | 1048576 |
| Current_run | 708 | 792 | 792 | 833 | 834 | 875 | 51625 | 916598 |
| Δ | 250 | 209 | 209 | 249 | 209 | 209 | -15042 | -131978 |
| Improvement % | -55 | -36 | -36 | -43 | -33 | -31 | 23 | -131978 |
Throughput (# / s): results within specified thresholds, fold down for details.
| Throughput (# / s) (K) | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 2183 | 1716 | 1716 | 1713 | 1601 | 1502 | 15 | 1048576 |
| Current_run | 1412 | 1264 | 1264 | 1201 | 1199 | 1144 | 19 | 916598 |
| Δ | -771 | -452 | -452 | -512 | -402 | -358 | 4 | -131978 |
| Improvement % | -35 | -26 | -26 | -30 | -25 | -24 | 27 | -131978 |
classic3d metrics
Time (wall clock): results within specified thresholds, fold down for details.
| Time (wall clock) (μs) * | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 6750 | 6875 | 6919 | 7127 | 7211 | 7543 | 72959 | 138634 |
| Current_run | 9875 | 10047 | 10087 | 10087 | 10127 | 10295 | 60750 | 95852 |
| Δ | 3125 | 3172 | 3168 | 2960 | 2916 | 2752 | -12209 | -42782 |
| Improvement % | -46 | -46 | -46 | -42 | -40 | -36 | 17 | -42782 |
Throughput (# / s): results within specified thresholds, fold down for details.
| Throughput (# / s) (K) | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 148 | 146 | 145 | 140 | 139 | 133 | 14 | 138634 |
| Current_run | 101 | 100 | 99 | 99 | 99 | 97 | 16 | 95852 |
| Δ | -47 | -46 | -46 | -41 | -40 | -36 | 2 | -42782 |
| Improvement % | -32 | -32 | -32 | -29 | -29 | -27 | 14 | -42782 |
classic_tiling3d metrics
Time (wall clock): results within specified thresholds, fold down for details.
| Time (wall clock) (ns) * | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 708 | 792 | 833 | 833 | 834 | 916 | 35958 | 1009758 |
| Current_run | 1083 | 1208 | 1208 | 1209 | 1250 | 1292 | 47791 | 669209 |
| Δ | 375 | 416 | 375 | 376 | 416 | 376 | 11833 | -340549 |
| Improvement % | -53 | -53 | -45 | -45 | -50 | -41 | -33 | -340549 |
Throughput (# / s): results within specified thresholds, fold down for details.
| Throughput (# / s) (K) | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 1412 | 1264 | 1201 | 1201 | 1199 | 1093 | 28 | 1009758 |
| Current_run | 923 | 828 | 828 | 827 | 800 | 774 | 21 | 669209 |
| Δ | -489 | -436 | -373 | -374 | -399 | -319 | -7 | -340549 |
| Improvement % | -35 | -34 | -31 | -31 | -33 | -29 | -25 | -340549 |
classic_tiling_fbm3d metrics
Time (wall clock): results within specified thresholds, fold down for details.
| Time (wall clock) (μs) * | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 6833 | 6959 | 7003 | 7003 | 7087 | 7583 | 47125 | 138651 |
| Current_run | 9958 | 10127 | 10167 | 10167 | 10215 | 10335 | 57084 | 95148 |
| Δ | 3125 | 3168 | 3164 | 3164 | 3128 | 2752 | 9959 | -43503 |
| Improvement % | -46 | -46 | -45 | -45 | -44 | -36 | -21 | -43503 |
Throughput (# / s): results within specified thresholds, fold down for details.
| Throughput (# / s) (K) | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 146 | 144 | 143 | 143 | 141 | 132 | 21 | 138651 |
| Current_run | 100 | 99 | 98 | 98 | 98 | 97 | 18 | 95148 |
| Δ | -46 | -45 | -45 | -45 | -43 | -35 | -3 | -43503 |
| Improvement % | -32 | -31 | -31 | -31 | -30 | -27 | -14 | -43503 |
disk2d metrics
Time (wall clock): results within specified thresholds, fold down for details.
| Time (wall clock) (ms) * | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 9314 | 9339 | 9347 | 9388 | 9486 | 9609 | 9630 | 107 |
| Current_run | 24508 | 24576 | 24707 | 24969 | 30228 | 43058 | 43058 | 39 |
| Δ | 15194 | 15237 | 15360 | 15581 | 20742 | 33449 | 33428 | -68 |
| Improvement % | -163 | -163 | -164 | -166 | -219 | -348 | -347 | -68 |
Throughput (# / s): results within specified thresholds, fold down for details.
| Throughput (# / s) (#) | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 107 | 107 | 107 | 107 | 105 | 104 | 104 | 107 |
| Current_run | 41 | 41 | 40 | 40 | 33 | 23 | 23 | 39 |
| Δ | -66 | -66 | -67 | -67 | -72 | -81 | -81 | -68 |
| Improvement % | -62 | -62 | -63 | -63 | -69 | -78 | -78 | -68 |
gradient2d metrics
Time (wall clock): results within specified thresholds, fold down for details.
| Time (wall clock) (μs) * | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 6750 | 6875 | 6919 | 6959 | 7003 | 7503 | 42417 | 140146 |
| Current_run | 9916 | 10047 | 10087 | 10127 | 10167 | 10295 | 96958 | 95770 |
| Δ | 3166 | 3172 | 3168 | 3168 | 3164 | 2792 | 54541 | -44376 |
| Improvement % | -47 | -46 | -46 | -46 | -45 | -37 | -129 | -44376 |
Throughput (# / s): results within specified thresholds, fold down for details.
| Throughput (# / s) (K) | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 148 | 146 | 145 | 144 | 143 | 133 | 24 | 140146 |
| Current_run | 101 | 100 | 99 | 99 | 98 | 97 | 10 | 95770 |
| Δ | -47 | -46 | -46 | -45 | -45 | -36 | -14 | -44376 |
| Improvement % | -32 | -32 | -32 | -31 | -31 | -27 | -58 | -44376 |
gradient3d metrics
Time (wall clock): results within specified thresholds, fold down for details.
| Time (wall clock) (μs) * | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 6791 | 6919 | 6919 | 6959 | 7003 | 7503 | 44709 | 139935 |
| Current_run | 9916 | 10047 | 10087 | 10127 | 10167 | 10295 | 75500 | 95903 |
| Δ | 3125 | 3128 | 3168 | 3168 | 3164 | 2792 | 30791 | -44032 |
| Improvement % | -46 | -45 | -46 | -46 | -45 | -37 | -69 | -44032 |
Throughput (# / s): results within specified thresholds, fold down for details.
| Throughput (# / s) (K) | p0 | p25 | p50 | p75 | p90 | p99 | p100 | Samples |
|---|---|---|---|---|---|---|---|---|
| bdb4ef08 | 147 | 145 | 145 | 144 | 143 | 133 | 22 | 139935 |
| Current_run | 101 | 100 | 99 | 99 | 98 | 97 | 13 | 95903 |
| Δ | -46 | -45 | -46 | -45 | -45 | -36 | -9 | -44032 |
| Improvement % | -31 | -31 | -32 | -31 | -31 | -27 | -41 | -44032 |
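A note on reading the `Improvement %` rows: the values appear consistent with a relative change against the `bdb4ef08` baseline, rounded to the nearest integer, with the sign flipped for time so that negative always means "worse". The sketch below reproduces the cell2d p50 numbers under that assumption. The function names are hypothetical and the formula is inferred from the tables, not taken from the benchmark tool's source.

```swift
// Assumed formulas, inferred from the tables above; names are hypothetical.

// "Lower is better" metrics such as wall-clock time.
func timeImprovementPercent(baseline: Double, current: Double) -> Double {
    (baseline - current) / baseline * 100
}

// "Higher is better" metrics such as throughput.
func throughputImprovementPercent(baseline: Double, current: Double) -> Double {
    (current - baseline) / baseline * 100
}

// cell2d p50 wall-clock time: 583 ns (baseline) vs 792 ns (current run).
let cell2dTime = timeImprovementPercent(baseline: 583, current: 792)

// cell2d p50 throughput: 1716 K/s (baseline) vs 1264 K/s (current run).
let cell2dThroughput = throughputImprovementPercent(baseline: 1716, current: 1264)

print(cell2dTime.rounded(), cell2dThroughput.rounded()) // -36.0 -26.0
```

The `Samples` cell in each `Improvement %` row repeats the raw Δ rather than a percentage, so it should be read as a sample-count difference.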