base64
Benchmarks
I did some automated benchmarking on my i7-10700 and on an Edison (Merrifield dual-core Silvermont Atom without cache memory, similar to Baytrail) that I want to share here. Strictly speaking, this issue is for reference only. It might be useful for finding the commits that caused substantial performance increases or decreases. All data were taken without OpenMP (1 thread only) and in x86_64 mode. On the i7 you will see some deviation, probably caused by frequency scaling / turbo boost. Don't let that disturb you. The data can be found here if you want to play with it yourself: benchmarks.ods
Below I pick out the most interesting commits.
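For anyone who wants to repeat this kind of filtering on the spreadsheet data, here is a minimal sketch of one way to do it. It assumes you have exported one column (one codec on one machine) as a list of throughput values ordered by commit. The function name `flag_interesting`, the 10% threshold, and the sample numbers are all hypothetical and not taken from benchmarks.ods. The threshold is there so that small deviations, such as the turbo boost noise on the i7, are not flagged.

```python
def flag_interesting(throughputs, threshold=0.10):
    """Return (index, relative_change) for every commit whose throughput
    differs from the previous commit by more than `threshold`.

    `throughputs` is a list of measurements ordered by commit (hypothetical
    layout). `threshold` is a fraction, so 0.10 means 10%.
    """
    flagged = []
    for i in range(1, len(throughputs)):
        prev, cur = throughputs[i - 1], throughputs[i]
        if prev <= 0:
            continue  # skip missing or failed measurements
        change = (cur - prev) / prev
        if abs(change) > threshold:
            flagged.append((i, change))
    return flagged


# Made-up MB/s values for illustration only, not real benchmark data.
sample = [100.0, 101.0, 150.0, 149.0, 120.0]
for index, change in flag_interesting(sample):
    print(f"commit index {index}: {change:+.1%}")
```

The index here is simply the position in the list, which may or may not line up with the # column in the tables below, depending on how the data are exported.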
Encoding
Note that on Edison, SSSE3 encoding took a hit with 9a0d1b2.
| # | Hash | Commit message |
|---|---|---|
| 24 | 3f3f31c | Fix build under Xcode |
| 30 | 67ee3fd | SSSE3->AVX2 encoding optimization |
| 76 | a5b6739 | SSSE3: enc: factor encoding loop into inline function |
| 79 | 99977db | Generic64: enc: factor encoding loop into inline function |
| 92 | e2c6687 | AVX2: enc: unroll inner loop |
| 93 | 9a0d1b2 | SSSE3: enc: unroll inner loop |
| 96 | bf7341f | Generic64: enc: unroll inner loop |
| 114 | b8b3c58 | Generic64: enc: use 12-bit lookup table |
Decoding
For Edison especially it has been a bumpy ride, with great improvements (3f3f31c) and regressions (0a69845) on SSSE3, and likewise for PLAIN (cfa8bf7 and f538baa).
| # | Hash | Commit message |
|---|---|---|
| 24 | 3f3f31c | Fix build under Xcode |
| 29 | cfa8bf7 | Plain decoding optimization |
| 35 | 0a69845 | SSSE3->AVX2, NEON32 decoding optimization |
| 85 | 6310c1f | SSSE3: dec: factor decoding loop into inline function |
| 88 | f538baa | Generic32: dec: factor decoding loop into inline function |
| 100 | 495414b | AVX2: dec: unroll inner loop |
| 101 | 5874921 | SSSE3: dec: unroll inner loop |