zstd
Surprising behavior on some datasets
I ran some zstd benchmarks on various datasets. At my company we use zstd for many applications that have different requirements in terms of CPU, memory, speed and ratio.
Each plot shows the results of benchmarks run with `zstd -q -b3 -e14`, `zstd -q -b3 -e14 --long=27` and `zstd -q -b3 -e14 --long=29`.
Warning: in these pictures, ratio is original/compressed, so bigger numbers mean more compression, as reported by zstd benchmark mode.
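As a quick sanity check on that definition, the ratio printed in parentheses in the raw output can be recomputed from the input size and the compressed size. This is only a minimal sketch; the function name `compression_ratio` is made up for illustration, and the two numbers are copied from the `api` level 3 run without `--long` in the raw data at the end of this report.

```python
def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Ratio as used in this report: original size / compressed size."""
    return original_bytes / compressed_bytes


# Figures copied from the `api` raw data (level 3, no --long).
ratio = compression_ratio(1264319464, 26960545)
print(f"{ratio:.3f}")  # matches the (46.895) shown in the raw output
```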
16 of the 18 datasets show the expected plots, while 2 of them show behavior that is, at least to me, surprising:
This is how zstd behaves for most of the datasets:

But here are the 2 datasets with unexpected results:


NOTE: all benchmarks were run on zstd v1.4.4 on linux/x86-64 architecture
Since then, I re-ran them multiple times in the original conditions and obtained similar results. I also reproduced that on another machine (same architecture) with version v1.4.3.
Here it is:

Is this a bug, or is it somehow expected?
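To put a number on the gap, here is a rough sketch that compares the level 14 ratios of the `api` dataset across the three modes. The ratios are copied from the raw data; the helper name `relative_change_pct` is hypothetical, and this only does arithmetic on the reported figures without saying anything about the cause.

```python
def relative_change_pct(ratio: float, baseline: float) -> float:
    """Change of `ratio` relative to `baseline`, in percent."""
    return (ratio - baseline) / baseline * 100


# Ratios copied from the `api` raw data at level 14.
ratios_level_14 = {
    "--long=no": 58.206,
    "--long=27": 48.397,
    "--long=29": 47.493,
}

baseline = ratios_level_14["--long=no"]
for mode, ratio in ratios_level_14.items():
    # Negative values mean the mode compresses worse than the baseline.
    change = relative_change_pct(ratio, baseline)
    print(f"{mode}: ratio {ratio:.3f} ({change:+.1f}% vs --long=no)")
```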
`api` raw data
title=api lines
plot=--long=no
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-3 26960545 (46.895) 1166.39 MB/s 3773.7 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-4 27029704 (46.775) 1181.65 MB/s 3779.3 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-5 25575654 (49.434) 266.62 MB/s 3863.2 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-6 24745208 (51.094) 257.14 MB/s 4023.8 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-7 23235681 (54.413) 207.01 MB/s 4490.1 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-8 22522655 (56.135) 183.19 MB/s 4691.2 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-9 22244120 (56.838) 144.06 MB/s 4572.9 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-10 22394956 (56.456) 120.05 MB/s 4506.5 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-11 22326886 (56.628) 106.40 MB/s 4421.1 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-12 22102242 (57.203) 87.16 MB/s 4421.3 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-13 21859954 (57.837) 63.23 MB/s 4389.3 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-14 21721543 (58.206) 52.89 MB/s 4411.4 MB/s api.log.multi
plot=--long=27
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-3 26120036 (48.404) 116.30 MB/s 3520.0 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-4 27565578 (45.866) 113.13 MB/s 3095.5 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-5 27691468 (45.657) 102.20 MB/s 2928.2 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-6 27258039 (46.383) 101.79 MB/s 3005.4 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-7 26411661 (47.870) 98.29 MB/s 3142.9 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-8 26115750 (48.412) 95.04 MB/s 3223.2 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-9 26134271 (48.378) 87.40 MB/s 3071.9 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-10 26186735 (48.281) 80.54 MB/s 3076.4 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-11 26195783 (48.264) 73.10 MB/s 2987.6 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-12 26036124 (48.560) 70.31 MB/s 3050.8 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-13 26150741 (48.347) 55.77 MB/s 2977.8 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-14 26123748 (48.397) 50.61 MB/s 3023.1 MB/s api.log.multi
plot=--long=29
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-3 26690138 (47.370) 118.34 MB/s 3489.1 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-4 28143404 (44.924) 114.54 MB/s 3060.1 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-5 28127385 (44.950) 105.30 MB/s 2859.2 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-6 27667942 (45.696) 104.77 MB/s 2927.1 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-7 26864003 (47.064) 101.09 MB/s 3097.9 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-8 26603721 (47.524) 98.19 MB/s 3115.4 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-9 26609787 (47.513) 89.50 MB/s 2997.4 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-10 26701938 (47.349) 82.92 MB/s 2941.6 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-11 26708182 (47.338) 76.06 MB/s 2835.2 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-12 26520902 (47.673) 72.57 MB/s 2913.3 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-13 26614552 (47.505) 55.92 MB/s 2851.1 MB/s api.log.multi
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-14 26621317 (47.493) 50.50 MB/s 2889.3 MB/s api.log.multi
`dat` raw data
title=dat lines
plot=--long=no
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-3 56281414 (50.477) 1198.28 MB/s 3818.1 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-4 56612769 (50.182) 1195.63 MB/s 3771.4 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-5 54181245 (52.434) 262.53 MB/s 4150.7 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-6 52702676 (53.905) 248.03 MB/s 4222.0 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-7 50863166 (55.855) 213.50 MB/s 4476.6 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-8 49629523 (57.243) 196.81 MB/s 4601.0 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-9 48835902 (58.173) 155.05 MB/s 4599.8 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-10 48248656 (58.881) 129.35 MB/s 4417.5 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-11 48080924 (59.087) 111.17 MB/s 4377.7 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-12 47315330 (60.043) 94.42 MB/s 4405.7 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-13 46865871 (60.619) 61.70 MB/s 4434.8 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-14 46555855 (61.022) 53.15 MB/s 4416.4 MB/s dat.log.multi
plot=--long=27
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-3 58345691 (48.692) 162.99 MB/s 3602.6 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-4 62274245 (45.620) 156.69 MB/s 3193.0 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-5 62374461 (45.547) 139.49 MB/s 3106.9 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-6 61264457 (46.372) 135.01 MB/s 3135.7 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-7 60095917 (47.273) 129.01 MB/s 3268.2 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-8 59649707 (47.627) 122.90 MB/s 3262.3 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-9 59863260 (47.457) 111.73 MB/s 3199.6 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-10 59838214 (47.477) 101.03 MB/s 3153.4 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-11 59826938 (47.486) 92.46 MB/s 3113.7 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-12 59530570 (47.722) 88.07 MB/s 3099.6 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-13 59664469 (47.615) 73.21 MB/s 3102.1 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-14 59547275 (47.709) 66.71 MB/s 3092.8 MB/s dat.log.multi
plot=--long=29
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-3 58398102 (48.648) 155.60 MB/s 3608.6 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-4 62527809 (45.435) 149.06 MB/s 3116.7 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-5 62853791 (45.199) 133.58 MB/s 2992.4 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-6 61719684 (46.030) 131.33 MB/s 3018.1 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-7 60507410 (46.952) 124.13 MB/s 3159.5 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-8 60036133 (47.321) 117.89 MB/s 3182.8 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-9 60395106 (47.039) 106.37 MB/s 3050.7 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-10 60452469 (46.995) 97.63 MB/s 2967.8 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-11 60433215 (47.010) 89.39 MB/s 2925.0 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-12 60143641 (47.236) 83.47 MB/s 2925.1 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-13 60209403 (47.184) 70.54 MB/s 2938.4 MB/s dat.log.multi
Not enough memory; testing 2709 MB only...
bench 1.4.4 : input 2840941909 bytes, 3 seconds, 0 KB blocks
-14 60045117 (47.313) 63.81 MB/s 2920.5 MB/s dat.log.multi
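For anyone who wants to re-plot or compare these numbers, the raw data above can be parsed with something like the following. This is a sketch under a few assumptions: the `title=` and `plot=` lines are treated as labels for the rows that follow, the `bench ...` and `Not enough memory` lines are ignored, and the names `RESULT_RE` and `parse_bench` are made up for this example. The `sample` text is a small excerpt of the `api` data.

```python
import re
from collections import defaultdict

# Matches result lines such as:
#   -3 26960545 (46.895) 1166.39 MB/s 3773.7 MB/s api.log.multi
RESULT_RE = re.compile(
    r"^-(?P<level>\d+)\s+(?P<size>\d+)\s+\((?P<ratio>[\d.]+)\)\s+"
    r"(?P<comp>[\d.]+) MB/s\s+(?P<decomp>[\d.]+) MB/s\s+(?P<name>\S+)$"
)


def parse_bench(text):
    """Group result rows by the most recent `plot=` label."""
    results = defaultdict(list)
    plot = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("plot="):
            plot = line[len("plot="):]
            continue
        match = RESULT_RE.match(line)
        if match and plot is not None:
            results[plot].append({
                "level": int(match["level"]),
                "compressed_bytes": int(match["size"]),
                "ratio": float(match["ratio"]),
                "compress_mbps": float(match["comp"]),
                "decompress_mbps": float(match["decomp"]),
            })
    return dict(results)


sample = """\
title=api lines
plot=--long=no
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-3 26960545 (46.895) 1166.39 MB/s 3773.7 MB/s api.log.multi
plot=--long=27
bench 1.4.4 : input 1264319464 bytes, 3 seconds, 0 KB blocks
-3 26120036 (48.404) 116.30 MB/s 3520.0 MB/s api.log.multi
"""

for plot, rows in parse_bench(sample).items():
    for row in rows:
        print(plot, row["level"], row["ratio"], row["compress_mbps"])
```

Feeding the full `api` or `dat` section through `parse_bench` gives one list of rows per mode, which should be enough to rebuild the ratio and speed curves.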