zstd

Dictionary trained on unstructured log data does not speed up compression?

Open qshuai opened this issue 2 years ago • 5 comments

zstd version: *** Zstandard CLI (32-bit) v1.5.4, by Yann Collet ***

My steps:

  1. Train a dictionary from the original log files (45 files, 20 MB each)
$ zstd -B4096 --train -r train
Trying 5 different sets of parameters
k=1024
d=8
f=20
steps=4
split=75
accel=1
Save dictionary of size 112640 into file dictionary
  2. Compress the original log files (the same ones used for training) with the dictionary, then again without it as a control group
time /tmp/zstd -k -1 -f -D dictionary -r train
 45 files compressed : 9.28% (   897 MiB =>   83.3 MiB)                        ==>  9%
real	1m 12.22s
user	0m 30.25s
sys	0m 6.19s

time /tmp/zstd -k -1 -f -r train
 45 files compressed : 9.27% (   897 MiB =>   83.2 MiB)                        ==>  9%
real	1m 5.28s
user	0m 29.67s
sys	0m 6.18s

The trained dictionary has no measurable effect on compression ratio or compression speed. Thanks for your help!
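To put numbers on "no effect", here is a quick sketch in Python that uses only the figures reported in the two runs above (sizes in MiB and wall-clock `real` times converted to seconds). The names `runs` and `summarize` are just for illustration. Because the inputs are the rounded MiB values, the percentages can differ from zstd's own output in the last digit.

```python
# Figures copied from the two zstd runs above (rounded as printed).
ORIGINAL_MIB = 897.0

runs = {
    "with dictionary": {"compressed_mib": 83.3, "real_s": 72.22},     # 1m 12.22s
    "without dictionary": {"compressed_mib": 83.2, "real_s": 65.28},  # 1m 5.28s
}


def summarize(run):
    """Return (compressed size as % of original, wall-clock throughput in MiB/s)."""
    ratio_pct = run["compressed_mib"] / ORIGINAL_MIB * 100
    mib_per_s = ORIGINAL_MIB / run["real_s"]
    return ratio_pct, mib_per_s


for name, run in runs.items():
    ratio_pct, mib_per_s = summarize(run)
    print(f"{name}: {ratio_pct:.2f}% of original, {mib_per_s:.1f} MiB/s")
```

The two ratios come out about 0.01 percentage points apart, and the run with the dictionary is slightly slower in wall-clock time, not faster.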

In addition:

  1. I trained the dictionary on my MacBook Pro M1 (16 GB memory)
  2. The compression tests ran on a low-powered Linux machine (armv7, 32-bit, 750 MB memory)
  3. The log content looks like this (see the following screenshot): image

qshuai Apr 08 '23 08:04