
Compress Scylla coredumps with something faster than gzip

Open michoecho opened this issue 11 months ago • 9 comments

Please compress the core files with zstd (pzstd) instead of gzip. They should have similar compression ratios, but zstd compresses and decompresses several times faster than zlib.

michoecho, Feb 28 '24 10:02

Also see https://github.com/scylladb/qa-tasks/issues/1372

mykaul, Feb 29 '24 08:02

And perhaps more importantly - https://github.com/scylladb/scylla-machine-image/issues/462

mykaul, Feb 29 '24 08:02

We're probably going to defer to SMI to implement it, and then have SCT identify the compressed core and send it as is.

fruch, Mar 04 '24 15:03

> We're probably going to defer to SMI to implement it, and then have SCT identify the compressed core and send it as is.

We don't always use SMI, so it's possibly worth fixing in SCT anyway.

soyacz, Mar 04 '24 17:03

Example:

ykaul@ykaul:~/Downloads$ du -ch core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.gz 
443M	core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.gz
443M	total
ykaul@ykaul:~/Downloads$ pigz -d core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.gz 
ykaul@ykaul:~/Downloads$ du -ch core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 
58G	core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000
58G	total
ykaul@ykaul:~/Downloads$ zstd core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 
core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 :  0.25%   (  57.4 GiB =>    145 MiB, core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst) 
ykaul@ykaul:~/Downloads$ du -ch core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst 
145M	core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst
145M	total

It's not only faster, but requires substantially less storage.

mykaul, May 05 '24 10:05

@mykaul you didn't show the time measurement. Note that I think with zstd you need to pass an option (e.g., -T0) to make it use all the cores; otherwise it uses just one (but please correct me if I'm wrong).

But the size saving is indeed impressive. My guess is that it is related to the amazing and possibly uncommon compression ratio achieved in this case (400x!). I guess the ANS entropy coding beats the pants off the old Huffman coding used by gzip in this case. I'm not sure that in every case the compression of our core files will be this impressive, or show such a dramatic improvement of zstd over gzip.
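For reference, the ratios implied by the sizes reported above can be worked out directly. The following is only an arithmetic sketch in Python; the function name is illustrative, and the inputs are the sizes printed by `du` and `zstd` in the example (57.4 GiB uncompressed, 443 MiB for the `.gz`, 145 MiB for the `.zst`).

```python
MIB_PER_GIB = 1024


def compression_ratio(original_mib: float, compressed_mib: float) -> float:
    """Return how many times smaller the compressed file is than the original."""
    return original_mib / compressed_mib


# Sizes copied from the terminal output in this thread.
core_mib = 57.4 * MIB_PER_GIB  # uncompressed core, as reported by zstd
gzip_mib = 443                 # .gz size reported by du
zstd_mib = 145                 # .zst size reported by du (default level)

gzip_ratio = compression_ratio(core_mib, gzip_mib)
zstd_ratio = compression_ratio(core_mib, zstd_mib)

print(f"gzip: about {gzip_ratio:.0f}x")
print(f"zstd: about {zstd_ratio:.0f}x")
print(f"the .zst is about {gzip_mib / zstd_mib:.1f}x smaller than the .gz")
```

With these inputs zstd lands a little above 400x and gzip a little above 130x, for this one core file only.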

nyh, May 05 '24 11:05

Right - I did not really look at compress/decompress time, for a few reasons:

  1. I did use pigz to uncompress, could probably have used it for compression for a fair comparison between them.
  2. I don't care THAT much about compression time. I care more about how much time I save on transfer from the node and, more importantly, to the developer's laptop!
  3. The saving on Google Storage is also more important than time to compress.

Time to compress is indeed relevant if we care how long Scylla is down. I assume (from past experience) that zstd is at least as fast as gzip. Since it writes hundreds of MBs less to the disk, I assume it is faster overall anyway.

mykaul, May 05 '24 11:05

Odd, but OK:

ykaul@ykaul:~/Downloads$ time pigz -d core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.gz 

real	0m50.731s
user	0m17.107s
sys	0m38.388s
ykaul@ykaul:~/Downloads$ time zstd core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 
core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 :  0.25%   (  57.4 GiB =>    145 MiB, core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst) 

real	0m33.472s
user	0m26.812s
sys	0m16.672s
ykaul@ykaul:~/Downloads$ time zstd -T0 core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 
zstd: core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst already exists; overwrite (y/n) ? y
core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 :  0.25%   (  57.4 GiB =>    145 MiB, core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst) 

real	0m35.666s
user	0m28.647s
sys	0m16.491s

mykaul, May 05 '24 11:05

And now, just for completeness, two more runs and then I'm done:

ykaul@ykaul:~/Downloads$ time zstd --long core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 
zstd: core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst already exists; overwrite (y/n) ? y
core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 :  0.22%   (  57.4 GiB =>    129 MiB, core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst) 

real	1m24.424s
user	1m23.063s
sys	0m16.685s
ykaul@ykaul:~/Downloads$ time zstd -5  core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 
zstd: core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst already exists; overwrite (y/n) ? y
core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000 :  0.23%   (  57.4 GiB =>    138 MiB, core.scylla.112.2c636fd0f4064338ad074132ad90cdc3.6095.1714698097000000.zst) 

real	0m35.549s
user	0m30.594s
sys	0m15.575s
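To put the timed runs side by side, here is a small Python sketch that converts the `real` values into seconds and a rough throughput figure. The helper names are illustrative; the numbers are copied from the runs in this thread, and throughput is computed against the 57.4 GiB uncompressed size.

```python
import re


def parse_time(value: str) -> float:
    """Convert a shell `time` value such as '1m24.424s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized time format: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def throughput_gib_per_s(size_gib: float, real: str) -> float:
    """Uncompressed GiB processed per second of wall-clock time."""
    return size_gib / parse_time(real)


CORE_GIB = 57.4  # uncompressed size reported by zstd

# (command, wall-clock "real" time) pairs copied from the runs in this thread.
runs = [
    ("pigz -d", "0m50.731s"),
    ("zstd", "0m33.472s"),
    ("zstd -T0", "0m35.666s"),
    ("zstd --long", "1m24.424s"),
    ("zstd -5", "0m35.549s"),
]

for command, real in runs:
    seconds = parse_time(real)
    rate = throughput_gib_per_s(CORE_GIB, real)
    print(f"{command:12s} {seconds:7.3f} s  {rate:.2f} GiB/s")
```

These are single measurements on one machine, and the pigz run is a decompression while the zstd runs are compressions, so the figures are indicative rather than a like-for-like benchmark.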

mykaul, May 05 '24 11:05

> I did use pigz to uncompress, could probably have used it for compression for a fair comparison between them.

Tangential note: gzip decompression can't be parallelized, the file format doesn't allow it. Every piece of output depends on all pieces of output before it. (So using pigz for decompression doesn't really speed it up).

zstd by default also produces a file with inter-block dependencies, which can't be decompressed in parallel.

But pzstd produces an output file which is split into independent blocks, and can be decompressed in parallel. That's why I mentioned it in the OP.
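The idea of independent blocks can be illustrated with a short Python sketch. It uses the standard library's `zlib` as a stand-in, since the point is the layout and not the codec; the block size and function names are arbitrary choices for illustration, and this is not pzstd's actual on-disk format.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List


def compress_independent(data: bytes, block_size: int) -> List[bytes]:
    """Compress each fixed-size block on its own, with no shared history."""
    return [
        zlib.compress(data[offset:offset + block_size])
        for offset in range(0, len(data), block_size)
    ]


def decompress_parallel(frames: List[bytes], workers: int = 4) -> bytes:
    """Decompress frames concurrently; map() preserves the original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, frames))


data = bytes(range(256)) * 1000            # 256,000 bytes of sample input
frames = compress_independent(data, 64 * 1024)
restored = decompress_parallel(frames)
print(len(frames), restored == data)
```

Because no frame refers to data from another frame, any single frame can be decompressed without touching the rest, which is what makes the parallel step possible. The cost is that matches cannot cross block boundaries, so the ratio may be somewhat worse than with a single dependent stream.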

The kernel allows you to pass the core to an arbitrary compression command, so you could use pzstd for this. But systemd-coredump only has a fixed set of compression options. So I guess if we are stuck with systemd-coredump, then we can't make use of pzstd (or even zstd -T0)...
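As a rough sketch of the kernel route: when `/proc/sys/kernel/core_pattern` starts with `|`, the kernel runs the named program and writes the core to its standard input. A minimal handler in Python might look like the following. The handler path, the destination directory and the output file naming are all hypothetical; `-T0`, `-q` and `-o` are standard zstd CLI options, and `%e`, `%p` and `%t` are the kernel's specifiers for executable name, PID and dump time.

```python
import os
import subprocess
import sys
from typing import BinaryIO, List

# Hypothetical core_pattern entry that would invoke this script:
#   |/usr/local/bin/core-to-zstd.py %e %p %t
CORE_DIR = "/var/lib/scylla/coredump"  # hypothetical destination directory


def core_output_path(directory: str, executable: str, pid: str, timestamp: str) -> str:
    """Build the output file name from the values the kernel passes in."""
    return os.path.join(directory, f"core.{executable}.{pid}.{timestamp}.zst")


def zstd_command(output_path: str) -> List[str]:
    """zstd reads stdin when no input file is given; -T0 uses all cores."""
    return ["zstd", "-T0", "-q", "-o", output_path]


def compress_core(stdin: BinaryIO, output_path: str) -> int:
    """Stream the core from stdin through zstd and return zstd's exit code."""
    return subprocess.run(zstd_command(output_path), stdin=stdin, check=False).returncode


def main(argv: List[str]) -> int:
    """Entry point for the handler: argv carries %e, %p and %t from the kernel."""
    executable, pid, timestamp = argv[1:4]
    output_path = core_output_path(CORE_DIR, executable, pid, timestamp)
    return compress_core(sys.stdin.buffer, output_path)
```

A real handler would call `sys.exit(main(sys.argv))` at the bottom of the script and would probably use an absolute path to the `zstd` binary, since the handler runs with a minimal environment. Swapping in pzstd would only mean changing the command returned by `zstd_command`.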

> Time to compress is indeed relevant if we care how long Scylla is down. I assume (from past experience) that zstd is at least as fast as gzip. Since it writes hundreds of MBs less to the disk, I assume it is faster overall anyway.

The consensus around the internet seems to be that zstd handily beats gzip in all performance aspects, and I see no reason to doubt it.

michoecho, May 06 '24 11:05

This is going to be handled by https://github.com/scylladb/scylla-machine-image/issues/462

We will have some work to do to adapt, so that we don't compress the core again, but that's it; the rest will be done out of the box.
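One way the "don't compress it again" check could be done is by looking at the file's leading magic bytes instead of trusting its extension. The sketch below is a hypothetical helper, not existing SCT code; the magic numbers are the published ones for the zstd, gzip, xz and LZ4 frame formats.

```python
from typing import Optional

# Leading bytes that identify common compressed formats.
MAGIC_NUMBERS = (
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"\x1f\x8b", "gzip"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\x04\x22\x4d\x18", "lz4"),
)


def sniff_compression(header: bytes) -> Optional[str]:
    """Return the compression format implied by the header, or None."""
    for magic, name in MAGIC_NUMBERS:
        if header.startswith(magic):
            return name
    return None


def is_already_compressed(path: str) -> bool:
    """True if the file at `path` starts with a known compression magic number."""
    with open(path, "rb") as core_file:
        return sniff_compression(core_file.read(8)) is not None


# An uncompressed core is an ELF file, so it starts with b"\x7fELF".
print(sniff_compression(b"\x7fELF\x02\x01\x01\x00"))
print(sniff_compression(b"\x28\xb5\x2f\xfd\x04\x58"))
```

A caller could then upload the file as is when `is_already_compressed` returns true, and fall back to compressing it otherwise.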

fruch, May 27 '24 22:05