zstd-rs

zstd-rs is slower than `tar -xf` for `.tar.zst`

Open konstin opened this issue 1 year ago • 2 comments

We found that zstd-rs is slower than tar -xf when unpacking .tar.zst files. This is in contrast to unpacking .tar.gz files with flate2 and zlib-ng, where Rust is faster than tar -xf.

Benchmarks

Rust vs. tar -xf for zstd:

Benchmark 1: target/release/scratch-rust python.tar.zst
  Time (mean ± σ):     130.0 ms ±   2.5 ms    [User: 59.9 ms, System: 69.9 ms]
  Range (min … max):   125.7 ms … 135.5 ms    19 runs
 
Benchmark 2: tar -C unpacked -xf python.tar.zst
  Time (mean ± σ):      88.0 ms ±   3.9 ms    [User: 64.4 ms, System: 80.6 ms]
  Range (min … max):    82.6 ms …  97.5 ms    26 runs
 
Summary
  tar -C unpacked -xf python.tar.zst ran
    1.48 ± 0.07 times faster than target/release/scratch-rust python.tar.zst

Rust vs. tar -xf for gz:

Benchmark 1: target/release/scratch-rust cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz
  Time (mean ± σ):     193.7 ms ±   2.9 ms    [User: 124.3 ms, System: 68.5 ms]
  Range (min … max):   190.8 ms … 201.6 ms    13 runs
 
Benchmark 2: tar -C unpacked -xf cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz
  Time (mean ± σ):     303.3 ms ±   1.0 ms    [User: 283.9 ms, System: 70.7 ms]
  Range (min … max):   301.9 ms … 304.7 ms    10 runs
 
Summary
  target/release/scratch-rust cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz ran
    1.57 ± 0.02 times faster than tar -C unpacked -xf cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz

MRE

[package]
name = "scratch-rust"
version = "0.1.0"
edition = "2024"

[dependencies]
flate2 = { version = "1.1.0", features = ["zlib-ng"] }
tar = "0.4.44"
zstd = "0.13.3"

use flate2::read::GzDecoder;
use std::fs::File;
use std::io::BufReader;
use std::path::PathBuf;
use std::{env, fs};

fn main() {
    // The archive to unpack is the first command-line argument.
    let file = PathBuf::from(env::args().nth(1).unwrap());
    // Pick the decompressor from the file extension.
    match file.extension().unwrap().to_str().unwrap() {
        "gz" => {
            let reader = GzDecoder::new(BufReader::new(File::open(file).unwrap()));
            fs::create_dir_all("unpacked").unwrap();
            tar::Archive::new(reader).unpack("unpacked").unwrap();
        }
        "zst" => {
            // `with_buffer` reuses the BufReader instead of adding a second buffer.
            let reader =
                zstd::Decoder::with_buffer(BufReader::new(File::open(file).unwrap())).unwrap();
            fs::create_dir_all("unpacked").unwrap();
            tar::Archive::new(reader).unpack("unpacked").unwrap();
        }
        unknown => panic!("Unknown file type: {}", unknown),
    }
}

Setup:

wget "https://github.com/astral-sh/python-build-standalone/releases/download/20250311/cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz"
zstd -c -d < "cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz" > "cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.zst"
rm -rf unpacked && mkdir unpacked && tar -C unpacked -xf cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz
tar --zstd -cf python.tar.zst -C unpacked .

We're repacking the archive instead of converting directly due to https://github.com/gyscos/zstd-rs/pull/251#issuecomment-2724534410

Benchmark:

cargo build --release

hyperfine --warmup 2 --prepare "rm -rf unpacked && mkdir unpacked" \
  "target/release/scratch-rust cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz" \
  "tar -C unpacked -xf cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz"

hyperfine --warmup 2 --prepare "rm -rf unpacked && mkdir unpacked" \
  "target/release/scratch-rust python.tar.zst" \
  "tar -C unpacked -xf python.tar.zst"

konstin, Mar 14 '25 16:03

Hi, and thanks for the report!

In general, there may be overhead in the Rust wrapper. Rust's Read/Write abstractions may involve some extra memory copies, depending on the use case. I'm open to optimizations that reduce these.
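One way to look for that kind of overhead is to count how often the consumer calls `read` and how many bytes each layer moves. The following is a minimal, std-only sketch. `CountingReader` and `drain` are hypothetical helpers, not part of zstd-rs or the tar crate, and the 512-byte chunk size is only the tar block size used as an illustrative assumption, not a claim about how `tar::Archive` actually reads.

```rust
use std::io::{self, BufReader, Read};

/// Wraps any reader and records how many `read` calls it receives
/// and how many bytes it returns in total.
struct CountingReader<R> {
    inner: R,
    calls: usize,
    bytes: usize,
}

impl<R: Read> CountingReader<R> {
    fn new(inner: R) -> Self {
        Self { inner, calls: 0, bytes: 0 }
    }
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        self.calls += 1;
        self.bytes += n;
        Ok(n)
    }
}

/// Drains `reader` with `chunk`-sized reads (a stand-in for an archive
/// consumer) and returns the total number of bytes seen.
fn drain<R: Read>(mut reader: R, chunk: usize) -> io::Result<usize> {
    let mut buf = vec![0u8; chunk];
    let mut total = 0;
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            return Ok(total);
        }
        total += n;
    }
}

fn main() -> io::Result<()> {
    let data = vec![7u8; 64 * 1024];

    // The consumer reads small chunks straight from the source.
    let mut direct = CountingReader::new(data.as_slice());
    drain(&mut direct, 512)?;

    // Same consumer, with a BufReader between it and the source.
    let mut buffered = CountingReader::new(data.as_slice());
    drain(BufReader::with_capacity(8 * 1024, &mut buffered), 512)?;

    println!("direct:   {} calls, {} bytes", direct.calls, direct.bytes);
    println!("buffered: {} calls, {} bytes", buffered.calls, buffered.bytes);
    Ok(())
}
```

In the MRE, the same wrapper could be placed around the `File` or around the `zstd::Decoder` to compare call counts and sizes at each layer. This only counts calls; it does not measure the copies themselves.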

It's also possible that tar passes more specific parameters to the zstd stream. In particular, opening a file opens up some possibilities that are not currently trivial with the Rust wrapper. For example, it could decompress the entire archive in memory into a single buffer, so that ZSTD_d_stableOutBuffer can be used, reducing the number of allocations and copies. I'm not sure what else could be done on the decompression side (the compression side has more knobs to turn).
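The single-buffer idea can be approximated from the caller's side with a rough, std-only sketch. `decode_to_memory` and its `size_hint` parameter are hypothetical names introduced for illustration. The sketch does not enable ZSTD_d_stableOutBuffer; it only separates decompression from unpacking so that the archive reader works on one contiguous buffer. The zstd crate's `decode_all` helper does something similar without the size hint.

```rust
use std::io::{self, Cursor, Read};

/// Reads `decoder` to the end into a single pre-sized buffer.
/// `size_hint` is a caller-supplied guess at the decompressed size
/// (hypothetical parameter); a low guess only costs reallocations.
fn decode_to_memory<R: Read>(mut decoder: R, size_hint: usize) -> io::Result<Cursor<Vec<u8>>> {
    let mut out = Vec::with_capacity(size_hint);
    decoder.read_to_end(&mut out)?;
    Ok(Cursor::new(out))
}

fn main() -> io::Result<()> {
    // Stand-in for a decompressing reader. In the MRE this would be
    // `zstd::Decoder::with_buffer(BufReader::new(File::open(file)?))?`.
    let fake_decoder: &[u8] = b"pretend this is a decompressed tar stream";

    let archive_bytes = decode_to_memory(fake_decoder, 64)?;
    println!("buffered {} bytes in memory", archive_bytes.get_ref().len());

    // In the MRE, the cursor would then replace the streaming reader:
    // `tar::Archive::new(archive_bytes).unpack("unpacked")?;`
    Ok(())
}
```

Whether this helps is an open question: it trades peak memory (the whole decompressed archive) for fewer interleaved reads, so it would need to be benchmarked with the same hyperfine setup.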

gyscos, Mar 20 '25 15:03

Do you happen to have recommendations on what to try tweaking, or where in the code to look? Nothing stood out to me in a samply profile.

konstin, Jun 10 '25 16:06