zstd-rs is slower than `tar -xf` for `.tar.zst`
We found that zstd-rs is slower than `tar -xf` when unpacking `.tar.zst` files. This contrasts with unpacking `.tar.gz` files using flate2 with zlib-ng, where Rust is faster than `tar -xf`.
Benchmarks
Rust vs. tar -xf for zstd:
Benchmark 1: target/release/scratch-rust python.tar.zst
Time (mean ± σ): 130.0 ms ± 2.5 ms [User: 59.9 ms, System: 69.9 ms]
Range (min … max): 125.7 ms … 135.5 ms 19 runs
Benchmark 2: tar -C unpacked -xf python.tar.zst
Time (mean ± σ): 88.0 ms ± 3.9 ms [User: 64.4 ms, System: 80.6 ms]
Range (min … max): 82.6 ms … 97.5 ms 26 runs
Summary
tar -C unpacked -xf python.tar.zst ran
1.48 ± 0.07 times faster than target/release/scratch-rust python.tar.zst
Rust vs. tar -xf for gz:
Benchmark 1: target/release/scratch-rust cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz
Time (mean ± σ): 193.7 ms ± 2.9 ms [User: 124.3 ms, System: 68.5 ms]
Range (min … max): 190.8 ms … 201.6 ms 13 runs
Benchmark 2: tar -C unpacked -xf cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz
Time (mean ± σ): 303.3 ms ± 1.0 ms [User: 283.9 ms, System: 70.7 ms]
Range (min … max): 301.9 ms … 304.7 ms 10 runs
Summary
target/release/scratch-rust cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz ran
1.57 ± 0.02 times faster than tar -C unpacked -xf cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz
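As a sanity check on the two summaries, the ratios and their uncertainties can be reproduced from the reported means and standard deviations. This is a minimal sketch that assumes the two measurements are independent and uses standard first-order error propagation; `relative_speed` is a hypothetical helper name, not part of hyperfine.

```rust
/// Ratio of two mean runtimes with a first-order propagated uncertainty.
/// Assumes the two measurements are independent.
fn relative_speed(slow_mean: f64, slow_sd: f64, fast_mean: f64, fast_sd: f64) -> (f64, f64) {
    let ratio = slow_mean / fast_mean;
    // Relative errors add in quadrature for a quotient.
    let rel_err = ((slow_sd / slow_mean).powi(2) + (fast_sd / fast_mean).powi(2)).sqrt();
    (ratio, ratio * rel_err)
}
```

Calling `relative_speed(130.0, 2.5, 88.0, 3.9)` gives roughly 1.48 ± 0.07 for zstd, and `relative_speed(303.3, 1.0, 193.7, 2.9)` gives roughly 1.57 ± 0.02 for gz, matching the summaries above.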
MRE
[package]
name = "scratch-rust"
version = "0.1.0"
edition = "2024"
[dependencies]
flate2 = { version = "1.1.0", features = ["zlib-ng"] }
tar = "0.4.44"
zstd = "0.13.3"
use flate2::read::GzDecoder;
use std::fs::File;
use std::io::BufReader;
use std::path::PathBuf;
use std::{env, fs};
fn main() {
    let file = PathBuf::from(env::args().nth(1).unwrap());
    match file.extension().unwrap().to_str().unwrap() {
        "gz" => {
            let reader = GzDecoder::new(BufReader::new(File::open(file).unwrap()));
            fs::create_dir_all("unpacked").unwrap();
            tar::Archive::new(reader).unpack("unpacked").unwrap();
        }
        "zst" => {
            let reader =
                zstd::Decoder::with_buffer(BufReader::new(File::open(file).unwrap())).unwrap();
            fs::create_dir_all("unpacked").unwrap();
            tar::Archive::new(reader).unpack("unpacked").unwrap();
        }
        unknown => panic!("Unknown file type: {}", unknown),
    }
}
Setup:
wget "https://github.com/astral-sh/python-build-standalone/releases/download/20250311/cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz"
zstd -c -d < "cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz" > "cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.zst"
rm -rf unpacked && mkdir unpacked && tar -C unpacked -xf cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz
tar --zstd -cf python.tar.zst -C unpacked .
We're repacking the archive instead of converting directly due to https://github.com/gyscos/zstd-rs/pull/251#issuecomment-2724534410
Benchmark:
cargo build --release
hyperfine --warmup 2 --prepare "rm -rf unpacked && mkdir unpacked" \
"target/release/scratch-rust cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz" \
"tar -C unpacked -xf cpython-3.12.9+20250311-aarch64-unknown-linux-gnu-install_only.tar.gz"
hyperfine --warmup 2 --prepare "rm -rf unpacked && mkdir unpacked" \
"target/release/scratch-rust python.tar.zst" \
"tar -C unpacked -xf python.tar.zst"
Hi, and thanks for the report!
In general, the Rust wrapper may add some overhead. Rust's `Read`/`Write` abstractions can involve extra memory copies depending on the use case. I'm open to optimizations that reduce these.
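One cheap way to check whether the `Read` layering matters is to count how many reads actually reach the underlying source for different buffer capacities. The following is a std-only sketch; `CountingReader` and `count_source_reads` are hypothetical helper names, not part of zstd-rs or tar, and the 512-byte chunk is only an illustrative small read (it happens to be the tar header block size).

```rust
use std::io::{self, BufReader, Read};

/// Wraps a reader and counts how many times `read` is called on it.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Drains `source` through a `BufReader` of the given capacity using small
/// reads, and returns how many `read` calls reached the underlying source.
fn count_source_reads<R: Read>(source: R, capacity: usize) -> io::Result<usize> {
    let counting = CountingReader { inner: source, calls: 0 };
    let mut buffered = BufReader::with_capacity(capacity, counting);
    let mut chunk = [0u8; 512]; // illustrative small read size
    while buffered.read(&mut chunk)? != 0 {}
    Ok(buffered.into_inner().calls)
}
```

Wrapping the `File` in the MRE the same way and comparing counts for, say, 8 KiB and 1 MiB capacities would show whether the buffer size is a plausible suspect before reaching for a profiler. It does not measure copies inside the decoder itself.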
It's also possible that `tar` passes more specific parameters to the zstd stream. In particular, working on an opened file directly enables some things that are not currently trivial with the Rust wrapper. For example, it could decompress the entire archive in memory into a single buffer so that `ZSTD_d_stableOutBuffer` can be used, reducing the number of allocations and copies. I'm not sure what else could be done on the decompression side (the compression side has more knobs to turn).
Do you happen to have recommendations on what to try tweaking, or where in the code to look? Nothing stood out to me in a samply profile.