gzp

Can the performance be improved?

Open zundaren opened this issue 9 months ago • 5 comments

I am learning to work with gzip in Rust. I found that the Rust tar and gzip crates were very slow, which led me to this library, but the performance of the current multithreaded version still feels a little weak. Before this I used the Go library github.com/klauspost/pgzip, which is very fast. Packaging the same 100M folder into a tar.gz takes pgzip only 300ms, but gzp takes more than 1.2s. I would like to know why there is such a big gap between the two languages. Could an expert help me answer this?

zundaren avatar Apr 04 '25 13:04 zundaren

Hello! Thanks for making an issue, I very much consider performance regressions a bug.

Could you please provide your Rust code that is slow?

gzp should be as fast as or faster than any other parallel compression / decompression library out there. It's used in crabz, whose compression speed is comparable to or faster than that of pigz, which is written in C.

Things you should check:

  • Make sure the same compression level is used in both implementations.
  • Make sure you're using buffered I/O.
  • Make sure the same number of threads is available to each.
  • Make sure you're compiling in release mode.
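
As a rough illustration of the last two checks, the sketch below prints the build profile and the number of threads the OS reports before any timing is taken. It uses only the standard library. The helper names `build_profile` and `available_threads` are hypothetical, and the profile check assumes Cargo's default profiles, where debug assertions are on for dev and test builds and off for release builds.

```rust
use std::thread;

/// Reports "debug" or "release" based on whether debug assertions are compiled in.
/// Assumption: default Cargo profiles (a custom profile can change this setting).
fn build_profile() -> &'static str {
    if cfg!(debug_assertions) {
        "debug"
    } else {
        "release"
    }
}

/// Number of threads the OS reports as available, falling back to 1 if unknown.
fn available_threads() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

fn main() {
    println!("build profile: {}", build_profile());
    println!("available threads: {}", available_threads());
    if build_profile() == "debug" {
        eprintln!("warning: timings from a debug build are not representative");
    }
}
```

Printing these two values next to each measurement makes it easier to confirm that both implementations were compared under the same conditions.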

sstadick avatar Apr 04 '25 14:04 sstadick


extern crate tar;

use std::fs::File;
use std::io::BufWriter;
use std::time::Instant;
use gzp::Compression;
use gzp::deflate::Mgzip;
use gzp::par::compress::{ParCompress, ParCompressBuilder};
use tar::Builder;

#[test]
pub fn main() {
    let start = Instant::now();

    let file = File::create(r"E:\download\test\x.tar.gz").unwrap();
    // let buf_writer = BufWriter::new(file);

    let mut gzp: ParCompress<Mgzip> = ParCompressBuilder::new()
        .compression_level(Compression::new(6))
        .from_writer(file);
        // .from_writer(buf_writer);


    let mut a = Builder::new(gzp);
    a.append_dir_all("tabby", r"E:\download\tabby-1.0.212-portable-x64").unwrap();

    println!("{:?}", start.elapsed());


    // Results for a 385M input:
    //   889.1867ms  with gzp = "1.0.1"
    //   3.9197958s  with gzp = { version = "1.0.1", default-features = false, features = ["deflate_rust"] }

}


Cargo.toml

tar = "0.4.44"
gzp = "1.0.1"

zundaren avatar Apr 04 '25 14:04 zundaren

Are you running that via cargo test? That will create a debug build and not a release build.

You should also use the BufWriter you have commented out.

Using the deflate_rust backend is fine, but the default would be faster if you don't mind it using c-libs under the hood.
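
To illustrate why the buffered writer can matter, here is a minimal std-only sketch that times many small writes to a file with and without `BufWriter`. It does not use gzp or tar. The helper names, file names, chunk size, and write count are all arbitrary assumptions, and how much a given compressor benefits depends on how it issues writes to the wrapped writer, which this sketch does not measure.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::Path;
use std::time::{Duration, Instant};

/// Writes `count` copies of `chunk` to `writer`, flushes, and returns the elapsed time.
fn time_writes<W: Write>(mut writer: W, chunk: &[u8], count: usize) -> io::Result<Duration> {
    let start = Instant::now();
    for _ in 0..count {
        writer.write_all(chunk)?;
    }
    // Flush inside the timed region so buffered data is included in the measurement.
    writer.flush()?;
    Ok(start.elapsed())
}

/// Times the same workload against a bare `File` and a `BufWriter<File>` in `dir`.
/// Returns (unbuffered, buffered) durations and removes the scratch files afterwards.
fn compare(dir: &Path, chunk: &[u8], count: usize) -> io::Result<(Duration, Duration)> {
    let raw_path = dir.join("gzp_demo_unbuffered.bin");
    let buf_path = dir.join("gzp_demo_buffered.bin");
    let raw = time_writes(File::create(&raw_path)?, chunk, count)?;
    let buffered = time_writes(BufWriter::new(File::create(&buf_path)?), chunk, count)?;
    // Both writers must have produced files of the same size.
    assert_eq!(fs::metadata(&raw_path)?.len(), fs::metadata(&buf_path)?.len());
    fs::remove_file(raw_path)?;
    fs::remove_file(buf_path)?;
    Ok((raw, buffered))
}

fn main() -> io::Result<()> {
    let (raw, buffered) = compare(&std::env::temp_dir(), &[0u8; 64], 100_000)?;
    println!("unbuffered: {:?}, buffered: {:?}", raw, buffered);
    Ok(())
}
```

The flush is deliberately inside the timed region: reading the clock before buffered data has been written out would under-report the real cost.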

sstadick avatar Apr 04 '25 15:04 sstadick

> Are you running that via cargo test? That will create a debug build and not a release build.
>
> You should also use the BufWriter you have commented out.
>
> Using the deflate_rust backend is fine, but the default would be faster if you don't mind it using c-libs under the hood.

With a release build, the default backend with Compression::new(6) still takes about 1 second.

zundaren avatar Apr 04 '25 15:04 zundaren

@zundaren can you also supply the equivalent Go code?

vrmiguel avatar Nov 11 '25 23:11 vrmiguel