
Option to control gzip output compression level

Open hermidalc opened this issue 3 years ago • 2 comments

Currently seqkit fixes the gzip output compression level at 5. It would be nice to have an option to set the gzip output compression level anywhere from 1 to 9.

hermidalc, Aug 18 '22 12:08

gzip

https://pkg.go.dev/github.com/klauspost/pgzip#NewWriter

func NewWriterLevel(w io.Writer, level int) (*Writer, error)
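
As a rough illustration of what a level-aware writer looks like, here is a minimal sketch. It uses the standard library's `compress/gzip` instead of pgzip so that it runs without extra dependencies; that swap is an assumption made for the example, relying on both packages exposing `NewWriterLevel` with the signature quoted above. The helper `gzipSize` and the sample data are hypothetical and are not seqkit code.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"strings"
)

// gzipSize (hypothetical helper) compresses data at the given level
// and returns the size of the compressed output in bytes.
func gzipSize(data []byte, level int) (int, error) {
	var buf bytes.Buffer
	zw, err := gzip.NewWriterLevel(&buf, level)
	if err != nil {
		return 0, err // the level is outside the range the package accepts
	}
	if _, err := zw.Write(data); err != nil {
		return 0, err
	}
	// Close flushes the remaining data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	// Made-up FASTA-like input, only for demonstration.
	data := []byte(strings.Repeat(">seq\nACGTTGCAAGGCTTAACCGGTT\n", 2000))

	for level := gzip.BestSpeed; level <= gzip.BestCompression; level++ {
		n, err := gzipSize(data, level)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("level %d: %d bytes\n", level, n)
	}
}
```

The error returned by `NewWriterLevel` is what a command-line option could surface when the user passes an unsupported level.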

xz

https://pkg.go.dev/github.com/ulikunitz/xz does not seem to support setting a compression level.

zstd

https://pkg.go.dev/github.com/klauspost/compress/zstd#NewWriter

func NewWriter(w io.Writer, opts ...EOption) (*Encoder, error)
func WithEncoderLevel(l EncoderLevel) EOption

shenwei356, Aug 18 '22 12:08

I'll leave it for a future version.

shenwei356, Sep 22 '22 09:09

I just added a new global option --compress-level to set the compression level for gzip, zstd, and bzip2.

Compression level:
  format   range   default  comment
  gzip     1-9     5        https://github.com/klauspost/pgzip sets 5 as the default value.
  xz       NA      NA       https://github.com/ulikunitz/xz does not support.
  zstd     1-4     2        roughly equals to zstd 1, 3, 7, 11, respectively.
  bzip     1-9     6        https://github.com/dsnet/compress

--compress-level int              compression level for gzip, zstd, xz and bzip2. type "seqkit -h" for the range and default value for each format (default -1)
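
One way to read the table together with the `-1` default is that `-1` means "use the per-format default" and any other value must fall within the listed range. The sketch below works under that assumption; it is not taken from the seqkit source, and `levelSpec`, `specs`, and `resolveLevel` are hypothetical names. Whether seqkit rejects or clamps out-of-range values is not stated in this thread, so rejecting them here is also an assumption.

```go
package main

import "fmt"

// levelSpec holds the range and default for one format,
// copied from the table above.
type levelSpec struct {
	min, max, def int
}

// specs lists the formats with an adjustable level. xz is left out
// because the table marks it as not supported.
var specs = map[string]levelSpec{
	"gzip": {min: 1, max: 9, def: 5},
	"zstd": {min: 1, max: 4, def: 2},
	"bzip": {min: 1, max: 9, def: 6},
}

// resolveLevel (hypothetical) turns the flag value into the level to use.
// A value of -1 is assumed to select the format's default.
func resolveLevel(format string, level int) (int, error) {
	spec, ok := specs[format]
	if !ok {
		return 0, fmt.Errorf("format %q has no adjustable compression level", format)
	}
	if level == -1 {
		return spec.def, nil
	}
	if level < spec.min || level > spec.max {
		return 0, fmt.Errorf("level %d out of range %d-%d for %s",
			level, spec.min, spec.max, format)
	}
	return level, nil
}

func main() {
	for _, format := range []string{"gzip", "zstd", "bzip", "xz"} {
		level, err := resolveLevel(format, -1)
		if err != nil {
			fmt.Println(format+":", err)
			continue
		}
		fmt.Printf("%s: default level %d\n", format, level)
	}
}
```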

Tests

$ seq 1 9 | rush 'seqkit seq ../tests/hairpin.fa -o {}.gz --compress-level {}'
$ ls -l  *.gz
-rw-r--r-- 1 shenwei shenwei 1409792 Mar 14 20:17 1.gz
-rw-r--r-- 1 shenwei shenwei 1354625 Mar 14 20:17 2.gz
-rw-r--r-- 1 shenwei shenwei 1349800 Mar 14 20:17 3.gz
-rw-r--r-- 1 shenwei shenwei 1275085 Mar 14 20:17 4.gz
-rw-r--r-- 1 shenwei shenwei 1261818 Mar 14 20:17 5.gz
-rw-r--r-- 1 shenwei shenwei 1234510 Mar 14 20:17 6.gz
-rw-r--r-- 1 shenwei shenwei 1216618 Mar 14 20:17 7.gz
-rw-r--r-- 1 shenwei shenwei 1186253 Mar 14 20:17 8.gz
-rw-r--r-- 1 shenwei shenwei 1190879 Mar 14 20:17 9.gz

shenwei356, Mar 14 '23 12:03