seqkit
Option to control gzip output compression level
Currently seqkit fixes the gzip output compression level at 5. It would be nice to have an option to set the gzip output compression level anywhere between 1 and 9.
gzip
https://pkg.go.dev/github.com/klauspost/pgzip#NewWriter
func NewWriterLevel(w io.Writer, level int) (*Writer, error)
xz
https://pkg.go.dev/github.com/ulikunitz/xz does not seem to support setting a compression level.
zstd
https://pkg.go.dev/github.com/klauspost/compress/zstd#NewWriter
func NewWriter(w io.Writer, opts ...EOption) (*Encoder, error)
func WithEncoderLevel(l EncoderLevel) EOption
I'll leave this for a future version.
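For anyone following along, here is a minimal sketch of what passing a level to a gzip writer looks like. It uses the standard library's compress/gzip rather than pgzip so that it runs without extra dependencies; the NewWriterLevel signature quoted above for pgzip has the same shape. The helper names compressGzip and decompressGzip are made up for illustration and are not part of seqkit.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// compressGzip (hypothetical helper) gzips data at the requested level.
// compress/gzip is used here as a stand-in for pgzip.
func compressGzip(data []byte, level int) ([]byte, error) {
	var buf bytes.Buffer
	zw, err := gzip.NewWriterLevel(&buf, level)
	if err != nil {
		return nil, err // the level was outside the range the package accepts
	}
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	// Close flushes pending data and writes the gzip trailer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompressGzip (hypothetical helper) reverses compressGzip.
func decompressGzip(data []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	// Repetitive sequence-like input, just to have something compressible.
	input := bytes.Repeat([]byte("ACGTACGTTTGACA\n"), 2000)
	for _, level := range []int{1, 5, 9} {
		out, err := compressGzip(input, level)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("level %d: %d -> %d bytes\n", level, len(input), len(out))
	}
}
```

Since the level is a plain int argument, a command-line option could in principle be passed straight through to the writer after a range check.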
I just added a new global option --compress-level to set the compression level for gzip, zstd, and bzip2.
Compression level:
format   range   default  comment
gzip     1-9     5        https://github.com/klauspost/pgzip sets 5 as the default value.
xz       NA      NA       https://github.com/ulikunitz/xz does not support.
zstd     1-4     2        roughly equals to zstd 1, 3, 7, 11, respectively.
bzip     1-9     6        https://github.com/dsnet/compress
--compress-level int compression level for gzip, zstd, xz and bzip2. type "seqkit -h" for the range and default value for each format (default -1)
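The range table and the -1 default suggest a per-format lookup. The sketch below shows one way such a check could work; it is not seqkit's actual code. The names levelRange, levelRanges, zstdEquivalent and resolveLevel are hypothetical, and treating -1 as "use the format's default" is an assumption based on the "(default -1)" in the help text.

```go
package main

import "fmt"

// levelRange holds the accepted range and default level for one format.
type levelRange struct{ min, max, def int }

// levelRanges mirrors the table above. xz is left out because the
// table lists no configurable level for it.
var levelRanges = map[string]levelRange{
	"gzip":  {1, 9, 5},
	"zstd":  {1, 4, 2},
	"bzip2": {1, 9, 6},
}

// zstdEquivalent records the approximate reference zstd level that each
// of the four levels in the table corresponds to.
var zstdEquivalent = map[int]int{1: 1, 2: 3, 3: 7, 4: 11}

// resolveLevel (hypothetical) turns the user-supplied level into the
// level to use for a format. Assumption: -1 means "format default".
func resolveLevel(format string, level int) (int, error) {
	r, ok := levelRanges[format]
	if !ok {
		return 0, fmt.Errorf("format %q has no configurable compression level", format)
	}
	if level == -1 {
		return r.def, nil
	}
	if level < r.min || level > r.max {
		return 0, fmt.Errorf("%s compression level %d is outside %d-%d",
			format, level, r.min, r.max)
	}
	return level, nil
}

func main() {
	cases := []struct {
		format string
		level  int
	}{
		{"gzip", -1}, {"zstd", 4}, {"zstd", 5}, {"xz", 3},
	}
	for _, c := range cases {
		lvl, err := resolveLevel(c.format, c.level)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: requested %d, using %d\n", c.format, c.level, lvl)
	}
}
```

Keeping the ranges in one map means the help text and the validation can be generated from the same data.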
Tests
$ seq 1 9 | rush 'seqkit seq ../tests/hairpin.fa -o {}.gz --compress-level {}'
$ ls -l *.gz
-rw-r--r-- 1 shenwei shenwei 1409792 Mar 14 20:17 1.gz
-rw-r--r-- 1 shenwei shenwei 1354625 Mar 14 20:17 2.gz
-rw-r--r-- 1 shenwei shenwei 1349800 Mar 14 20:17 3.gz
-rw-r--r-- 1 shenwei shenwei 1275085 Mar 14 20:17 4.gz
-rw-r--r-- 1 shenwei shenwei 1261818 Mar 14 20:17 5.gz
-rw-r--r-- 1 shenwei shenwei 1234510 Mar 14 20:17 6.gz
-rw-r--r-- 1 shenwei shenwei 1216618 Mar 14 20:17 7.gz
-rw-r--r-- 1 shenwei shenwei 1186253 Mar 14 20:17 8.gz
-rw-r--r-- 1 shenwei shenwei 1190879 Mar 14 20:17 9.gz