Feature Request: Compress Index When Saving

Open schorlton opened this issue 1 year ago • 3 comments

Thank you for the amazing tool.

Feature request: optionally compress the index when saving it to disk. My evaluation suggests that running gzip on the .mmi index can shrink it by up to 50%. The .mmi is already several times larger than its original FASTA file, so for really large FASTA files the index can be massive. As an extreme example, indexing NCBI nt results in an index of 1.6 TB.

I would appreciate it if minimap2 could implement simple compression when reading/writing the .mmi index to disk. Something like:

minimap2 -ax map-ont -d index.mmi.gz --gzip-index nt.fna
minimap2 --split-prefix temp index.mmi.gz reads.fastq

Thanks for your consideration!

schorlton avatar Aug 14 '22 18:08 schorlton

You could simply pipe the output through gzip

godofdream avatar Sep 08 '22 21:09 godofdream

> You could simply pipe the output through gzip

How? The index save location is specified with an arg, and utilized during mapping with a positional arg to minimap2.

schorlton avatar Sep 08 '22 21:09 schorlton

You can use a little trick with /dev/stdout: minimap2 -x map-ont -d /dev/stdout seqs.fa | gzip >seqs.mmi.gz
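The trick above covers writing the index but not reading it back. A minimal sketch of both directions follows, assuming that minimap2 needs a regular, uncompressed file when it is given a prebuilt index, so the .gz is unpacked to a temporary file before mapping. The function names `build_gz_index` and `map_with_gz_index` are hypothetical helpers and not part of minimap2.

```shell
#!/bin/sh
# Hypothetical helpers, not part of minimap2.

# build_gz_index REF.fa OUT.mmi.gz
# Dump the index to stdout via /dev/stdout and gzip it on the fly.
build_gz_index() {
    minimap2 -x map-ont -d /dev/stdout "$1" | gzip > "$2"
}

# map_with_gz_index INDEX.mmi.gz READS.fastq
# Assumption: the index must be a regular file at mapping time, so
# decompress it to a temporary file, map, then clean up.
map_with_gz_index() {
    tmp_index=$(mktemp) || return 1
    gzip -dc "$1" > "$tmp_index" && minimap2 -a "$tmp_index" "$2"
    status=$?
    rm -f "$tmp_index"
    return "$status"
}
```

For example, `build_gz_index seqs.fa seqs.mmi.gz` followed by `map_with_gz_index seqs.mmi.gz reads.fastq > out.sam`. This only saves space while the index is at rest: the full uncompressed index still has to fit in the temporary directory during mapping, so for very large indexes set `TMPDIR` to a location with enough room.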

W-L avatar Nov 29 '22 10:11 W-L