Autoindex should parse tabix-indexed monolithic VCFs in parallel

Open jeizenga opened this issue 3 months ago • 0 comments

We've had a few users complain that autoindex's VCF chunking is excessively slow when the variants are provided as a single file covering all chromosomes (e.g. https://github.com/vgteam/vg/issues/4274). The slowdown comes from a single-threaded linear scan over the VCF that parcels it out into chunks, which only afterwards are processed in parallel. If the VCF is tabix-indexed, it should be possible to chunk it in parallel across chromosomes, which would alleviate this issue.
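As a rough illustration of the idea (this is not vg code, and autoindex itself would presumably do this through htslib rather than by shelling out), per-chromosome chunking from a tabix index could look like the Python sketch below. It assumes the `tabix` command-line tool is on `PATH` and relies only on `tabix -l` (list indexed sequence names) and `tabix -h FILE REGION` (header plus records for a region). The function names, the `chunk` output prefix, and the `workers` parameter are all hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def list_contigs(vcf_path):
    """Return the contig names recorded in the tabix index (`tabix -l`)."""
    out = subprocess.run(["tabix", "-l", vcf_path],
                         check=True, capture_output=True, text=True).stdout
    return [line for line in out.splitlines() if line]


def extract_contig(vcf_path, contig, out_prefix="chunk"):
    """Write the VCF header plus one contig's records to its own file."""
    # Hypothetical naming scheme; contig names with unusual characters
    # may need escaping before being used in a file name or region string.
    out_path = f"{out_prefix}.{contig}.vcf"
    with open(out_path, "w") as out:
        # -h keeps the header so each chunk is a standalone VCF.
        subprocess.run(["tabix", "-h", vcf_path, contig],
                       check=True, stdout=out)
    return out_path


def chunk_by_contig(vcf_path, contigs=None, extract=extract_contig, workers=4):
    """Extract every contig concurrently; returns {contig: chunk path}."""
    if contigs is None:
        contigs = list_contigs(vcf_path)
    # Threads suffice here because the real work happens in the
    # tabix subprocesses, each of which seeks directly via the index.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        paths = list(pool.map(lambda c: extract(vcf_path, c), contigs))
    return dict(zip(contigs, paths))
```

Each worker seeks straight to its contig through the index, so no single thread has to read the whole file before the parallel work can start. A call such as `chunk_by_contig("variants.vcf.gz", workers=8)` (file name hypothetical) would produce one chunk file per indexed contig.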

jeizenga, Apr 25 '24 18:04