bri
Draft: Add htslib thread pool to build. Add partial index
- Add a thread pool to speed up BAM decompression during index build. I've tested that the index is unaffected by this, though I don't understand htslib well enough to know how the `bgzf_tell()` call is unaffected.
- Add an `--every=N` option to `bri index` to build a partial index. The use case is to load N reads from a BAM in a workflow parallelised into chunks of an unsorted (or name-sorted) BAM. I haven't considered how this works when a query has multiple alignment records.
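To make the intended `--every=N` behaviour concrete, here is a minimal sketch of the selection rule, assuming the partial index keeps the record at every zero-based position that is a multiple of N. The helper names `keep_record` and `n_chunks` are hypothetical and are not taken from the bri source; the actual implementation may choose records differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from bri): returns 1 if the record at zero-based
 * position record_idx would get an index entry when only every `every`-th
 * record is indexed, and 0 otherwise. */
static int keep_record(uint64_t record_idx, uint64_t every)
{
    assert(every > 0);
    return record_idx % every == 0;
}

/* Hypothetical helper: number of index entries, and so the number of chunks
 * a workflow could hand out, for a file holding n_records records. The last
 * chunk may hold fewer than `every` records. */
static uint64_t n_chunks(uint64_t n_records, uint64_t every)
{
    assert(every > 0);
    return n_records / every + (n_records % every != 0);
}
```

Under this assumption, each worker would seek to one indexed record and read up to N records from there. For the 31582413-record file in the benchmark, `n_chunks(31582413, 1000000)` gives 32 chunks.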
Benchmarking threads:
$ time ./bri index -i /scratch/cwright/test.unaligned.bam.bri test.unaligned.bam
real 35m23.072s
user 34m37.240s
sys 0m44.307s
$ time ./bri index -t 3 -i /scratch/cwright/test.unaligned.bam.bri test.unaligned.bam
[bri-build] writing to disk...
[bri-build] wrote index for 31582413 records.
real 13m58.457s
user 39m23.216s
sys 5m44.936s
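For reference, the timings above work out to a wall-clock speedup of roughly 2.5x with `-t 3`, while total CPU time (user plus sys) grows by roughly 1.3x. A small sketch of that arithmetic follows; the helper names are illustrative only.

```c
#include <assert.h>

/* Convert a `time`-style "XmY.YYYs" reading to seconds. */
static double to_seconds(double minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}

/* Ratio of two durations, e.g. baseline wall-clock over threaded wall-clock. */
static double ratio(double numerator_s, double denominator_s)
{
    assert(denominator_s > 0.0);
    return numerator_s / denominator_s;
}
```

Here the wall-clock speedup is `ratio(to_seconds(35, 23.072), to_seconds(13, 58.457))`, and the CPU overhead is the ratio of the threaded user plus sys time to the single-threaded user plus sys time.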