
Draft: Add htslib thread pool to build. Add partial index

Open cjw85 opened this issue 2 years ago • 0 comments

  • Add a thread pool to speed up BAM decompression during index build. I've tested that the index is unaffected by this, though I don't understand htslib well enough to know why the bgzf_tell() call still returns the same offsets.
  • Add an --every=N option to bri index to build a partial index. The use case is loading N reads at a time from an unsorted (or name-sorted) BAM in a workflow parallelised over chunks of that file. I haven't considered how this behaves when a query has multiple alignment records.
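
To make the partial-index idea concrete, here is a minimal sketch in C of one way an every-N selection could work. It is not the code in this PR: the helper name keep_every_nth and the plain array of per-record file offsets are hypothetical, and it assumes a partial index simply keeps the offset of records 0, N, 2N, and so on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of bri): copy every `every`-th entry of
 * `offsets` into `kept`, starting with entry 0, and return how many were
 * copied. `kept` must have room for ceil(n_records / every) entries.
 */
size_t keep_every_nth(const int64_t *offsets, size_t n_records,
                      size_t every, int64_t *kept)
{
    assert(every > 0); /* a step of zero would never advance */

    size_t n_kept = 0;
    for (size_t i = 0; i < n_records; i += every) {
        kept[n_kept++] = offsets[i];
    }
    return n_kept;
}
```

Under that assumption, a worker handling chunk k would seek to kept[k] and read up to N records from there. The sketch has the same gap noted above: it does not handle a query whose alignment records straddle a chunk boundary.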

Benchmarking threads:

$ time ./bri index -i /scratch/cwright/test.unaligned.bam.bri  test.unaligned.bam

real	35m23.072s
user	34m37.240s
sys	0m44.307s

$ time ./bri index -t 3 -i /scratch/cwright/test.unaligned.bam.bri  test.unaligned.bam
[bri-build] writing to disk...
[bri-build] wrote index for 31582413 records.

real	13m58.457s
user	39m23.216s
sys	5m44.936s
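
For reference, the ratios implied by the two runs above can be worked out with a small C sketch. The helper names to_seconds and ratio are illustrative only and are not part of bri.

```c
#include <assert.h>

/* Convert a `time`-style reading of minutes and seconds into seconds. */
double to_seconds(double minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}

/* Ratio of two durations, e.g. baseline wall time over threaded wall time. */
double ratio(double numerator_s, double denominator_s)
{
    assert(denominator_s > 0.0);
    return numerator_s / denominator_s;
}
```

Calling ratio(to_seconds(35, 23.072), to_seconds(13, 58.457)) gives a wall-clock speedup of roughly 2.5x with -t 3. Summing user and sys for each run and taking threaded over baseline gives roughly 1.28, so the threaded build uses about 28% more total CPU time, with most of the increase in sys.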

cjw85, Feb 27 '23 18:02