
Feature request – add support for CSI (`*.csi`) BAM indices

Open jantusan opened this issue 5 months ago • 0 comments

Hello,

Thank you for developing BAMscale, it has become my go-to tool for generating bigwigs. While processing a large wheat ChIP-seq dataset I ran into a limitation that I hope could be addressed (or perhaps you already have a workaround):

Summary

When a BAM file is indexed with CSI (needed for large chromosomes or many contigs), BAMscale fails with `cannot find *.bai`.

Steps to reproduce

    samtools index -c sample.bam    # creates sample.bam.csi
    BAMscale scale --bam sample.bam --binsize 10
    # ERROR: cannot find sample.bam.bai

Expected behaviour

  • Automatically load `sample.bam.csi` when no `.bai` is present, or
  • Allow specifying the index path through a new option (e.g. `--index sample.bam.csi`).
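To make the two options concrete, here is a minimal sketch in plain C of how the index path could be resolved. It is not taken from the BAMscale source: `resolve_index_path` and `file_readable` are hypothetical names, the `--index` option is only the proposal above, and trying `.csi` before `.bai` is an assumed lookup order.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the file at path can be opened for reading, else 0. */
static int file_readable(const char *path) {
    FILE *f = fopen(path, "rb");
    if (!f) return 0;
    fclose(f);
    return 1;
}

/*
 * Hypothetical helper: choose the index file for a BAM.
 * If explicit_idx is non-NULL (e.g. the value of a proposed --index option)
 * it is used as-is, provided it is readable. Otherwise "<bam>.csi" is tried
 * first, then "<bam>.bai" (assumed order).
 * Returns 0 and writes the chosen path to out on success; returns -1 if no
 * index is found or out is too small (out is then unspecified).
 */
static int resolve_index_path(const char *bam, const char *explicit_idx,
                              char *out, size_t out_size) {
    static const char *suffixes[] = { ".csi", ".bai" };
    size_t i;

    assert(bam != NULL && out != NULL);

    if (explicit_idx != NULL) {
        size_t len = strlen(explicit_idx);
        if (len >= out_size || !file_readable(explicit_idx)) return -1;
        memcpy(out, explicit_idx, len + 1);  /* copy including the NUL */
        return 0;
    }

    for (i = 0; i < sizeof suffixes / sizeof suffixes[0]; i++) {
        int n = snprintf(out, out_size, "%s%s", bam, suffixes[i]);
        if (n < 0 || (size_t)n >= out_size) return -1;
        if (file_readable(out)) return 0;
    }
    return -1;
}
```

With only `sample.bam.csi` on disk, `resolve_index_path("sample.bam", NULL, buf, sizeof buf)` would fill `buf` with `sample.bam.csi` instead of failing on the missing `.bai`.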

Feasibility notes (from HTSlib docs)

  • HTSlib loads BAI or CSI transparently via `sam_index_load()` after opening with `hts_open`/`sam_open`. See the `sam.h` API docs.
  • Explicit index path is supported in HTSlib ≥1.10 using the `##idx##` syntax (e.g. `sample.bam##idx##/path/to/sample.bam.csi`) or via `hts_idx_load2(fn, fnidx)`. See the 1.10 release notes and `hts.h`.
  • Background: BAI indexes are limited to chromosomes ≤512 Mbp (2^29 bp), hence CSI for large genomes.
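The last two notes can be sketched without linking HTSlib. The helpers below are hypothetical (`needs_csi`, `with_explicit_index` and `BAI_MAX_REF_LEN` are my own names). The first checks a reference length against the 2^29 bp BAI limit. The second builds the `##idx##` file name which, going by the 1.10 release notes, should be usable wherever HTSlib expects a file name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* BAI can address reference coordinates up to 2^29 bp (512 Mbp). */
#define BAI_MAX_REF_LEN ((int64_t)1 << 29)

/* Returns 1 if a reference of this length is too long for BAI, else 0. */
static int needs_csi(int64_t ref_len) {
    return ref_len > BAI_MAX_REF_LEN;
}

/*
 * Writes "<bam>##idx##<idx>" into out.
 * Returns 0 on success, -1 if out is too small to hold the result.
 * The result is meant to be passed on as a file name; that HTSlib >= 1.10
 * accepts it is taken from the release notes, not verified here.
 */
static int with_explicit_index(const char *bam, const char *idx,
                               char *out, size_t out_size) {
    static const char sep[] = "##idx##";
    size_t need;

    assert(bam != NULL && idx != NULL && out != NULL);

    need = strlen(bam) + strlen(sep) + strlen(idx) + 1;  /* +1 for NUL */
    if (need > out_size) return -1;
    snprintf(out, out_size, "%s%s%s", bam, sep, idx);
    return 0;
}
```

A check like `needs_csi()` on the longest sequence in the BAM header would also allow a clearer error message than `cannot find *.bai` when only a CSI index can exist.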

Environment

  • BAMscale v0.0.9
  • samtools 1.21

Reference

  • CSI spec: https://github.com/samtools/hts-specs/blob/master/CSIv1.pdf

Thanks for considering this.

jantusan, Aug 12 '25 15:08