Using HMMRATAC with CSI indexes and large chromosome sizes

Open diego-rt opened this issue 2 years ago • 2 comments

I'm working with an organism with a very large genome, so I need to use a CSI index. However, when I pass my CSI index to HMMRATAC, it complains that the index has an invalid file header. I've tried renaming it to .bam.bai, but this has not helped.

Do you know of any workarounds, or are there plans to support CSI indexes? I believe htsjdk supports CSI indexes, but only from version 2.19.0. I wonder whether this might be the problem?

Thanks a lot!

The exception:

Exception in thread "main" java.lang.RuntimeException: Invalid file header in BAM index 136160.unique.sorted.dedup.bam.bai: ?
	at net.sf.samtools.AbstractBAMFileIndex.<init>(AbstractBAMFileIndex.java:90)
	at net.sf.samtools.DiskBasedBAMFileIndex.<init>(DiskBasedBAMFileIndex.java:46)
	at net.sf.samtools.BAMFileReader.getIndex(BAMFileReader.java:232)
	at net.sf.samtools.BAMFileReader.createIndexIterator(BAMFileReader.java:592)
	at net.sf.samtools.BAMFileReader.query(BAMFileReader.java:352)
	at net.sf.samtools.SAMFileReader.query(SAMFileReader.java:363)
	at HMMR_ATAC.pullLargeLengths.read(pullLargeLengths.java:112)
	at HMMR_ATAC.pullLargeLengths.<init>(pullLargeLengths.java:61)
	at HMMR_ATAC.Main_HMMR_Driver.main(Main_HMMR_Driver.java:219)
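
For context, here is a minimal sketch of why renaming the file is unlikely to help. It assumes the layouts described in the SAM and CSI format specifications: a BAI file starts with the magic bytes BAI\1, while a CSI file is BGZF-compressed and starts with CSI\1 once decompressed. A reader that expects the BAI magic would therefore reject a renamed CSI file regardless of its name. The helper name detect_index_format is hypothetical and is not part of HMMRATAC, samtools, or htsjdk.

```python
import gzip
import zlib

BAI_MAGIC = b"BAI\x01"   # leading bytes of an uncompressed BAI index
CSI_MAGIC = b"CSI\x01"   # leading bytes of a CSI index after decompression
GZIP_MAGIC = b"\x1f\x8b"  # BGZF blocks are gzip members, so they share this prefix


def detect_index_format(path):
    """Return 'BAI', 'CSI', or 'unknown' based on the file's leading bytes.

    Hypothetical helper for illustration; it only inspects the magic bytes
    and does not validate the rest of the index.
    """
    with open(path, "rb") as fh:
        head = fh.read(4)
    if head == BAI_MAGIC:
        return "BAI"
    if head[:2] == GZIP_MAGIC:
        # BGZF is a series of concatenated gzip members, which the gzip
        # module can read directly.
        try:
            with gzip.open(path, "rb") as fh:
                if fh.read(4) == CSI_MAGIC:
                    return "CSI"
        except (OSError, EOFError, zlib.error):
            pass
    return "unknown"
```

Running this on the renamed .bam.bai file should report CSI rather than BAI if the file was produced with samtools index -c, which would match the "Invalid file header" message above.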

diego-rt avatar Aug 29 '21 19:08 diego-rt

Hello, I was wondering whether you have any updates, workarounds or suggestions on how to address this? @taoliu @EvanTarbell

Just to clarify, the problem is that when working with chromosomes larger than 512 Mbp, one needs to use a CSI index (i.e. using samtools index -c: https://www.htslib.org/doc/samtools-index.html) as opposed to a BAI index.
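
As a rough illustration of where the 512 Mbp figure comes from: the BAI binning scheme uses a smallest bin of 2^14 bases and 5 levels of bins, so the largest coordinate it can address is 2^(14 + 3*5) = 2^29 bases. The sketch below checks which contigs exceed that limit. The function names and the example contig lengths are made up for illustration.

```python
BAI_MIN_SHIFT = 14  # smallest BAI bin spans 2**14 bases
BAI_DEPTH = 5       # number of binning levels below the root bin


def max_indexable_length(min_shift, depth):
    """Largest reference length a binning index with these parameters can cover."""
    return 1 << (min_shift + 3 * depth)


# 2**29 = 536,870,912 bases, i.e. the 512 Mbp limit of BAI
BAI_MAX_LENGTH = max_indexable_length(BAI_MIN_SHIFT, BAI_DEPTH)


def contigs_needing_csi(contig_lengths):
    """Return the names of contigs too long for a BAI index."""
    return [name for name, length in contig_lengths.items()
            if length > BAI_MAX_LENGTH]


# Hypothetical contig lengths, for illustration only
example = {"chr1": 248_956_422, "chrBig": 1_200_000_000}
print(contigs_needing_csi(example))
```

If this list is non-empty for the contig lengths in the BAM header, the file needs a CSI index.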

Thanks a lot!

diego-rt avatar Dec 21 '21 11:12 diego-rt

I think the comments on #96 can help you find a solution.

jitsedesmet avatar May 27 '22 09:05 jitsedesmet