bamtools icon indicating copy to clipboard operation
bamtools copied to clipboard

RefLength of BamMultiReader does not appear to be correct

Open SebastianHollizeck opened this issue 5 years ago • 1 comments

I started working through a dormant c++ project using bamtools and in this cases specifically the BamMultiReader and from all i could gather, the RefLength is not correct.

I have a small fasta using as a reference and some bam, which has test reads aligned to it.

The fasta has 900 nucleotides, but the reported RefLength of the bam reader is 240.

I attached the files as well as the code part, that is in question here.

` RefVector refs = reader.GetReferenceData();

for(size_t i = 0; i < refs.size() ; i++) {
	string chrom = refs.at(i).RefName;
	int refId = reader.GetReferenceID(refs.at(i).RefName);
	int StartPosition = 1;
	int StopPosition = refs.at(i).RefLength;  // Scan til end of chromosome
            
	cout << "Running multiSNV on chromosome " <<  chrom << " " << StartPosition << ": " << StopPosition << endl;
	cout << endl;
	BamRegion region(refId, StartPosition, refId, StopPosition);

	try {
		run_multisnv_on_region(region, chrom, reader,
				filter_settings, gibbs_settings, program_settings, R);
	}

	catch(runtime_error& e) {
		throw e;
	}
}

` If you look into the .fai index, you will see the fasta has length 900

Is that intended behaviour?

Cheers, Sebastian

small.zip

SebastianHollizeck avatar Mar 13 '19 07:03 SebastianHollizeck

Are you sure that bam file belongs to that fasta file? I mean during alignment you used the same fasta file? because this tool reads header of bam/sam file to get the length of each contig/gene and this information of header is obtained by fasta file during alignment.

I think you should recheck your fasta file you used in alignment once.

Regards, Madiha

ghost avatar Apr 18 '19 12:04 ghost