
pileup: Count reads supporting each base at each position

Open sjackman opened this issue 12 years ago • 5 comments

I have a feature request. I'd like to see a pileup command that reports the number of reads that support each base at each position. I'm not interested in indels.

Thanks, Shaun

sjackman, Jul 02 '13 17:07

Hi Derek, @pezmaster31

The pileup-like output format I'd suggest is tabular with eight columns. It's quite a bit more concise than the pileup format for cases where the additional information that pileup provides isn't necessary:

chr pos ref A C G T N

Where ref is the reference base, and A, C, G, T and N are the number of reads supporting each base at that position. Thoughts?
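To make the proposed format concrete, here is a minimal sketch in plain Python of how those eight columns could be computed. It is not bamtools or piledriver code. It assumes reads have already been reduced to hypothetical `(chrom, start, seq)` tuples with 0-based starts and gapless alignments (indels are ignored, as requested), and that the reference is available as a dict of sequences. The names `count_bases` and `BASES` are made up for illustration.

```python
from collections import Counter, defaultdict

BASES = "ACGTN"  # column order of the proposed output


def count_bases(reads, reference):
    """Count reads supporting each base at each covered position.

    reads: iterable of (chrom, start, seq) tuples, 0-based start,
           gapless alignments (hypothetical, simplified input).
    reference: dict mapping chrom -> reference sequence.
    Returns rows of (chr, pos, ref, A, C, G, T, N) with 1-based pos.
    """
    counts = defaultdict(Counter)
    for chrom, start, seq in reads:
        for offset, base in enumerate(seq.upper()):
            if base not in BASES:
                base = "N"  # treat any other symbol as N
            counts[(chrom, start + offset)][base] += 1

    rows = []
    for chrom, pos in sorted(counts):
        per_base = counts[(chrom, pos)]
        ref = reference[chrom][pos]
        rows.append((chrom, pos + 1, ref) + tuple(per_base[b] for b in BASES))
    return rows


if __name__ == "__main__":
    # Toy data, invented for this example only.
    reference = {"chr1": "ACGTA"}
    reads = [("chr1", 0, "ACG"), ("chr1", 1, "CGT"), ("chr1", 1, "TNT")]
    print("\t".join(["chr", "pos", "ref", "A", "C", "G", "T", "N"]))
    for row in count_bases(reads, reference):
        print("\t".join(str(field) for field in row))
```

A real implementation would read alignments from a BAM file and walk the CIGAR string rather than assume gapless reads. The sketch only shows the counting and the output layout.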

Cheers, Shaun

sjackman, Jul 15 '13 20:07

@sjackman, would this tool by @arq5x fit your needs? https://github.com/arq5x/piledriver

jts, Jul 24 '13 21:07

Thanks @jts, I'd forgotten about @arq5x's tool.

Will that work for you, @sjackman? It's more detailed than what you're looking for, but avoids re-inventing the wheel. At a quick glance, I don't see a 'num_N' field. It might be possible to infer that from the data, but I'm not sure. If not, I know Aaron is open to feedback & suggestions.

pezmaster31, Jul 25 '13 17:07

Yeah, num_N is not implemented, but it could be added very easily. Unfortunately, I am on my way up to Canada for vacation, but I can hack it in when I get back.

arq5x, Jul 26 '13 01:07

Yes, piledriver looks very useful. It's also 30% faster than genome/bam-readcount (the other tool that I found for this job). Thanks, all. Derek, Aaron, would there be any interest in incorporating piledriver into bamtools? It seems to me that a simple pileup tool is conspicuously missing from bamtools.

sjackman, Jul 30 '13 18:07