A way to calculate coverage breadth
It is often useful to check whether reads cover a certain percentage of a locus's length (coverage breadth). Existing tools such as bedtools coverage are too slow and memory-hungry with unsorted BAMs.
Hi @BharatRaviIyengar,
thanks for the feedback. Could you provide a minimal example of the desired output, together with your bedtools coverage command, so that I can test how this would work with BamToCov? Thanks
@telatin Thank you for getting back. The desired output would be a tab-separated file with:
- locus name
- locus start (optional)
- locus stop (optional)
- orientation (+/-)
- coverage breadth (the percentage of the locus covered by sequencing reads)
- average coverage depth (which BamToCov already reports)
Input files would be a GTF/BED and a SAM/BAM.
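To make the two numeric columns concrete, here is a minimal Python sketch of how I would define them for a single locus. The names `locus_stats` and `format_row`, the 0-based half-open coordinates, and the two-decimal formatting are my own assumptions for illustration; none of this is BamToCov or bedtools code.

```python
def locus_stats(start, stop, reads):
    """Return (breadth_percent, mean_depth) for one locus.

    start, stop: 0-based, half-open locus coordinates (assumed, as in BED).
    reads: iterable of (read_start, read_stop) alignments on the same contig.
    Assumes stop > start.
    """
    length = stop - start
    depth = [0] * length  # per-base depth inside the locus
    for r_start, r_stop in reads:
        # Clip each read to the locus before counting.
        lo = max(r_start, start)
        hi = min(r_stop, stop)
        for pos in range(lo, hi):
            depth[pos - start] += 1
    covered = sum(1 for d in depth if d > 0)
    breadth = 100.0 * covered / length  # % of bases with depth >= 1
    mean_depth = sum(depth) / length    # average depth over the whole locus
    return breadth, mean_depth


def format_row(name, start, stop, strand, reads):
    """Format one tab-separated row in the column order listed above."""
    breadth, mean_depth = locus_stats(start, stop, reads)
    fields = [name, str(start), str(stop), strand,
              f"{breadth:.2f}", f"{mean_depth:.2f}"]
    return "\t".join(fields)
```

With two overlapping reads, `locus_stats(0, 100, [(0, 50), (25, 75)])` gives a breadth of 75% and a mean depth of 1.0, which shows why the two columns carry different information.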
You can see what the bedtools coverage output looks like here. I am still pasting one of the example outputs:
$ cat A.bed
chr1 0 100 b1 1 +
chr1 100 200 b2 1 -
chr2 0 100 b3 1 +
$ cat B.bed
chr1 10 20 a1 1 -
chr1 20 30 a2 1 -
chr1 30 40 a3 1 -
chr1 100 200 a4 1 +
$ bedtools coverage -a A.bed -b B.bed
chr1 0 100 b1 1 + 3 30 100 0.3000000
chr1 100 200 b2 1 - 1 100 100 1.0000000
chr2 0 100 b3 1 + 0 0 100 0.0000000
$ bedtools coverage -a A.bed -b B.bed -s
chr1 0 100 b1 1 + 0 0 100 0.0000000
chr1 100 200 b2 1 - 0 0 100 0.0000000
chr2 0 100 b3 1 + 0 0 100 0.0000000
My File-A (-a) is a GTF and my File-B (-b) is a BAM.
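For reference, the fraction in the last column of the example above can be reproduced by clipping the B intervals to each A feature and merging the overlaps. The Python sketch below hard-codes the example records; `covered_fraction` and its `same_strand` flag are hypothetical names of mine, meant to mimic the default and `-s` behaviour shown above, not how bedtools is implemented. It also treats each B record as one contiguous interval, so spliced BAM alignments would need extra handling.

```python
# (chrom, start, stop, name, strand) records copied from A.bed and B.bed above.
A = [("chr1", 0, 100, "b1", "+"),
     ("chr1", 100, 200, "b2", "-"),
     ("chr2", 0, 100, "b3", "+")]
B = [("chr1", 10, 20, "a1", "-"),
     ("chr1", 20, 30, "a2", "-"),
     ("chr1", 30, 40, "a3", "-"),
     ("chr1", 100, 200, "a4", "+")]


def covered_fraction(feature, intervals, same_strand=False):
    """Fraction of `feature` covered by at least one interval."""
    chrom, start, stop, _name, strand = feature
    clipped = []
    for b_chrom, b_start, b_stop, _b_name, b_strand in intervals:
        if b_chrom != chrom:
            continue
        if same_strand and b_strand != strand:
            continue
        # Keep only the part of the interval that falls inside the feature.
        lo, hi = max(b_start, start), min(b_stop, stop)
        if lo < hi:
            clipped.append((lo, hi))
    clipped.sort()
    covered = 0
    cur_end = start  # right edge of the region counted so far
    for lo, hi in clipped:
        if hi > cur_end:
            covered += hi - max(lo, cur_end)
            cur_end = hi
    return covered / (stop - start)


for feat in A:
    print(feat[3],
          f"{covered_fraction(feat, B):.7f}",
          f"{covered_fraction(feat, B, same_strand=True):.7f}")
```

Because the intervals are sorted and merged on the fly, overlapping reads are never counted twice, which is what separates breadth from depth.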