bedtools2
Differences in depth/mean coverage calculations among sambamba, bedtools, and mosdepth on high-coverage panel data
Hello, Aaron and everyone - Thanks for the great tools you guys have created!
We have some high-coverage panel sequencing data, but checking the depth of the target regions with mosdepth, bedtools, and sambamba gives quite a range of results (all commands were run through a Snakemake file).
All three tools were run with their default settings. What might cause such a huge difference in the depth calculations? Does bedtools coverage -mean apply any read filtering, such as duplicate-read filtering?
Thanks in advance!
sambamba:
"sambamba depth region -L bed/study_genes.bed {input} > coverage/study_sambamba/interval_coverage/{wildcards.sample}_interval_coverage.txt"
# chrom chromStart chromEnd F3 readCount meanCoverage sampleName
1 36349022 36349047 NM_001317122_cds_0_0_chr1_36349023_f 502 400 Sample-10
1 36354027 36354211 NM_001317122_cds_1_0_chr1_36354028_f 958 372.63 Sample-10
1 36358157 36358278 NM_001317122_cds_2_0_chr1_36358158_f 593 309.859 Sample-10
1 36358697 36358879 NM_001317122_cds_3_0_chr1_36358698_f 777 323.709 Sample-10
1 36359274 36359411 NM_001317122_cds_4_0_chr1_36359275_f 834 374.27 Sample-10
1 36359637 36359772 NM_001317122_cds_5_0_chr1_36359638_f 686 315.548 Sample-10
1 36359915 36360003 NM_001317122_cds_6_0_chr1_36359916_f 669 357.909
...
bedtools:
"bedtools coverage -mean -a bed/study_genes.bed -b {input} > coverage/study/interval_coverage/{wildcards.sample}_interval_coverage.txt"
chr start end gene coverage sample
1 36349022 36349047 NM_001317122_cds_0_0_chr1_36349023_f 43840.9609375 Sample-10
1 36354027 36354211 NM_001317122_cds_1_0_chr1_36354028_f 43905.3320312 Sample-10
1 36358157 36358278 NM_001317122_cds_2_0_chr1_36358158_f 26675.0253906 Sample-10
1 36358697 36358879 NM_001317122_cds_3_0_chr1_36358698_f 32416.6210938 Sample-10
1 36359274 36359411 NM_001317122_cds_4_0_chr1_36359275_f 54923.9648438 Sample-10
1 36359637 36359772 NM_001317122_cds_5_0_chr1_36359638_f 35807.59375 Sample-10
1 36359915 36360003 NM_001317122_cds_6_0_chr1_36359916_f 29420.7265625 Sample-10
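For reference, the bedtools documentation describes -mean as the mean depth over all positions in each A feature. Below is a minimal Python sketch of that arithmetic. The function name `mean_depth` and the toy reads are hypothetical and are not taken from this data set. It only illustrates how counting or skipping duplicate-flagged reads changes a per-interval mean, and does not claim to reproduce how any of the three tools is implemented.

```python
def mean_depth(region_start, region_end, reads, skip_duplicates=False):
    """Mean per-base depth over a 0-based, half-open region [start, end).

    `reads` is a list of (start, end, is_duplicate) tuples, also half-open.
    This is an illustrative helper, not part of bedtools or any other tool.
    """
    length = region_end - region_start
    depth = [0] * length
    for start, end, is_duplicate in reads:
        if skip_duplicates and is_duplicate:
            continue
        # Clip the read to the region before counting covered positions.
        lo = max(start, region_start)
        hi = min(end, region_end)
        for pos in range(lo, hi):
            depth[pos - region_start] += 1
    return sum(depth) / length


# Hypothetical toy data: a 10 bp region and five reads, two flagged as duplicates.
region = (100, 110)
reads = [
    (95, 105, False),
    (100, 110, False),
    (100, 110, True),
    (100, 110, True),
    (108, 120, False),
]

all_reads = mean_depth(*region, reads)
no_dups = mean_depth(*region, reads, skip_duplicates=True)
print(f"all reads: {all_reads:.1f}, duplicates skipped: {no_dups:.1f}")
```

With these toy reads the mean drops from 3.7 to 1.7 once the two duplicate-flagged reads are skipped, so a tool that counts every alignment and a tool that filters some of them can report very different means for the same interval.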
mosdepth:
mosdepth -n --by bed/study_genes.bed coverage/study_mosdepth/interval_coverage/{wildcards.sample}-interval {input}
gzip -dc coverage/study_mosdepth/interval_coverage/{wildcards.sample}-interval.regions.bed.gz > {output.interval_coverage}
chr start end gene coverage sample
1 36349022 36349047 NM_001317122_cds_0_0_chr1_36349023_f 289.08 Sample-10
1 36349022 36349047 NM_012199_cds_0_0_chr1_36349023_f 289.08 Sample-10
1 36354027 36354211 NM_001317122_cds_1_0_chr1_36354028_f 287.42 Sample-10
1 36354027 36354211 NM_012199_cds_1_0_chr1_36354028_f 287.42 Sample-10
1 36358157 36358278 NM_001317122_cds_2_0_chr1_36358158_f 277.69 Sample-10
1 36358157 36358278 NM_012199_cds_2_0_chr1_36358158_f 277.69 Sample-10
1 36358173 36358278 NM_001317123_cds_2_0_chr1_36358174_f 278.95 Sample-10
1 36358697 36358879 NM_001317122_cds_3_0_chr1_36358698_f 283.55 Sample-10
...
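To line the three outputs up per interval, here is a small Python sketch that joins them on chrom/start/end/name. The rows are copied from the first four NM_001317122 intervals above. The helper `load_means` and the column indices are assumptions based on the column layouts shown in this post, not part of any of the tools.

```python
def load_means(text, mean_col):
    """Parse whitespace-delimited rows into {(chrom, start, end, name): mean}."""
    means = {}
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip comment and header lines (their second field is not a coordinate).
        if line.startswith("#") or not fields[1].isdigit():
            continue
        key = (fields[0], int(fields[1]), int(fields[2]), fields[3])
        means[key] = float(fields[mean_col])
    return means


SAMBAMBA = """
# chrom chromStart chromEnd F3 readCount meanCoverage sampleName
1 36349022 36349047 NM_001317122_cds_0_0_chr1_36349023_f 502 400 Sample-10
1 36354027 36354211 NM_001317122_cds_1_0_chr1_36354028_f 958 372.63 Sample-10
1 36358157 36358278 NM_001317122_cds_2_0_chr1_36358158_f 593 309.859 Sample-10
1 36358697 36358879 NM_001317122_cds_3_0_chr1_36358698_f 777 323.709 Sample-10
"""

BEDTOOLS = """
1 36349022 36349047 NM_001317122_cds_0_0_chr1_36349023_f 43840.9609375 Sample-10
1 36354027 36354211 NM_001317122_cds_1_0_chr1_36354028_f 43905.3320312 Sample-10
1 36358157 36358278 NM_001317122_cds_2_0_chr1_36358158_f 26675.0253906 Sample-10
1 36358697 36358879 NM_001317122_cds_3_0_chr1_36358698_f 32416.6210938 Sample-10
"""

MOSDEPTH = """
1 36349022 36349047 NM_001317122_cds_0_0_chr1_36349023_f 289.08 Sample-10
1 36354027 36354211 NM_001317122_cds_1_0_chr1_36354028_f 287.42 Sample-10
1 36358157 36358278 NM_001317122_cds_2_0_chr1_36358158_f 277.69 Sample-10
1 36358697 36358879 NM_001317122_cds_3_0_chr1_36358698_f 283.55 Sample-10
"""

# Mean coverage is column 5 (0-based) for sambamba and column 4 for the others.
sambamba = load_means(SAMBAMBA, mean_col=5)
bedtools = load_means(BEDTOOLS, mean_col=4)
mosdepth = load_means(MOSDEPTH, mean_col=4)

# Only compare intervals that all three tools reported.
shared = sorted(set(sambamba) & set(bedtools) & set(mosdepth))
ratios = {}
for key in shared:
    ratios[key] = bedtools[key] / mosdepth[key]
    print(key[3], sambamba[key], bedtools[key], mosdepth[key],
          f"bedtools/mosdepth = {ratios[key]:.1f}x")
```

On these four intervals the bedtools mean is roughly two orders of magnitude above the mosdepth mean, while sambamba and mosdepth differ by well under a factor of two.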