dipcall
Interpretation of bed files
$ cat prefix.dip.bed | awk 'BEGIN{SUM=0}{ SUM+=$3-$2 }END{print SUM}'
2823519412
$ cat prefix.hap1.bed | awk 'BEGIN{SUM=0}{ SUM+=$3-$2 }END{print SUM}'
2690214366
$ cat prefix.hap2.bed | awk 'BEGIN{SUM=0}{ SUM+=$3-$2 }END{print SUM}'
2817991873
As per the documentation: "The prefix.dip.bed file gives the confident regions. A base is included in the BED if 1) it is covered by one >=50kb alignment with mapQ>=5 from each parent and 2) it is not covered by other >=10kb alignments in each parent." Based on this, shouldn't the total interval length in prefix.dip.bed be lower than in prefix.hap1.bed and prefix.hap2.bed, i.e., shouldn't prefix.dip.bed be the intersection of the two haplotype-specific BED files? Yet the prefix.dip.bed total above is larger than both haplotype totals.
Could you please explain the relationship between these three BED files?
This is probably caused by the sex chromosomes. chrX and chrY are handled differently.
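One way to check this is to break the totals down by chromosome instead of summing over the whole file. The sketch below assumes the three files are standard tab-separated BED with the chromosome name in column 1; `bed_len_by_chrom` is a hypothetical helper name, not part of dipcall. If the sex chromosomes are the cause, one would expect the excess in prefix.dip.bed to be confined to chrX/chrY, with the autosomal totals no larger than those in either haplotype file.

```shell
# Hypothetical helper (not part of dipcall): sum BED interval lengths per chromosome.
# %.0f avoids scientific notation for large totals in some awk implementations.
bed_len_by_chrom() {
  awk '{ len[$1] += $3 - $2 }
       END { for (c in len) printf "%s\t%.0f\n", c, len[c] }' "$1" | sort -k1,1
}

# File names follow the question above; replace "prefix" with your own output prefix.
for f in prefix.dip.bed prefix.hap1.bed prefix.hap2.bed; do
  if [ -f "$f" ]; then
    echo "== $f"
    bed_len_by_chrom "$f"
  fi
done
```

Comparing the chrX and chrY rows across the three outputs should show whether those chromosomes account for the difference.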