edmr
edmr copied to clipboard
edmr does not display neither the correct number of CpGs inside region nor the correct methylation status (hyper/hypo)
Hello, I am using edmr right after MethylKit but the DMRs I get, correspond to a region where there are more CpGs than displayed in "num.CpGs" column. The code I run is the following:
dmp_rrbs <- calculateDiffMeth(methobj) #plus some extra arguments
myDiff <- getData(dmp_rrbs)
dmrs <- edmr(myDiff, mode=2)
dmrs_data <- data.frame(dmrs)
From the first few row of the table I get this:
seqnames start end width strand mean.meth.diff num.CpGs num.DMCs
1 chr11 18659110 18659225 116 * 21.96973 4 1
2 chr16 84941109 84941696 588 * 27.48481 7 4
3 chr21 45417327 45417691 365 * 21.00500 3 1
but when I check the original dmp_rrbs object for the region in chromosome 11 I get more CpGs:
chr start end strand pvalue qvalue meth.diff
1048864 chr11 18659110 18659110 * 7.946860e-01 9.749628e-01 0.08623922
1048865 chr11 18659111 18659111 * 7.031369e-04 6.705592e-02 6.79611650
1048866 chr11 18659133 18659133 * 5.366955e-01 9.025498e-01 -1.72103487
1048867 chr11 18659134 18659134 * 6.658160e-01 9.433952e-01 0.82227066
1048868 chr11 18659135 18659135 * 3.440149e-01 8.120775e-01 2.93494563
1048869 chr11 18659136 18659136 * 6.894213e-01 9.496794e-01 -1.05508223
1048870 chr11 18659156 18659156 * 7.604073e-01 9.673050e-01 -1.36898355
1048871 chr11 18659157 18659157 * 5.058999e-01 8.909928e-01 0.50603119
1048872 chr11 18659199 18659199 * 9.590265e-01 1.000000e+00 0.42335411
1048873 chr11 18659200 18659200 * 2.459876e-03 1.216149e-01 9.16385972
1048874 chr11 18659211 18659211 * 7.247924e-04 6.804295e-02 56.07736533
1048875 chr11 18659212 18659212 * 6.300830e-01 9.331640e-01 19.87319200
1048876 chr11 18659217 18659217 * 4.286156e-01 8.574166e-01 1.47169104
1048877 chr11 18659218 18659218 * 1.617051e-01 6.547773e-01 13.41207086
1048878 chr11 18659224 18659224 * 2.414317e-01 7.379108e-01 6.03896104
1048879 chr11 18659225 18659225 * 1.377561e-15 5.878009e-11 15.84158416
and they are not the same type of differentially methylated (hyper or hypo) as I had requested with the "type=2" argument. Shouldn't there be only 4 CpGs one of which being differentially methylated in that region (i.e. the meth.diff should be either negative or positive for all)?
Can you help me out please? Am I missing something or is it a bug of the tool? I am using R 3.6.0 and hg38 genome for RRBS samples.
Thank you in advance
The mode = 2 means you are using the CpGs with methylation change in the same direction to call DMRs, which take the last column “meth.diff” value into account.
The reason you see fewer CpGs in the DMRs output is because:
- eDMR.sub() by default applies fuzzypval = 0.1, which means that only CpG sites with pvalue <0.1 were considered for DMR calling.
-
- Mode = 2 considers CpGs with meth.diff > 0 and meth.diff < 0 separately.
I updated edmr code to pass the fuzzypval value to the eDMR.sub() so that your analysis will not filter CpGs by pvalue < 0.1. Could you give a try again? Thanks!
Hello,
I updated the edmr package and confirmed that both hyper.myDMR() and hypo.myDMR() variables have the fuzzypval = fuzzypval
extra argument. Nevertheless the problem remains. For example when I run:
dmp_rrbs <- calculateDiffMeth(methobj) #plus some extra arguments
myDiff <- getData(dmp_rrbs)
dmrs <- edmr(myDiff, mode=2)
dmrs_data <- data.frame(dmrs)
head(dmrs_data)
I get the following:
seqnames start end width strand mean.meth.diff num.CpGs num.DMCs
1 chr1 1117516 1117655 140 * 33.60520 14 8
2 chr1 1736319 1736620 302 * 26.74212 11 7
3 chr1 1831467 1831497 31 * 32.87444 5 3
4 chr1 1901833 1902126 294 * 20.58521 8 4
5 chr1 2138975 2139281 307 * 26.95013 8 4
6 chr1 2488928 2488985 58 * 27.96966 6 4
Then if I search for the 3rd dmr:
keep <- which(dmp_rrbs$chr=="chr1" & dmp_rrbs$start>=1831467 & dmp_rrbs$end<=1831497)
data.frame(dmp_rrbs[keep,])
I get the following:
--------------
chr start end strand pvalue qvalue meth.diff
20484 chr1 1831468 1831468 + 3.861869e-03 3.173452e-02 0.502854
20485 chr1 1831485 1831485 + 9.891393e-06 1.303714e-04 -36.076610
20486 chr1 1831486 1831486 - 1.015186e-24 8.679493e-23 57.931034
20487 chr1 1831490 1831490 + 2.930524e-22 2.024816e-20 15.770796
20488 chr1 1831491 1831491 - 1.412065e-04 1.559009e-03 25.378788
20489 chr1 1831496 1831496 + 1.242588e-13 3.871686e-12 -28.672427
20490 chr1 1831497 1831497 - 4.253546e-26 4.080455e-24 64.788732
As you see I still get more CpGs and they are still not the same type of differentially methylated (hyper or hypo) as I had requested with the "mode=2" argument.
What am I missing here?
Thank you in advance.