clusterProfiler
Different gseKEGG results for the same input data
![image](https://user-images.githubusercontent.com/67772986/168752492-2eb894ef-abeb-4d8c-9907-f57d03f9a775.png)
My clusterProfiler version is 4.4.1 and my R version is 4.2.0.
Although it is not that easy to find, this has been asked and answered before; see: https://github.com/YuLab-SMU/DOSE/issues/45
Specifically for your case:
First of all, you should use `set.seed()`, not `make.seed()`!
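As a minimal base-R sketch of what `set.seed()` does (independent of clusterProfiler; the variable names `draw1` and `draw2` are just for illustration): resetting the seed to the same value before each random draw gives the same numbers. The `exists()` call is a quick way to check whether `make.seed` is defined in your session; in a plain R session it should not be.

```r
# Is there any function called make.seed in this session?
# (Expected to be FALSE unless an attached package happens to define one.)
print(exists("make.seed"))

# Same seed before each draw -> same random numbers
set.seed(1234)
draw1 <- runif(5)

set.seed(1234)
draw2 <- runif(5)

print(identical(draw1, draw2))  # TRUE
```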
Also note that the arguments of `gseKEGG` have changed and now differ from what you show above. See `?gseKEGG`.
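Besides the help page, one quick way to see which arguments your installed version accepts is to list the function's formals. This is only a sketch: it assumes clusterProfiler is installed, and the variable name `gsekegg_args` is made up for the example.

```r
# List the formal arguments of gseKEGG in the installed clusterProfiler version.
# The guard keeps the snippet from failing if the package is not installed.
if (requireNamespace("clusterProfiler", quietly = TRUE)) {
  gsekegg_args <- names(formals(clusterProfiler::gseKEGG))
  print(gsekegg_args)
  # Check that the 'seed' argument discussed in this thread is available
  print("seed" %in% gsekegg_args)
}
```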
Having said this, to obtain identical results between runs you also have to include the argument `seed = TRUE` when calling `gseKEGG`. If you do so, you will get identical results, as illustrated by, for example, the identical p-values obtained in both runs:
> set.seed(1234)
>
> library(clusterProfiler)
> library(org.Hs.eg.db)
>
> data(geneList, package="DOSE")
>
> run1 <- gseKEGG(geneList = geneList,
+ organism = 'hsa',
+ eps = 0.0,
+ minGSSize = 10,
+ maxGSSize = 500,
+ pAdjustMethod = "none",
+ pvalueCutoff = 1,
+ verbose = FALSE,
+ seed = TRUE)
>
> run2 <- gseKEGG(geneList = geneList,
+ organism = 'hsa',
+ eps = 0.0,
+ minGSSize = 10,
+ maxGSSize = 500,
+ pAdjustMethod = "none",
+ pvalueCutoff = 1,
+ verbose = FALSE,
+ seed = TRUE)
>
> merged.res <- merge( as.data.frame(run1), as.data.frame(run2), by.x="ID", by.y="ID")
> plot( merged.res$pvalue.x, merged.res$pvalue.y )
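![image](https://user-images.githubusercontent.com/22979991/169538101-1ed128a8-953a-4083-abc6-84483007ea7f.png)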
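The `plot()` call above compares the two runs visually. If you prefer a numeric check, something along these lines should work. Note that `same_pvalues()` is a hypothetical helper written for this example, not a clusterProfiler function, and the toy data frames only stand in for `as.data.frame(run1)` and `as.data.frame(run2)`.

```r
# Hypothetical helper (not part of clusterProfiler): TRUE if two result tables
# contain the same gene set IDs with exactly the same p-values.
same_pvalues <- function(res1, res2) {
  merged <- merge(res1[, c("ID", "pvalue")], res2[, c("ID", "pvalue")],
                  by = "ID", suffixes = c(".x", ".y"))
  # Every ID must be present in both tables, and p-values must match exactly
  nrow(merged) == nrow(res1) && nrow(merged) == nrow(res2) &&
    identical(merged$pvalue.x, merged$pvalue.y)
}

# Toy tables standing in for as.data.frame(run1) / as.data.frame(run2)
toy1 <- data.frame(ID = c("hsa00010", "hsa00020", "hsa00030"),
                   pvalue = c(0.01, 0.20, 0.50))
toy2 <- toy1[c(3, 1, 2), ]                                # same values, rows reordered
toy3 <- transform(toy1, pvalue = pvalue + c(0, 1e-4, 0))  # one p-value changed

print(same_pvalues(toy1, toy2))  # TRUE
print(same_pvalues(toy1, toy3))  # FALSE

# With the objects from the session above:
# same_pvalues(as.data.frame(run1), as.data.frame(run2))
```

Merging by `ID` first means the comparison does not depend on the row order of the two result tables.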
Lastly, next time please copy/paste your code instead of posting a screenshot. Including a reproducible example will also make life much easier!
Thanks for your detailed guidance. I have just solved my problem. Much appreciated 😀😀😀
On 20 May 2022, at 21:33, Guido Hooiveld wrote: [full quote of the answer above]