
Different gseKEGG results for the same input data

Open jinshengheshi opened this issue 2 years ago • 3 comments

[image] Hello, I have run into a problem: I get different gseKEGG results each time I run the same code on the same data. I tried setting 'nPerm = 10000', but it made no difference. I also tried 'make.seed()', which still fails, and I am out of ideas. How can I get the same result from gseKEGG on every run? Thanks in advance.

jinshengheshi avatar May 17 '22 07:05 jinshengheshi

My clusterProfiler version is 4.4.1 and my R version is 4.2.0.

jinshengheshi avatar May 17 '22 07:05 jinshengheshi

Although it is not that easy to find, this has been asked and answered before; see: https://github.com/YuLab-SMU/DOSE/issues/45

Specifically for your case:

First of all, you should use set.seed(), not make.seed()! Also note that the arguments that the function gseKEGG uses have been changed, and are different now compared to what you show above. See ?gseKEGG.
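As a minimal base-R illustration of what set.seed() does (this sketch is independent of clusterProfiler, and the variable names are made up for the example): resetting the seed before drawing random numbers reproduces the same draws, while drawing again without resetting does not.

```r
# Base R only; no clusterProfiler needed for this illustration.
set.seed(1234)
first <- runif(5)

set.seed(1234)       # reset the generator to the same state
second <- runif(5)

third <- runif(5)    # drawn without resetting the seed

identical(first, second)  # TRUE: same seed, same draws
identical(first, third)   # these draws come from a later generator state
```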

Having said this, to obtain identical results between runs you also have to include the argument seed = TRUE when calling gseKEGG. If you do so you will get identical results, as illustrated by e.g. identical p-values obtained in both runs:

> set.seed(1234)
> 
> library(clusterProfiler)
> library(org.Hs.eg.db)
>  
> data(geneList, package="DOSE")
>  
> run1 <- gseKEGG(geneList     = geneList,
+                organism      = 'hsa',
+                eps           = 0.0,
+                minGSSize     = 10,
+                maxGSSize     = 500,
+                pAdjustMethod = "none",
+                pvalueCutoff  = 1,
+                verbose       = FALSE,
+                seed          = TRUE)
>  
> run2 <- gseKEGG(geneList     = geneList,
+                organism      = 'hsa',
+                eps           = 0.0,
+                minGSSize     = 10,
+                maxGSSize     = 500,
+                pAdjustMethod = "none",
+                pvalueCutoff  = 1,
+                verbose       = FALSE,
+                seed          = TRUE)
> 
> merged.res <- merge( as.data.frame(run1), as.data.frame(run2), by.x="ID", by.y="ID")
> plot(  merged.res$pvalue.x, merged.res$pvalue.y ) 
> 
>

[image: scatter plot of merged.res$pvalue.x against merged.res$pvalue.y]
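If you prefer a numeric check over a plot, something along these lines should work. This is only a sketch: the two small data frames are made-up stand-ins for as.data.frame(run1) and as.data.frame(run2), so that the snippet runs without downloading KEGG data.

```r
# Toy stand-ins for as.data.frame(run1) and as.data.frame(run2).
# The IDs and p-values are invented for illustration only.
res1 <- data.frame(ID     = c("hsa00010", "hsa00020", "hsa00030"),
                   pvalue = c(0.01, 0.20, 0.50))
res2 <- data.frame(ID     = c("hsa00030", "hsa00010", "hsa00020"),
                   pvalue = c(0.50, 0.01, 0.20))

# Merge on ID so that row order does not matter; the shared
# column 'pvalue' becomes 'pvalue.x' and 'pvalue.y'.
merged <- merge(res1, res2, by = "ID")

# Exact comparison, and a tolerance-based one for floating-point values.
same_exact <- identical(merged$pvalue.x, merged$pvalue.y)
same_close <- isTRUE(all.equal(merged$pvalue.x, merged$pvalue.y))
print(c(exact = same_exact, close = same_close))
```

With real results, replace res1 and res2 by as.data.frame(run1) and as.data.frame(run2).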

Lastly, next time please copy/paste your code instead of posting a screenshot. Including a reproducible example will also make life much easier!

guidohooiveld avatar May 20 '22 13:05 guidohooiveld

Thanks for your detailed guidance. I have just solved my problem. Much appreciated 😀😀😀

jinshengheshi avatar May 29 '22 15:05 jinshengheshi