clusterProfiler
enrichKEGG Bug
Dear Prof. Guangchuang Yu,
I am an avid user of your package and I want to express my sincere appreciation for all the work and effort you put into it. I particularly like your creativity when it comes to visualisation and the ease of use of your packages. So thank you very much for that!
The enrichKEGG function does not work on my system. Based on my debugging, the problem seems to involve the USER_DATA object.
Example:
data(geneList, package='DOSE')
de <- names(geneList)[1:100]
yy <- enrichKEGG(de, pvalueCutoff=0.01)
head(yy)
It throws me the following error:
You can try pvalueCutoff=1.
You should provide more information on your R/Bioconductor installation! Are you sure it is up to date, that is, using R-4.3.x and Bioconductor 3.18? There have been changes in the KEGG API over the last year, and this may explain why it no longer works for you. It does work for me with the current versions of R/Bioconductor:
> library(clusterProfiler)
> data(geneList, package='DOSE')
> de <- names(geneList)[1:100]
> yy <- enrichKEGG(de, pvalueCutoff=0.01)
Reading KEGG annotation online: "https://rest.kegg.jp/link/hsa/pathway"...
Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/hsa"...
> head(yy)
category subcategory ID
hsa04110 Cellular Processes Cell growth and death hsa04110
hsa04218 Cellular Processes Cell growth and death hsa04218
hsa04114 Cellular Processes Cell growth and death hsa04114
hsa04814 Cellular Processes Cell motility hsa04814
hsa04657 Organismal Systems Immune system hsa04657
Description GeneRatio BgRatio pvalue p.adjust
hsa04110 Cell cycle 12/58 157/8644 3.667200e-10 4.547329e-08
hsa04218 Cellular senescence 7/58 156/8644 7.570813e-05 4.693904e-03
hsa04114 Oocyte meiosis 6/58 131/8644 2.292076e-04 8.823322e-03
hsa04814 Motor proteins 7/58 193/8644 2.846233e-04 8.823322e-03
hsa04657 IL-17 signaling pathway 5/58 94/8644 3.972218e-04 9.851100e-03
qvalue
hsa04110 4.207630e-08
hsa04218 4.343256e-03
hsa04114 8.164194e-03
hsa04814 8.164194e-03
hsa04657 9.115195e-03
geneID Count
hsa04110 8318/991/9133/10403/890/983/4085/81620/7272/9212/1111/9319 12
hsa04218 2305/4605/9133/890/983/51806/1111 7
hsa04114 991/9133/983/4085/51806/6790 6
hsa04814 9493/1062/81930/3832/3833/146909/10112 7
hsa04657 4312/6280/6279/6278/3627 5
>
> packageVersion("clusterProfiler")
[1] ‘4.10.0’
> BiocManager::version()
[1] ‘3.18’
> R.Version()$version.string
[1] "R version 4.3.0 (2023-04-21 ucrt)"
>
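A minimal sketch for collecting the same version information in one call, which makes it easier to paste into a reply. It uses only base R; the helper name `report_versions` and its default package list are assumptions made for illustration, not part of clusterProfiler.

```r
# report_versions() is a hypothetical helper, not part of any package.
# It returns the R version string plus the installed version of each package,
# or NA for packages that are not installed.
report_versions <- function(pkgs = c("clusterProfiler", "DOSE", "BiocManager")) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  versions <- vapply(pkgs, function(p) {
    if (installed[[p]]) as.character(utils::packageVersion(p)) else NA_character_
  }, character(1))
  c(R = R.version.string, versions)
}

print(report_versions())
```

Running this in a fresh session, before loading anything else, avoids reporting versions from a stale library path.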
Thanks for your answers. Setting a higher p-value threshold still yields no enriched KEGG terms, while enrichGO works normally.
> packageVersion("clusterProfiler")
[1] ‘4.4.4’
> BiocManager::version()
[1] ‘3.15’
> R.Version()$version.string
[1] "R version 4.2.3 (2023-03-15 ucrt)"
As said, AFAIK there were some issues with connecting to the KEGG API recently (a couple of months ago). These have been addressed, so I strongly recommend you update your R/Bioconductor/clusterProfiler installations to the latest versions.
Based on the behavior you experience (GO analysis works, KEGG does not), the problem seems specific to KEGG, and it can only be in the step in which the gene sets are retrieved (because under the hood enrichGO and enrichKEGG converge to the same internal function).
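One way to test the gene-set retrieval step in isolation is to query the KEGG REST endpoint shown in the console output directly from base R, bypassing clusterProfiler. This is only a sketch: `parse_kegg_link` and `fetch_kegg_link` are hypothetical helpers, and the two-column, tab-separated response format is an assumption about what the endpoint returns.

```r
# Hypothetical helpers, not part of clusterProfiler.
# parse_kegg_link() turns tab-separated "from<TAB>to" lines into a data frame,
# dropping empty lines and lines that do not have exactly two fields.
parse_kegg_link <- function(lines) {
  lines <- lines[nzchar(lines)]
  if (length(lines) == 0) {
    return(data.frame(from = character(0), to = character(0)))
  }
  parts <- strsplit(lines, "\t", fixed = TRUE)
  parts <- parts[lengths(parts) == 2]
  data.frame(
    from = vapply(parts, `[`, character(1), 1),
    to   = vapply(parts, `[`, character(1), 2),
    stringsAsFactors = FALSE
  )
}

# fetch_kegg_link() reads the URL and returns an empty table if the
# connection fails, printing the reason as a message.
fetch_kegg_link <- function(url = "https://rest.kegg.jp/link/hsa/pathway") {
  on_failure <- function(e) {
    message("Could not read ", url, ": ", conditionMessage(e))
    character(0)
  }
  lines <- tryCatch(readLines(url, warn = FALSE),
                    error = on_failure, warning = on_failure)
  parse_kegg_link(lines)
}
```

Calling `fetch_kegg_link()` in the affected session and checking `nrow()` of the result gives a quick signal: zero rows together with a printed message would point to a connection or download problem rather than to the enrichment code itself.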
I did some updating:
> packageVersion("clusterProfiler")
[1] ‘4.10.0’
> BiocManager::version()
[1] ‘3.18’
> R.Version()$version.string
[1] "R version 4.3.2 (2023-10-31 ucrt)"
However, the problem persists. From debugging, it seems that something goes wrong when building the KEGG_DATA object. The path2gene that goes into build_anno appears to be empty, but the path2name parameter looks legit.
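The observation that path2gene is empty while path2name is populated can be checked without stepping through the debugger. The sketch below assumes that the exported `download_KEGG()` returns a list of data frames (one pathway-to-gene table and one pathway-to-name table); `summarise_kegg_tables` and `inspect_download_kegg` are hypothetical helpers.

```r
# summarise_kegg_tables() is a hypothetical helper. It assumes `kegg` is a
# list of data frames and reports the number of rows in each element,
# counting a NULL element as 0 rows.
summarise_kegg_tables <- function(kegg) {
  vapply(kegg, function(tbl) {
    if (is.null(tbl)) 0L else NROW(tbl)
  }, integer(1))
}

# inspect_download_kegg() is also hypothetical. It calls download_KEGG()
# only if clusterProfiler is installed, and returns NULL on any error.
inspect_download_kegg <- function(species = "hsa") {
  if (!requireNamespace("clusterProfiler", quietly = TRUE)) {
    message("clusterProfiler is not installed")
    return(NULL)
  }
  kegg <- tryCatch(
    clusterProfiler::download_KEGG(species),
    error = function(e) {
      message("download_KEGG failed: ", conditionMessage(e))
      NULL
    }
  )
  if (is.null(kegg)) NULL else summarise_kegg_tables(kegg)
}
```

If `inspect_download_kegg()` reports 0 rows for the pathway-to-gene element but a non-zero count for the names, that matches the empty path2gene seen in the debugger and narrows the problem to the "link" request.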
This workaround fixed it for me: (https://github.com/YuLab-SMU/clusterProfiler/issues/561#issuecomment-1467266614)
Nice to hear it is working for you now! But the KEGG_DATA object would normally not be required, except if there are problems connecting to the online KEGG site/database.
Could you therefore, in a fresh session of R, run the code in the 3rd post, and paste the full code and output here?
Still the same error. I don't understand how you get the message "Reading KEGG annotation online:". It should come from the function clusterProfiler:::kegg_rest. Where does the call happen? I don't see it in download_KEGG.
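One way to find where a function such as kegg_rest is called is to search the deparsed source of every function in the package namespace. The sketch below uses base R only; `find_callers` is a hypothetical helper, and what it reports depends on the installed clusterProfiler version.

```r
# find_callers() is a hypothetical helper. It returns the names of all
# functions in a package namespace whose deparsed source contains `pattern`
# as a fixed string (no regular expression matching).
find_callers <- function(pkg, pattern) {
  ns <- asNamespace(pkg)
  nms <- ls(ns, all.names = TRUE)
  hits <- vapply(nms, function(nm) {
    obj <- get(nm, envir = ns, inherits = FALSE)
    is.function(obj) && any(grepl(pattern, deparse(obj), fixed = TRUE))
  }, logical(1))
  nms[hits]
}

# Example on a base package: which functions in 'stats' contain "sqrt(var("?
print(find_callers("stats", "sqrt(var("))
```

With clusterProfiler installed, `find_callers("clusterProfiler", "kegg_rest")` would list the functions whose source mentions kegg_rest, including indirect paths from download_KEGG through other internal helpers, if any exist in that version.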