clusterProfiler
No gene can be mapped when using enrichKEGG
I am using the enrichKEGG function, but my code, which was working last week, no longer works. The example in the vignette no longer works either:
data(geneList, package="DOSE") # Vignette example
gene_names <- names(geneList)[abs(geneList) > 2]
kegg_enrich <- enrichKEGG(gene = gene_names, organism = "hsa", pvalueCutoff = 0.05, qvalueCutoff = 0.2)
--> No gene can be mapped....
--> Expected input gene ID:
--> return NULL...
I can confirm, I am experiencing the same issue with a script that I haven't touched for 9 months. The example in the help file for enrichKEGG is returning the same error.
> data(geneList, package='DOSE')
> de <- names(geneList)[1:100]
> yy <- enrichKEGG(de, pvalueCutoff=0.01)
--> No gene can be mapped....
--> Expected input gene ID:
--> return NULL...
> head(yy)
NULL
Session Information:
> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Ventura 13.1
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggfortify_0.4.15 qvalue_2.28.0 pheatmap_1.0.12 ggVennDiagram_1.2.2 ROntoTools_2.24.0
[6] Rgraphviz_2.40.0 KEGGgraph_1.56.0 KEGGREST_1.36.3 boot_1.3-28.1 graph_1.74.0
[11] BiocGenerics_0.42.0 clusterProfiler_4.4.4 venn_1.11 ggplot2_3.4.0 openxlsx_4.2.5.1
[16] dplyr_1.0.10 tibble_3.1.8 tidyr_1.2.1 edgeR_3.38.4 limma_3.52.4
loaded via a namespace (and not attached):
[1] fgsea_1.22.0 colorspace_2.0-3 ggtree_3.4.4 XVector_0.36.0
[5] aplot_0.1.9 rstudioapi_0.14 farver_2.1.1 graphlayouts_0.8.4
[9] ggrepel_0.9.2 bit64_4.0.5 AnnotationDbi_1.58.0 fansi_1.0.3
[13] scatterpie_0.1.8 codetools_0.2-18 splines_4.2.2 cachem_1.0.6
[17] GOSemSim_2.22.0 polyclip_1.10-4 jsonlite_1.8.4 GO.db_3.15.0
[21] png_0.1-8 ggforce_0.4.1 compiler_4.2.2 httr_1.4.4
[25] assertthat_0.2.1 Matrix_1.5-3 fastmap_1.1.0 lazyeval_0.2.2
[29] cli_3.5.0 tweenr_2.0.2 admisc_0.30 tools_4.2.2
[33] igraph_1.3.5 gtable_0.3.1 glue_1.6.2 GenomeInfoDbData_1.2.8
[37] reshape2_1.4.4 DO.db_2.9 fastmatch_1.1-3 Rcpp_1.0.9
[41] enrichplot_1.16.2 Biobase_2.56.0 vctrs_0.5.1 Biostrings_2.64.1
[45] ape_5.6-2 nlme_3.1-161 ggraph_2.1.0 stringr_1.5.0
[49] lifecycle_1.0.3 XML_3.99-0.13 DOSE_3.22.1 org.Hs.eg.db_3.15.0
[53] zlibbioc_1.42.0 MASS_7.3-58.1 scales_1.2.1 tidygraph_1.2.2
[57] parallel_4.2.2 RColorBrewer_1.1-3 memoise_2.0.1 gridExtra_2.3
[61] downloader_0.4 ggfun_0.0.9 yulab.utils_0.0.6 stringi_1.7.8
[65] RSQLite_2.2.20 S4Vectors_0.34.0 tidytree_0.4.2 zip_2.2.2
[69] BiocParallel_1.30.4 GenomeInfoDb_1.32.4 rlang_1.0.6 pkgconfig_2.0.3
[73] bitops_1.0-7 lattice_0.20-45 purrr_1.0.0 treeio_1.20.2
[77] patchwork_1.1.2 shadowtext_0.1.2 bit_4.0.5 tidyselect_1.2.0
[81] plyr_1.8.8 magrittr_2.0.3 R6_2.5.1 IRanges_2.30.1
[85] generics_0.1.3 DBI_1.1.3 pillar_1.8.1 withr_2.5.0
[89] RCurl_1.98-1.9 crayon_1.5.2 utf8_1.2.2 RVenn_1.1.0
[93] viridis_0.6.2 locfit_1.5-9.6 data.table_1.14.6 blob_1.2.3
[97] digest_0.6.31 gridGraphics_0.5-1 stats4_4.2.2 munsell_0.5.0
[101] viridisLite_0.4.1 ggplotify_0.1.0
Converting the Entrez IDs does not seem to help either.
> x <- paste0("hsa:",gcSample[[1]])
> x
[1] "hsa:4597" "hsa:7111" "hsa:5266" "hsa:2175" "hsa:755" "hsa:23046"
[7] "hsa:3931" "hsa:6770" "hsa:993" ... (the rest of the KEGG IDs)
enrichKEGG(
+ x,
+ organism = "hsa",
+ keyType = "kegg",
+ pvalueCutoff = 0.05,
+ pAdjustMethod = "BH",
+ universe,
+ minGSSize = 10,
+ maxGSSize = 500,
+ qvalueCutoff = 0.2,
+ use_internal_data = FALSE
+ )
--> No gene can be mapped....
--> Expected input gene ID:
--> return NULL...
NULL
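The help-file example earlier in this thread passes bare Entrez IDs (names(geneList)) for "hsa", so the "hsa:" prefix is probably not what enrichKEGG wants here. A minimal base-R sketch for getting a mixed ID vector back to bare numeric IDs follows; normalize_entrez is a hypothetical helper, not part of clusterProfiler, and it will not fix the mapping error if the underlying KEGG download is the real problem.

```r
# Hypothetical helper (not a clusterProfiler function): strip an
# "<organism>:" prefix and keep only IDs that consist of digits.
normalize_entrez <- function(ids, organism = "hsa") {
  bare <- sub(paste0("^", organism, ":"), "", as.character(ids))
  bare[grepl("^[0-9]+$", bare)]
}

# Prefixed and bare IDs are kept as bare IDs; gene symbols are dropped.
normalize_entrez(c("hsa:4597", "7111", "TP53"))
```

The result can then be passed as the gene argument in place of the prefixed vector.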
This is kinda stinky, I need this figure for my dissertation, which is supposed to go to committee on Monday evening.
I may try and revert to a previous version.
Upgrading BioConductor to the most current version fixed this for me.
I had to reinstall R from scratch at the latest version, along with Bioconductor and every package at its latest version, and then it worked.
Same problem. Do I have to upgrade or downgrade the package to a certain version?
I updated everything and hoped for the best (that the other scripts I use will keep working). I also updated or installed every package required by the clusterProfiler installation, since I could not be sure which ones are critical.
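This is a killer. The new version needs the latest DOSE, so the only option is to upgrade all of R and then reinstall every package and sort out their compatibility again. Unbelievable.
So it looks like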
:)
I updated Bioconductor, clusterProfiler and DOSE and still get the error mentioned above:
enrichKEGG(as.character(GENES.ENTREZ$ENTREZID),
organism = "hsa",
keyType = "kegg",
pvalueCutoff = 0.05,
pAdjustMethod = "BH",
res.1$GeneSymbol,
minGSSize = 10,
maxGSSize = 500,
qvalueCutoff = 0.05,
use_internal_data = FALSE)
--> No gene can be mapped....
--> Expected input gene ID:
--> return NULL...
DOSE v3.24.2 clusterProfiler 4.6.2
I got the same issue, even though the packages are updated to the latest version.
R 4.2.2 DOSE_3.24.2 clusterProfiler_4.6.2
FYI, here is a quick solution:
- Update clusterProfiler to the latest GitHub version (the latest version is 4.7.1.3)
remotes::install_github("YuLab-SMU/clusterProfiler")
- Establish a local KEGG database
# install the package
remotes::install_github("YuLab-SMU/createKEGGdb")
# load the library and create a KEGG database locally
library(createKEGGdb)
species <- c("ath","hsa","mmu","rno","dre","dme","cel")
createKEGGdb::create_kegg_db(species)
# You will get a KEGG.db_1.0.tar.gz file in your working directory
- Install KEGG.db and load it
install.packages("KEGG.db_1.0.tar.gz", repos=NULL, type="source")
library(KEGG.db)
- Add use_internal_data=TRUE to your enrichKEGG call
data(gcSample)
yy = enrichKEGG(gcSample[[5]], pvalueCutoff=0.01, use_internal_data=TRUE)
head(summary(yy))
@dppss90008 Sweet, this works!!
The problem is the function OrganismMapper: "hsa" is already the expected input and does not need to be mapped.
Sweet, this workaround did the trick.
Do we need to update everything to make it work without the local DB?
thanks
I debugged the code step by step and located the bug:
clusterProfiler::enricher_internal
<=KEGG_DATA
<=prepare_KEGG(species, "KEGG", keyType)
<=kegg <- download_KEGG(species, KEGG_Type, keyType)
<=if (use_cached)
When a cache is detected, the program uses the cached data instead of downloading it from the website.
The simple way to solve this is to clean the cache, i.e. delete the .RData file in the working directory, which forces the program to download fresh data from the web.
I also reinstalled clusterProfiler from Bioconductor, and the bug no longer appeared.
----
System: MacOS arm, R-4.2.2,
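A minimal sketch of the cache cleanup described above, assuming (as that comment suggests) that the stale data is being restored from a saved workspace file named .RData in the working directory. clear_saved_workspace is a hypothetical helper; where clusterProfiler actually keeps its cache has not been verified here.

```r
# Hypothetical helper: report whether a saved workspace (.RData) exists in
# `dir`, and delete it when remove = TRUE.
# Assumption: the stale KEGG data comes back via this file at R startup.
clear_saved_workspace <- function(dir = getwd(), remove = FALSE) {
  path <- file.path(dir, ".RData")
  found <- file.exists(path)
  if (found && remove) {
    file.remove(path)
  }
  found  # TRUE if a saved workspace was present before the call
}

# Dry run: only reports whether a saved workspace is present.
clear_saved_workspace()
```

Call clear_saved_workspace(remove = TRUE) to actually delete the file, then restart R so that nothing from the old session stays in memory.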
You missed a step (see above); that is why it is falling back to the cache.
Hello, I tried to reinstall with remotes::install_github("YuLab-SMU/clusterProfiler")
but I got errors (copied below). Does anyone have any advice?
The downloaded source packages are in ‘/tmp/RtmpGtkzzV/downloaded_packages’
✔ checking for file ‘/tmp/RtmpGtkzzV/remotes378e41ddad11d/YuLab-SMU-clusterProfiler-127278c/DESCRIPTION’ (358ms)
─ preparing ‘clusterProfiler’:
✔ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ looking to see if a ‘data/datalist’ file should be added
─ building ‘clusterProfiler_4.7.1.003.tar.gz’
Warning in sprintf(gettext(fmt, domain = domain, trim = trim), ...) :
one argument not used by format 'invalid uid value replaced by that for user 'nobody''
Warning: invalid uid value replaced by that for user 'nobody'
Warning in sprintf(gettext(fmt, domain = domain, trim = trim), ...) :
one argument not used by format 'invalid gid value replaced by that for user 'nobody''
Warning: invalid gid value replaced by that for user 'nobody'
Installing package into ‘/home/nshen7/R/rstudio_4_2_0’ (as ‘lib’ is unspecified)
- installing source package ‘clusterProfiler’ ...
** using staged installation
** R
** data
** inst
** byte-compile and prepare package for lazy loading
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
namespace ‘DOSE’ 3.22.1 is being loaded, but >= 3.23.2 is required
Calls: <Anonymous> ... namespaceImportFrom -> asNamespace -> loadNamespace
Execution halted
ERROR: lazy loading failed for package ‘clusterProfiler’
- removing ‘/home/nshen7/R/rstudio_4_2_0/clusterProfiler’
Warning message:
In i.p(...) : installation of package ‘/tmp/RtmpGtkzzV/file378e46006fe62/clusterProfiler_4.7.1.003.tar.gz’ had non-zero exit status
The relevant line is here: namespace ‘DOSE’ 3.22.1 is being loaded, but >= 3.23.2 is required
You need to update the DOSE package (install version 3.23.2 or later).
Or try some of the previously suggested solutions.
It works for me! Thank you.
How can I update the DOSE package to version 3.23.2, please? I don't want to update my R to version 4.2.
I have the same problem with "no genes mapped" when using gseKEGG from clusterProfiler. Everything worked fine a month ago.
I have tried updating Bioconductor to the newest version, 3.16. I also tried to update both DOSE and clusterProfiler, but I am not sure whether that went well.
How can I check that?
Can someone guide me to a solution? (I am relatively new to R, so it needs to be for dummies :D )
Solved: I restarted R, after which the updates took effect and the code works again.
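For anyone else unsure whether an update went through, here is a minimal base-R sketch that compares the version installed on disk with the version loaded in the current session. has_min_version and loaded_version are hypothetical helper names; the 3.23.2 threshold is the one from the install error quoted earlier in this thread.

```r
# Hypothetical helper: TRUE if `pkg` is installed at `min_version` or newer,
# FALSE if it is older or not installed at all.
has_min_version <- function(pkg, min_version) {
  installed <- tryCatch(utils::packageVersion(pkg), error = function(e) NULL)
  !is.null(installed) && installed >= min_version
}

# Hypothetical helper: version of `pkg` loaded in this session, or NA if
# the package has not been loaded yet.
loaded_version <- function(pkg) {
  if (pkg %in% loadedNamespaces()) {
    unname(as.character(getNamespaceVersion(pkg)))
  } else {
    NA_character_
  }
}

has_min_version("DOSE", "3.23.2")  # is the copy on disk new enough?
loaded_version("DOSE")             # which copy is this session actually using?
```

If the installed version is new enough but the loaded one is older, the session is still holding the old package in memory, which fits the report above that restarting R made the updates take effect.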
Same problem, but if I use enrichMKEGG
it works. What is the difference between the two functions?
@DavideBrex Please make sure you are using the latest version of clusterProfiler and please provide your sessioninfo.
I am facing the same issue also with latest version
I am using the latest version. Thank you for the support.
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=it_IT.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=it_IT.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=it_IT.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] openxlsx_4.2.5.1 org.Mm.eg.db_3.16.0 org.Hs.eg.db_3.16.0 AnnotationDbi_1.60.0
[5] IRanges_2.32.0 S4Vectors_0.36.0 Biobase_2.58.0 BiocGenerics_0.44.0
[9] ReactomePA_1.42.0 clusterProfiler_4.6.2 forcats_0.5.2 stringr_1.4.1
[13] dplyr_1.0.10 purrr_0.3.5 readr_2.1.3 tidyr_1.2.1
[17] tibble_3.1.8 ggplot2_3.4.0 tidyverse_1.3.2
loaded via a namespace (and not attached):
[1] readxl_1.4.1 shadowtext_0.1.2 backports_1.4.1 fastmatch_1.1-3
[5] plyr_1.8.7 igraph_1.3.5 lazyeval_0.2.2 splines_4.2.2
[9] BiocParallel_1.32.1 GenomeInfoDb_1.34.2 digest_0.6.30 yulab.utils_0.0.5
[13] GOSemSim_2.24.0 viridis_0.6.2 GO.db_3.16.0 fansi_1.0.3
[17] magrittr_2.0.3 memoise_2.0.1 googlesheets4_1.0.1 tzdb_0.3.0
[21] Biostrings_2.66.0 graphlayouts_0.8.3 modelr_0.1.9 timechange_0.1.1
[25] enrichplot_1.18.0 colorspace_2.0-3 blob_1.2.3 rvest_1.0.3
[29] rappdirs_0.3.3 ggrepel_0.9.2 haven_2.5.1 crayon_1.5.2
[33] RCurl_1.98-1.9 jsonlite_1.8.3 graph_1.76.0 scatterpie_0.1.8
[37] ape_5.6-2 glue_1.6.2 polyclip_1.10-4 gtable_0.3.1
[41] gargle_1.2.1 zlibbioc_1.44.0 XVector_0.38.0 graphite_1.44.0
[45] scales_1.2.1 DOSE_3.24.1 DBI_1.1.3 Rcpp_1.0.9
[49] viridisLite_0.4.1 gridGraphics_0.5-1 tidytree_0.4.1 bit_4.0.4
[53] reactome.db_1.82.0 httr_1.4.4 fgsea_1.24.0 RColorBrewer_1.1-3
[57] ellipsis_0.3.2 pkgconfig_2.0.3 farver_2.1.1 dbplyr_2.2.1
[61] utf8_1.2.2 ggplotify_0.1.0 tidyselect_1.2.0 rlang_1.0.6
[65] reshape2_1.4.4 munsell_0.5.0 cellranger_1.1.0 tools_4.2.2
[69] cachem_1.0.6 downloader_0.4 cli_3.4.1 generics_0.1.3
[73] RSQLite_2.2.18 gson_0.0.9 broom_1.0.1 fastmap_1.1.0
[77] ggtree_3.6.1 bit64_4.0.5 fs_1.5.2 tidygraph_1.2.2
[81] zip_2.2.2 KEGGREST_1.38.0 ggraph_2.1.0 nlme_3.1-160
[85] aplot_0.1.8 xml2_1.3.3 compiler_4.2.2 rstudioapi_0.14
[89] png_0.1-7 reprex_2.0.2 treeio_1.22.0 tweenr_2.0.2
[93] stringi_1.7.8 lattice_0.20-45 Matrix_1.5-1 vctrs_0.5.0
[97] pillar_1.8.1 lifecycle_1.0.3 BiocManager_1.30.19 data.table_1.14.4
[101] cowplot_1.1.1 bitops_1.0-7 patchwork_1.1.2 qvalue_2.30.0
[105] R6_2.5.1 gridExtra_2.3 codetools_0.2-18 MASS_7.3-58.1
[109] assertthat_0.2.1 withr_2.5.0 GenomeInfoDbData_1.2.9 parallel_4.2.2
[113] hms_1.1.2 grid_4.2.2 ggfun_0.0.8 HDO.db_0.99.1
[117] googledrive_2.0.0 ggforce_0.4.1 lubridate_1.9.0
@DavideBrex From the sessioninfo, I see your DOSE is not the release version, please update it. If you still report an error, you can try using the createKEGGdb: https://github.com/YuLab-SMU/clusterProfiler/issues/561#issuecomment-1467266614
@huerqiang @dppss90008
I copied and ran @dppss90008's scripts above, but I got the error below. I believe there is something wrong with the createKEGGdb code. Can you please check it? Thanks!
library(createKEGGdb)
species <- c("ath","hsa","mmu","rno","dre","dme","cel")
createKEGGdb::create_kegg_db(species)
Error in clusterProfiler:::kegg_list("pathway", i) : unused argument (i)
@Wenjuan-ZHU createKEGGdb is OK. Please update your clusterProfiler.
@huerqiang Thanks! it works after I reload clusterProfiler.