
Support gene identifier conversion in getReactomeGeneSetDb

Open tomsing1 opened this issue 6 years ago • 2 comments

https://github.com/lianos/multiGSEA/blob/19006d1053db8de807b80faa2a967a49d0b2ab38/R/get-reactome.R#L15

It looks like the id.col argument is not used; for example, specifying ensembl as the desired featureType doesn't seem to have any effect?

tomsing1, Feb 01 '19 20:02

Indeed!

If the reactome.db has ensembl identifiers in there, this should be a straightforward fix ...

In the meantime, if you want to put some elbow grease into this, something like the below should work:

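library(multiGSEA)
library(dplyr)

# Start from the Entrez-based Reactome GeneSetDb
gdb.entrez <- getReactomeGeneSetDb(...)
gdb.ens <- gdb.entrez %>%
  as.data.frame() %>%
  # entrez2ensembl() is user-supplied; it is not part of multiGSEA
  mutate(ensembl = entrez2ensembl(featureId)) %>%
  # drop genes that have no Ensembl mapping
  filter(!is.na(ensembl)) %>%
  # keep one row per gene set and Ensembl ID, in case several
  # Entrez IDs map to the same Ensembl ID
  distinct(collection, name, ensembl, .keep_all = TRUE) %>%
  transmute(collection, name, featureId = ensembl) %>%
  GeneSetDb()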

Here you provide your favorite implementation of something like the entrez2ensembl function, which takes a vector of Entrez IDs and returns a 1:1 mapping to Ensembl IDs.
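For illustration only, one way such a helper could look is a plain lookup against a two-column table. Everything here is an assumption rather than part of multiGSEA: the second `map` argument, the `entrez` and `ensembl` column names, and the toy `id.map` table (which includes one made-up placeholder row to show how duplicates are handled). In practice the table would come from whatever annotation source you trust.

```r
# Hypothetical helper (not part of multiGSEA): map Entrez IDs to Ensembl IDs
# through a lookup table with columns `entrez` and `ensembl`.
entrez2ensembl <- function(entrez, map) {
  # Keep only the first Ensembl ID listed per Entrez ID, so every input
  # maps to at most one output.
  map <- map[!duplicated(map$entrez), , drop = FALSE]
  # match() yields NA for IDs that are missing from the table, so unmapped
  # genes come back as NA and can be filtered out afterwards.
  idx <- match(as.character(entrez), as.character(map$entrez))
  as.character(map$ensembl)[idx]
}

# Toy lookup table for illustration; the third row is a made-up placeholder
# that demonstrates the "first match wins" rule.
id.map <- data.frame(
  entrez  = c("7157", "672", "672"),
  ensembl = c("ENSG00000141510", "ENSG00000012048", "PLACEHOLDER_DUPLICATE"),
  stringsAsFactors = FALSE
)

mapped <- entrez2ensembl(c("672", "7157", "999999999"), id.map)
```

Because this sketch takes the lookup table as a second argument, the call inside the pipeline would become something like `mutate(ensembl = entrez2ensembl(featureId, id.map))`. Keeping the first Ensembl ID per Entrez ID is an arbitrary tie-breaking choice; pick whatever rule suits your analysis.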

... "simple" :-)

lianos, Feb 01 '19 20:02

Thanks a lot for the workaround!

tomsing1, Feb 01 '19 20:02