orthogene

Identifier mapping to Ensembl identifiers

Open mkutmon opened this issue 2 years ago • 4 comments

I tried to figure out how to change the output to Ensembl identifiers instead of gene symbols. I tried adding the argument numeric_ns = "ENSG", but that didn't help. Do you have a hint on how I can achieve that?

mkutmon, May 17 '22 12:05

Hi @mkutmon, which function are you trying to use? Could you provide a quick reproducible example?

bschilder, May 20 '22 20:05

I have a list of human Ensembl identifiers and would like to get the mouse Ensembl identifiers back.

mapped.data <- orthogene::convert_orthologs(gene_df = human.ids,
                                        gene_input = "GeneID", 
                                        gene_output = "columns", 
                                        input_species = "human",
                                        output_species = "mouse",
                                        non121_strategy = "kbs",
                                        method = method)

Currently, this method results in a new column "ortholog_gene" which is the mouse gene name. I would like to have the Ensembl identifier for mouse (ENSMUSG...). Is that possible?

mkutmon, May 24 '22 10:05

I can try to infer your use case from the code snippet above, but I'm afraid it is not a reproducible example (i.e. one where I can copy and paste the code into R and it reproduces the problem). You can read about how to make a reprex here. For future bug reports, I've added an Issues template to guide users. I've attached the template for you to use here as well: bugs_template.txt

bschilder, May 24 '22 11:05

Here's an example of a reprex that I think approximates your use case:

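# Retrieve all human genes; the first ten are used as example input below.
human_genes <- orthogene::all_genes(species = "human")
method <- "gprofiler2"

# Step 1: convert the human genes to their mouse orthologs.
# standardise_genes = TRUE translates the input IDs to gene symbols first.
mapped.data <- orthogene::convert_orthologs(gene_df = human_genes$target[1:10],
                                            standardise_genes = TRUE,
                                            gene_output = "columns",
                                            input_species = "human",
                                            output_species = "mouse",
                                            non121_strategy = "kbs",
                                            method = method)

# Step 2: look up identifiers for the mouse ortholog symbols.
mouse_genes <- orthogene::map_genes(genes = mapped.data$ortholog_gene,
                                    species = "mouse")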


Note standardise_genes = TRUE. This means that your input Ensembl IDs will be translated to human gene symbols first. These can then be translated to mouse gene symbols, which map_genes in turn maps back to mouse identifiers. From the docs: [screenshot]
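To make the chain of lookups concrete, here is a minimal base-R sketch that does not call orthogene at all. The three small tables are hand-typed for illustration (they are not orthogene output, and the IDs should be checked against Ensembl before being relied on); the column names are hypothetical.

```r
# Toy lookup tables, hand-written for illustration only (NOT orthogene output).
human_ids <- data.frame(
  ensembl_id   = c("ENSG00000141510", "ENSG00000012048"),
  human_symbol = c("TP53", "BRCA1"),
  stringsAsFactors = FALSE
)
orthologs <- data.frame(
  human_symbol = c("TP53", "BRCA1"),
  mouse_symbol = c("Trp53", "Brca1"),
  stringsAsFactors = FALSE
)
mouse_ids <- data.frame(
  mouse_symbol     = c("Trp53", "Brca1"),
  mouse_ensembl_id = c("ENSMUSG00000059552", "ENSMUSG00000017146"),
  stringsAsFactors = FALSE
)

# Step 1: human Ensembl ID -> human symbol -> mouse symbol.
step1 <- merge(human_ids, orthologs, by = "human_symbol")

# Step 2: mouse symbol -> mouse Ensembl ID.
chained <- merge(step1, mouse_ids, by = "mouse_symbol")

# Keep only the human and mouse Ensembl ID columns.
chained[, c("ensembl_id", "mouse_ensembl_id")]
```

Each merge stands in for one translation step; with real data, genes without a one-to-one ortholog would drop out or duplicate at step 1, depending on the chosen non121_strategy.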

That said, I think a nice feature would be to do this all in one step, and have convert_orthologs return whatever gene format is requested (not just gene symbols). I'll look into adding this feature to the next release of orthogene.
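Until such a feature exists, the two calls from the reprex could be bundled into a small helper. The sketch below is only that: human_to_mouse_ids, convert_fn and map_fn are hypothetical names that are not part of orthogene, and the helper simply returns both results rather than assuming anything about the column layout of map_genes' output.

```r
# Hypothetical convenience wrapper (not part of orthogene): runs the two
# steps from the reprex and returns both intermediate results.
# convert_fn and map_fn default to the orthogene functions; they are
# arguments only so that the helper can be tried out with stand-ins.
human_to_mouse_ids <- function(human_ids,
                               method,
                               convert_fn = orthogene::convert_orthologs,
                               map_fn = orthogene::map_genes) {
  # Step 1: human genes -> mouse ortholog symbols (same arguments as above).
  mapped <- convert_fn(gene_df = human_ids,
                       standardise_genes = TRUE,
                       gene_output = "columns",
                       input_species = "human",
                       output_species = "mouse",
                       non121_strategy = "kbs",
                       method = method)

  # Step 2: mouse ortholog symbols -> mouse identifiers.
  mouse <- map_fn(genes = mapped$ortholog_gene, species = "mouse")

  list(orthologs = mapped, mouse_ids = mouse)
}
```

Because R evaluates default arguments lazily, the orthogene defaults are only touched when no stand-in functions are supplied.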

bschilder, May 24 '22 11:05