EnrichmentMapApp
Add notebook demonstrating how to use fgsea with EM
Add a notebook to the Cytoscape workflow GitHub repository that modifies fgsea output files to be GSEA- or generic-compliant. fgsea: https://bioconductor.org/packages/release/bioc/html/fgsea.html
I used fgsea to perform my GSEA analysis. I want to use EnrichmentMap to visualize my results but I am seeing this error:
Could you please help? I'm uncertain what the exact parameters are for the UP and DOWN files to ensure they are EnrichmentMap compliant.
Thank you
Can you send me a sample of the output files for fgsea (just the top two or three lines of the file will be sufficient).
I sent the files to your email. Thank you
Initial method: the files that come out of fgsea are not in the same format as regular GSEA files. I would recommend converting them to generic enrichment files rather than to GSEA files. (The GSEA format carries a lot of extra information that can be annoying to add, although it would be nice to have the NES values in your analysis.) A detailed description of the file formats can be found here: https://enrichmentmap.readthedocs.io/en/latest/FileFormats.html#enrichment-results-files
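The conversion to a generic enrichment file could be sketched in R roughly as follows. This is a minimal sketch, not a tested workflow: `fgsea_res` is a hypothetical toy stand-in for real fgsea output, the output file name is made up, and the column layout (gene-set ID, description, p-value, FDR, phenotype) follows my reading of the file format page linked above, so check it against that page before relying on it.

```r
# Toy stand-in for fgsea output (hypothetical values); real fgsea results
# use the same column names.
fgsea_res <- data.frame(
  pathway = c("SET_A", "SET_B"),
  pval    = c(0.001, 0.02),
  padj    = c(0.005, 0.04),
  NES     = c(1.9, -1.6),
  stringsAsFactors = FALSE
)

# Assumed generic layout: ID, description, p-value, FDR, phenotype.
# The ID must match the gene-set names in the GMT file.
generic <- data.frame(
  GO.ID       = fgsea_res$pathway,
  Description = fgsea_res$pathway,
  p.Val       = fgsea_res$pval,
  FDR         = fgsea_res$padj,
  # +1 for positive NES (up), -1 for negative NES (down)
  Phenotype   = ifelse(fgsea_res$NES > 0, 1, -1),
  stringsAsFactors = FALSE
)

# Tab-delimited, no quotes, and no row numbers in the first column.
out_file <- file.path(tempdir(), "fgsea_generic.txt")
write.table(generic, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Because the sign of NES is folded into the Phenotype column, a single file can hold both up- and down-regulated sets; the NES values themselves are not kept in this format.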
I'm using fgsea results as well. I separated the up- and down-regulated pathway lists based on NES value and used a GMT file downloaded from MSigDB. I got the error below:
I'm using Cytoscape version 3.10 and EnrichmentMap v3.3.6.
My fgsea file columns look like this:
 | pathway | pval | padj | log2err | ES | NES | size | leadingEdge |
---|---|---|---|---|---|---|---|---|
1 | KEGG_MEDICUS_REFERENCE_TRANSLATION_INITIATION | 0.0001036 | 0.00043012 | 0.5384341 | 0.45868976 | 1.92531548 | 63 | RPL18A,RPL18,RPL4,RPL21,RPS10,RPS8,RPS7,RPL9,RPL24,RPS23,RPL7A,RPL35A,RPL13,RPL19,RPS17,RPS11,RPS4Y1,RPS12,RPL34,RPL28,RPL35,RPL31,RPS14,RPL36,RPL10A,RPL8,RPS3,RPS20,RPL23,RPL23A,RPL32,RPL11,RPL12,RPS28,RPS24,RPS9,RPS27,RPS25,RPL6,RPS19,RPS6,RPL37A,RPL15,RPL30 |
Could you please help me with this? I'm not sure where the error is rooted..
For your fgsea results to mimic GSEA results, they need to have the following columns. Your current output format is not a format that EM recognizes.
NAME description GS DETAILS SIZE ES NES NOM p-val FDR q-val FWER p-val RANK AT MAX LEADING EDGE
Your above columns would need to be mapped as follows -
NAME --> pathway
description --> pathway
GS DETAILS --> you can put anything here
SIZE --> size
ES --> ES
NES --> NES
NOM p-val --> pval
FDR q-val --> padj
FWER p-val --> not used, just set to 0
RANK AT MAX --> this is used by EM but you need to calculate from your ranks file* and the leading edge in the results file. If you don't want to you can just set this to a random number but just know that the leading edge feature in EM won't be showing you the right answers.
LEADING EDGE --> leading edge
- To calculate the rank at max in R you could do something like this, where current_fgsea_results is the data frame with your fgsea results and current_ranks is your gene-to-rank mapping (the code uses names(current_ranks), so it expects a named vector whose names are the gene identifiers, in ranked order).
# Calculate the rank at max. fgsea returns the leading edge, so we
# just need to extract the highest rank from the set to get the rank at max.
# Column 8 of the fgsea results is leadingEdge.
calculated_rank_at_max <- apply(current_fgsea_results, 1, FUN = function(x) {
  max(which(names(current_ranks) %in% unlist(x[8])))
})
One last thing: it looks like the file that you outputted has the rows numbered. When you export the results from R, make sure that you set row.names = FALSE.
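Putting the column mapping, the rank-at-max calculation, and the export together, a minimal sketch might look like the following. The toy `fgsea_res` table, the `current_ranks` vector, and the output file names are hypothetical; the column names are the ones listed above, and the "Details" placeholder and the zero FWER value are the fillers suggested in the mapping.

```r
# Toy stand-in for fgsea output (hypothetical values).
fgsea_res <- data.frame(
  pathway = c("SET_A", "SET_B"),
  pval = c(0.001, 0.02), padj = c(0.005, 0.04), log2err = c(0.5, 0.3),
  ES = c(0.46, -0.41), NES = c(1.9, -1.6), size = c(3L, 2L),
  stringsAsFactors = FALSE
)
fgsea_res$leadingEdge <- list(c("g1", "g2"), c("g5"))

# Hypothetical ranks: a named numeric vector, sorted in decreasing order.
current_ranks <- sort(c(g1 = 2.5, g2 = 1.8, g3 = 0.4, g4 = -0.7, g5 = -2.1),
                      decreasing = TRUE)

# Highest position in the ranked list held by any leading-edge gene.
rank_at_max <- vapply(
  fgsea_res$leadingEdge,
  function(genes) max(which(names(current_ranks) %in% genes)),
  numeric(1)
)

# check.names = FALSE keeps the spaces in the GSEA-style column names.
gsea_like <- data.frame(
  NAME = fgsea_res$pathway,
  description = fgsea_res$pathway,
  `GS DETAILS` = "Details",
  SIZE = fgsea_res$size,
  ES = fgsea_res$ES,
  NES = fgsea_res$NES,
  `NOM p-val` = fgsea_res$pval,
  `FDR q-val` = fgsea_res$padj,
  `FWER p-val` = 0,
  `RANK AT MAX` = rank_at_max,
  `LEADING EDGE` = vapply(fgsea_res$leadingEdge, paste, character(1),
                          collapse = ","),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Split into UP and DOWN files by the sign of NES; no row names on export.
up_file <- file.path(tempdir(), "fgsea_up.txt")
down_file <- file.path(tempdir(), "fgsea_down.txt")
write.table(gsea_like[gsea_like$NES > 0, ], up_file,
            sep = "\t", quote = FALSE, row.names = FALSE)
write.table(gsea_like[gsea_like$NES < 0, ], down_file,
            sep = "\t", quote = FALSE, row.names = FALSE)
```

Using `vapply` over the leadingEdge list column avoids depending on the column position (the `x[8]` in the snippet above), which would shift if the table gained or lost a column.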