tidybulk test_gene_overrepresentation: visualising results

Open mblue9 opened this issue 4 years ago • 7 comments

Do you have any recommendations for visualising the output of test_gene_overrepresentation?

Can it, or should it, use a clusterProfiler viz method (https://yulab-smu.github.io/clusterProfiler-book/) or a ggplot2 one?

May 31 '20 10:05 mblue9

Good question.

In the past I have done something like this, for upregulated and downregulated genes, but it is not so great.

But the website has better examples. I'm wondering whether I'm discarding key information that is needed to build such plots.

May 31 '20 10:05 stemangiola

Just to follow up here: I don't think this issue is a high priority (for the workshop anyway), as we may not have time for any pathway analyses. But if needed, we could maybe have the code below as a suggestion for how to visualise results. It doesn't use test_gene_overrepresentation; it uses clusterProfiler itself on the tidybulk test_differential_abundance output, in tidyverse style, and can produce all the clusterProfiler plots.

library(dplyr)     # %>%, filter(), pull()
library(tidybulk)  # pivot_transcript()
library(clusterProfiler)
library(org.Hs.eg.db)

# extract all genes tested for DE
res <- counts_de_pretty %>% 
    pivot_transcript() %>% 
    filter(!lowly_abundant)

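# GO enrichment for upregulated genes (FDR < 0.1, logFC > 0)
# Note: ont = "BP" tests Biological Process terms, despite the egoCC name
egoCC <- res %>%
    filter(FDR < 0.1 & logFC > 0) %>%
    pull("transcript") %>%
    enrichGO(
      OrgDb = org.Hs.eg.db,
      keyType = "SYMBOL",
      ont = "BP",
      universe = res %>% pull("transcript"))

dotplot(egoCC)
goplot(egoCC)
# Recent enrichplot versions may need egoCC <- enrichplot::pairwise_termsim(egoCC) first
emapplot(egoCC)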


# MSigDB Hallmark
gmtH <- read.gmt("https://data.broadinstitute.org/gsea-msigdb/msigdb/release/6.2/h.all.v6.2.symbols.gmt")
enrH <- enricher(
    gene = res %>% filter(FDR < 0.1 & logFC > 0) %>% pull("transcript"),
    TERM2GENE = gmtH,
    universe = res %>% pull("transcript"))

dotplot( enrH )
emapplot(enrH)

Jun 08 '20 08:06 mblue9

I'm lost with this issue. Is it still relevant?

Oct 17 '20 07:10 stemangiola

Well, I think we should have a tidybulk pathway/gene set analysis section at some point for a workshop.

For the moment I just put some info in the supplementary here: https://stemangiola.github.io/biocasia2020_tidytranscriptomics/articles/supplementary.html#how-to-perform-gene-enrichment-analysis-1

But it doesn't use the tidybulk pathway analysis; it just uses tidybulk DE results and then clusterProfiler viz:

dotplot(egoCC)
goplot(egoCC)
emapplot(egoCC)

Not sure whether it's better to use clusterProfiler for the viz or to try to visualise the tidybulk pathway results?

Oct 17 '20 07:10 mblue9

tidybulk can be used for the calculation, and attr(..., "") can be used to extract the raw results and plot them. Not sure if it's too messy. OK, let's try to keep thinking about this.
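
As a rough illustration of the ggplot2 route, here is a minimal sketch of a dot plot built from a tidy enrichment table. It does not call tidybulk or clusterProfiler; the toy data frame and its column names (Description, GeneRatio, Count, p.adjust) are assumptions modelled on clusterProfiler-style results, and the pathway names are hypothetical. Real output would need to be checked against these names before plotting.

```r
library(ggplot2)

# Toy enrichment table (hypothetical values). Column names are assumed to
# follow the clusterProfiler result layout; adjust them to the real output.
enrichment <- data.frame(
  Description = c("hypothetical pathway A",
                  "hypothetical pathway B",
                  "hypothetical pathway C"),
  GeneRatio = c("12/200", "8/200", "20/200"),
  Count = c(12, 8, 20),
  p.adjust = c(0.001, 0.03, 0.0002),
  stringsAsFactors = FALSE
)

# GeneRatio is stored as text such as "12/200"; convert it to a number
ratio_parts <- strsplit(enrichment$GeneRatio, "/")
enrichment$gene_ratio <- vapply(
  ratio_parts,
  function(x) as.numeric(x[1]) / as.numeric(x[2]),
  numeric(1)
)

# Order the terms by gene ratio so the plot reads bottom to top
enrichment$Description <- reorder(enrichment$Description, enrichment$gene_ratio)

# One point per term: position = gene ratio, size = gene count,
# colour = adjusted p-value
enrichment_dotplot <- ggplot(
  enrichment,
  aes(x = gene_ratio, y = Description, size = Count, colour = p.adjust)
) +
  geom_point() +
  scale_colour_gradient(low = "red", high = "blue") +
  labs(x = "Gene ratio", y = NULL,
       colour = "Adjusted p-value", size = "Gene count") +
  theme_bw()

enrichment_dotplot
```

Because the plot only needs a plain data frame, the same code could in principle be applied to any enrichment table with equivalent columns, whichever package produced it.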

Oct 17 '20 07:10 stemangiola

Is this still relevant? @mblue9, any interest in doing a blog post on pathway analyses with tidybulk? Then there would be a real application to motivate me to improve this.

Jun 26 '22 02:06 stemangiola

For me it's not a very high priority at the moment, but I'd be happy to write a blog post if you want to focus on improving this aspect. Or we can wait until we have more time to work on it.

Just noting that I have a tiny bit on tidybulk pathway analysis here, which we could build on using that dataset, airway, or another: https://mblue9.github.io/RNAseq-R-tidyverse/articles/tidytranscriptomics.html#gene-set-testing-1

Jun 27 '22 17:06 mblue9
