Pathway analysis using Sleuth results
Hello,
Did anyone try pathway analysis with GAGE and Pathview using Sleuth results (results_table.csv where column names are: target_id, test_stat, pval, qval, rss, sigma_sq, tech_var, mean_obs, var_obs, sigma_sq_max, smooth_sigma_sq, final_sigma_sq, degree_free, ens_gene, ext_gene)? I am following this tutorial https://www.r-bloggers.com/tutorial-rna-seq-differential-expression-pathway-analysis-with-sailfish-deseq2-gage-and-pathview/ but I have three samples with replicates (and more in future). I am able to get the pathways but not clear to which sample they belong to and also unable to get the figures showing up and down-regulated genes in the pathway. I think I need to troubleshoot the input for many samples with replicates. My code:
# Retrieving rows with qval < 0.05 from results_table.
>lrt_results.e <- dplyr::filter(results_table, qval < 0.05)
# Adding entrez column to Sleuth results for pathway analysis (where Rrr= species (using hypothetical name here)).
>lrt_results.e$entrez <- mapIds(org.Rrr.eg.db, keys=lrt_results.e$target_id, column="ENTREZID", keytype="ENSEMBLTRANS", multiVals="first")
# Retrieving rows where value of entrez is not NA
>lrt_results.e.noNA <- subset(lrt_results.e, !is.na(entrez))
>qval.e <- lrt_results.e.noNA$qval
>names(qval.e) <- lrt_results.e.noNA$entrez
# KEGG pathway gene sets for Rrr
>kegg_Rrr <- kegg.gsets("Rrr")
# sigmet.idx is an index of numbers of signaling and metabolic pathways in kegg.gsets.
>kegg_Rrr_sigmet <- kegg_Rrr$kg.sets[kegg_Rrr$sigmet.idx]
# Using gage package' >keggres.e <- gage(qval.e, gsets=kegg_Rrr_sigmet, same.dir=TRUE) >lapply(keggres.e, head)`
Output: Output_github.txt
>keggrespathways <- data.frame(id=rownames(keggres.e$greater), keggres.e$greater) %>% tbl_df() %>% filter(row_number()<=3) %>% .$id %>% as.character()
>keggrespathways
Output: [1] "Rrr01100 Metabolic pathways" "Rrr04020 Calcium signaling pathway"
[3] "Rrr04810 Regulation of actin cytoskeleton"
>keggresids <- substr(keggrespathways, start=1, stop=8)
>keggresids
Output: [1] "Rrr01100" "Rrr04020" "Rrr04810"
>plot_pathway <- function(pid) pathview(gene.data=qval.e, pathway.id=pid, species="Rrr", new.signature=FALSE)
>pmp <- sapply(keggresids, function(pid) pathview(gene.data=qval.e, pathway.id=pid, species="Rrr"))
Info: Downloading xml files for Rrr01100, 1/1 pathways..
Info: Downloading png files for Rrr01100, 1/1 pathways..
Info: Working in directory D:/Programs/kallisto/Rrr/quant_Sleuth
Info: Writing image file Rrr01100.pathview.png
Info: Downloading xml files for Rrr04020, 1/1 pathways..
Info: Downloading png files for Rrr04020, 1/1 pathways..
Hide Traceback
Rerun with Debug
Error in UseMethod("select_") :
no applicable method for 'select_' applied to an object of class "c('OrgDb', 'AnnotationDb', 'envRefClass', '.environment', 'refClass', 'environment', 'refObject', 'AssayData')"
11.
select_(.data, .dots = lazyeval::lazy_dots(...))
10.
select(db.obj, keys = in.ids, keytype = in.type, columns = c(in.type,
out.type))
9.
withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarning"))
8.
suppressWarnings(select(db.obj, keys = in.ids, keytype = in.type,
columns = c(in.type, out.type)))
7.
geneannot.map(in.ids = eg, in.type = "entrez", out.type = category,
org = org, pkg.name = pkg.name, ...)
6.
eg2id(as.character(plot.data.gene$kegg.names), category = "SYMBOL",
pkg.name = gene.annotpkg)
5.
pathview(gene.data = qval.e, pathway.id = pid, species = "Rrr")
4.
FUN(X[[i]], ...)
3.
lapply(X = X, FUN = FUN, ...)
2.
sapply(keggresids, function(pid) pathview(gene.data = qval.e,
pathway.id = pid, species = "Rrr"))
1.
sapply(keggresids, function(pid) pathview(gene.data = qval.e,
pathway.id = pid, species = "Rrr"))
Changed gene.data=qval.e to gene.data=lrt_results.e.noNA
>pmp. <- sapply(keggresids, function(pid) pathview(gene.data=lrt_results.e.noNA, pathway.id=pid, species="Rrr"))
Info: Working in directory D:/Programs/kallisto/Dr/quant_Sleuth
Info: Writing image file Rrr01100.pathview.png
Note: None of the genes or compounds mapped to the pathway!
Argument gene.idtype or cpd.idtype may be wrong.
'select()' returned 1:1 mapping between keys and columns
Info: Working in directory D:/Programs/kallisto/Dr/quant_Sleuth
Info: Writing image file Rrr04020.pathview.png
Note: None of the genes or compounds mapped to the pathway!
Argument gene.idtype or cpd.idtype may be wrong.
'select()' returned 1:1 mapping between keys and columns
Info: Working in directory D:/Programs/kallisto/Dr/quant_Sleuth
Info: Writing image file Rrr04810.pathview.png
Warning messages:
1: In colnames(plot.data)[c(1, 3, 9:ncs)] = c("kegg.names", "all.mapped", :
number of items to replace is not a multiple of replacement length
2: In colnames(plot.data)[c(1, 3, 9:ncs)] = c("kegg.names", "all.mapped", :
number of items to replace is not a multiple of replacement length
Figures:
![]()
![]()
![]()
![]()
![]()
Kindly guide regarding the input.
Thanks!
Hello, It would be very helpful if anyone has done pathway analysis with GAGE and Pathview using Sleuth results.
Thanks!
ENST-to-Reactome mappings (transcript level) are rather sparse (usually only the canonical isoform is annotated) so it may actually be hurting you to operate at the tx level. We found this out the direct way by running both. If your gene sets are very sparse, most all methods are going to suffer due to lack of structure.
--t
On Mon, Jan 23, 2017 at 3:44 AM, GK [email protected] wrote:
Hello, It would be very helpful if anyone has done pathway analysis with GAGE and Pathview using Sleuth results.
Thanks!
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/pachterlab/sleuth/issues/91#issuecomment-274467601, or mute the thread https://github.com/notifications/unsubscribe-auth/AAARIp5MsRVYfDYrV5KJFBO2GOdVgSNfks5rVJKQgaJpZM4LYI1s .
Hi, Any updates on it or do anyone have a workflow how to do Pathway analysis after sleuth or sleuth_live?
I did the below steps to get transcript mapping to gene ids which were based on this tutorial
diamond makedb --in nr.faa -d nr -p 8
diamond blastx -d nr -q Trinity.fasta -o nr.blastx.out -p 8 -k 1 --outfmt 6
echo -e "target_id\tens_gene" > t2g.txt
cut -f 1-2 nr.blastx.out >> t2g.txt
The below R code was based on:
#!/usr/bin/env Rscript
library("sleuth")
sample_id <- dir("kallisto")
sample_id
kal_dirs <- sapply(sample_id, function(id) file.path("kallisto", id))
kal_dirs
s2c <- read.table("!{exp_file}", header = TRUE, stringsAsFactors=FALSE)
s2c <- dplyr::select(s2c, sample = ids, Tissue_type, condition)
s2c <- s2c[order(s2c$sample), ]
s2c <- dplyr::mutate(s2c, path = kal_dirs)
print(s2c)
t2g <- read.table("!{t2g_file}", header = TRUE, stringsAsFactors=FALSE)
t2g
# Load the kallisto processed data into the object
so <- sleuth_prep(s2c, ~ condition, target_mapping = t2g, extra_bootstrap_summary = TRUE)
# Estimate parameters for the sleuth response error measurement (full) model
so <- sleuth_fit(so)
#Next, we fit the reduced model. In this case, the reduced model is the intercept-only model
so <- sleuth_fit(so, ~1, 'reduced')
print("Done sletu_fit")
#and finally perform the test:
so <- sleuth_lrt(so, 'reduced', 'full')
print("DONE sleut_lrt")
models(so)
save(so, file=paste("sleuth_object.so"))
Can I use sleuth_live or do I have to add in the above code the following line?
so <- sleuth_wt(so, 'conditionscramble')
gene_table <- sleuth_gene_table(so, test = "conditionscramble", test_type = "wt")
write.table(gene_table, paste("gene_table_results.txt"), sep="\t")
How do I know which test should I use?
Thank you in advance,
Michal
