sleuth plot_ma function throws error for wald test results

Hello,

I get an error when using plot_ma() with a wald-test object. The error message is (I've translated it from german if it is not completely correct): Error in FUN(X[[i]], ...) : Object 'mean_obs' not found

After assuring that I really gave a wald-test object as a parameter to the function I dug into the source code of the function and found that on line 664 of the plots.R sleuth_results() is called without pval_aggregate=FALSE. I assume the default value is TRUE. The function then returns a result set without the columns 'mean_obs' and 'b'. This seems to be the source of the error message. After manually doing the steps, but with pval_aggregate=FALSE it seems to work just fine.

best, Nadine

Nov 12 '19 14:11 Rhinogradentia

The same behaviour occurs with plot_qq and plot_volcano.

best, Nadine

Nov 12 '19 14:11 Rhinogradentia

@AntonieV and I ran into the same issue on plot_volcano (with the exact error message also reported in issue #230 ). It is the following setup that triggered it for us:

We run sleuth_prep once, giving it an aggregation column (aggregation_column = genes). This implicitly sets pval_aggregate <- TRUE:

https://github.com/pachterlab/sleuth/blob/e5ac761e999a71af1d62b820d5ddebc4bd8dda56/R/sleuth.R#L322

But besides the p-values from sleuth_lrt() aggregated over genes, we as well wanted to have the raw p-values per transcript and the respective plot_volcano() with the results from sleuth_wt(). However, plot_volcano() doesn't check whether pval_aggregate == TRUE before calling sleuth_results() here:

https://github.com/pachterlab/sleuth/blob/e5ac761e999a71af1d62b820d5ddebc4bd8dda56/R/plots.R#L883

And sleuth_results() just works with the implicitly set pval_aggregate = TRUE from above and does aggregation over the genes, thus returning results without the b column in the case of plot_volcano().

Our fix was to assign so$pval_aggregate <- FALSE before calling sleuth_wt() and then plot_volcano(). But optimally, plot_volcano() would check this before calling sleuth_results() and throw an informative error message.

Also, as an aside: Should you at some point assign back to so$pval_aggregate <- TRUE, be aware that this currently triggers what we think is a bug in an existing sanity check (see PR #240).

Feb 07 '20 15:02 dlaehnemann

@dlaehnemann: I ran into the same issue as you (and reported here). Since your PR was ignored until now and this was not fixed in a different way either, I have come up with some workaround. I'd like to avoid calling sleuth_prep twice since I have many samples and boostraps so this takes long and uses a ton of memory. Instead, I call it once with (the implicit) pval_aggregate = TRUE, fit both a full (confounders + condition) and a reduced model (confounders only) and run both the LRT (so I can get gene level qvals in the end) and the WT (so I can generate a MA/volcano plot and get transcript level LFCs) on that same sleuth object. Then I use sleuth_results(so, test_type = "lrt", [...]) to get my gene level signifiance results and so %>% {.$pval_aggregate <- FALSE; .} %>% sleuth_results(test_type = "wt", [...]) for the transcript level effect sizes/directions. I also tried the same approach with sleuth_live and it appears to be working just fine (as long as I limit the data exploration to the Wald test, which is the default for sleuth_live_settings). Do you see any issue with this approach that made you opt for your slightly different workaround(s)?

@pimentel, @warrenmcg: Any chance for a proper/'official' fix for these issues? Is sleuth even still maintained at this point? Am I wasting my time if I try to submit at PR or two? :wink: And thank you for sharing this package and the time you have invested to maintain it so far.

Dec 06 '21 09:12 mschilli87

@mschilli87 We also only call sleuth_prep() once and our workaround is in this script:

https://github.com/snakemake-workflows/rna-seq-kallisto-sleuth/blob/0159228cf49e8a651fb69f07e83b8fa3d3f87ff9/workflow/scripts/sleuth-diffexp.R#L28-L38

The sleuth_prep() is run in a snakemake workflow step before that, the call is in this other script (and the sleuth object is saved and then loaded again):

https://github.com/snakemake-workflows/rna-seq-kallisto-sleuth/blob/0159228cf49e8a651fb69f07e83b8fa3d3f87ff9/workflow/scripts/sleuth-init.R#L54-L61

But this is a long time ago, now. So I'm not sure about the details of why exactly we did what. But I hope the code and the documenting comments already help a bit?

Dec 06 '21 17:12 dlaehnemann