qiime
add support for paired difference tests (i.e., pre/post tests)
I've been working on several data sets lately looking at community change before/after a treatment, and have been working on some code for testing whether some value of interest changes with treatment. Due to the personalized nature of the microbiome, it's not always possible to see these changes e.g. in PCoA plots when coloring by pre/post treatment (because individuals continue to cluster with themselves if the effect size of treatment is less than the "personal microbiome" effect size), but I suspect that there can be detectable changes that are consistent across individuals nevertheless. I'll be porting this code into QIIME, and it will allow us to compare things like changes in relative abundance of specific OTUs/taxa, changes in alpha diversity, shifts in the same direction in PCoA space, or changes in some value in a mapping file.
Has anyone else been working with these types of data who would be interested in comparing notes?
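A minimal sketch of the kind of paired (pre/post) test described above. This is not the code the post refers to: it swaps in an exact sign-flip permutation test on per-subject differences, and the function name `paired_sign_flip_test` and the alpha diversity values are hypothetical, chosen only for illustration.

```python
from itertools import product
from statistics import mean


def paired_sign_flip_test(pre, post):
    """Exact two-sided sign-flip permutation test on paired differences.

    pre and post map a subject ID to the value of interest (for example
    alpha diversity) before and after treatment. Only subjects present in
    both dicts are used. Returns (mean difference, p-value).
    """
    subjects = sorted(set(pre) & set(post))
    diffs = [post[s] - pre[s] for s in subjects]
    observed = mean(diffs)

    # Under the null hypothesis of no treatment effect, the sign of each
    # within-subject difference is arbitrary, so enumerate all 2^n sign
    # assignments (feasible for small numbers of subjects only).
    as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = mean(sign * d for sign, d in zip(signs, diffs))
        if abs(flipped) >= abs(observed) - 1e-12:
            as_extreme += 1
    return observed, as_extreme / total


# Hypothetical alpha diversity values for six subjects (illustration only).
pre = {"S1": 5.1, "S2": 7.8, "S3": 3.2, "S4": 6.5, "S5": 4.4, "S6": 8.0}
post = {"S1": 5.6, "S2": 8.1, "S3": 3.9, "S4": 6.9, "S5": 4.6, "S6": 8.7}

mean_diff, p_value = paired_sign_flip_test(pre, post)
print(f"mean change = {mean_diff:.3f}, p = {p_value:.5f}")
```

Because each subject is compared only with itself, a shift that is consistent across individuals can be detected even when between-subject variation is much larger than the treatment effect.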
Yes, this is something we have been doing a bunch of in other studies. It will be very useful to have in there generally, rather than trawling through specific sets of comparisons, which I think is what people are doing currently. Definitely interested in comparing notes. Do you think the easiest way is on the mailing list or with a call?
On Jul 3, 2013, at 8:04 AM, Greg Caporaso wrote:
I've been working several data sets lately looking at community change before/after a treatment, and have been working on some code (https://gist.github.com/gregcaporaso/79dc526708b1e739ea8d) for testing whether some value of interest changes with treatment. Due to the personalized nature of the microbiome, it's not always possible to see these changes e.g. in PCoA plots (because individuals continue to cluster with themselves if the effect size of treatment is less than the "personal microbiome" effect size). I'll be porting this code into QIIME, and it will allow us to compare things like changes in specific OTUs/taxa, changes in alpha diversity, or changes in some value in a mapping file.
Has anyone else been working with these types of data who would be interested in comparing notes?
Reply to this email directly or view it on GitHub: https://github.com/qiime/qiime/issues/1040
Great - let's start with email because scheduling a call is going to be especially difficult with the holiday weekend, and I'm really hoping to have some time to work on this over the weekend. @rob-knight, could you give me the names of folks who have been working with these data sets in your group and I could get the conversation started by email (or we could just continue here).
Let's continue here and I will encourage people to participate if I don't see them responding.
On Jul 3, 2013, at 8:24 AM, Greg Caporaso wrote:
Great - let's start with email because scheduling a call is going to be especially difficult with the holiday weekend, and I'm really hoping to have some time to work on this over the weekend. @rob-knight (https://github.com/rob-knight), could you give me the names of folks who have been working with these data sets in your group and I could get the conversation started by email (or we could just continue here).
Reply to this email directly or view it on GitHub: https://github.com/qiime/qiime/issues/1040#issuecomment-20418550
And on a related topic, is there a way of specifying a nested hierarchy of comparisons in the various distance comparison scripts? E.g., in Costello et al. we computed distances for samples from the same/different site, then within site for same/different subject, then within site and subject for within/between month. This would be very useful for quantifying the effect of before/after treatment in relation to same/different subject, etc.
Not right now, as far as I'm aware. @jrrideout's distance comparison code would be the place to add that. @jrrideout, can you comment when you get a chance? (That might not be right away, since he's en route to Catalina Island to teach QIIME at GeoBiology 2013.)
Related interesting paper.
Sorry, I forgot to reply here. There isn't a way to do this with the current distance plotting code. I agree this could be useful and may be a good addition to make_distance_boxplots.py. The current (messy) way to do this for the example @rob-knight gave would be to run make_distance_boxplots.py normally using body site as the category. Then, use filter_distance_matrix.py to include only a single body site, and run make_distance_boxplots.py again using subject as the category. Finally, filter the previous distance matrix to include only a single subject, then run make_distance_boxplots.py using the month category.
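For illustration, the nested grouping requested above could be sketched in plain Python as follows. This is not part of make_distance_boxplots.py or any QIIME script: the helpers `comparison_label` and `nested_distance_groups`, the sample IDs, the metadata, and the distances are all hypothetical. Each pair of samples is assigned to the first level of the hierarchy at which the two samples differ.

```python
from collections import defaultdict
from itertools import combinations


def comparison_label(meta_a, meta_b, fields):
    """Label a sample pair by the first field (outermost first) that differs."""
    shared = []
    for field in fields:
        if meta_a[field] != meta_b[field]:
            prefix = ("same " + "+".join(shared) + ", ") if shared else ""
            return prefix + "different " + field
        shared.append(field)
    return "same " + "+".join(shared)


def nested_distance_groups(distances, metadata, fields):
    """Group pairwise distances by nested comparison label.

    distances: dict mapping frozenset({id_a, id_b}) -> distance
    metadata:  dict mapping sample ID -> {field: value}
    fields:    field names ordered from outermost to innermost
    """
    groups = defaultdict(list)
    for a, b in combinations(sorted(metadata), 2):
        label = comparison_label(metadata[a], metadata[b], fields)
        groups[label].append(distances[frozenset((a, b))])
    return dict(groups)


# Hypothetical metadata and distances (illustration only).
metadata = {
    "A": {"site": "gut", "subject": "1", "month": "1"},
    "B": {"site": "gut", "subject": "1", "month": "2"},
    "C": {"site": "gut", "subject": "2", "month": "1"},
    "D": {"site": "skin", "subject": "1", "month": "1"},
}
distances = {
    frozenset(("A", "B")): 0.2,
    frozenset(("A", "C")): 0.5,
    frozenset(("A", "D")): 0.9,
    frozenset(("B", "C")): 0.6,
    frozenset(("B", "D")): 0.8,
    frozenset(("C", "D")): 0.95,
}

groups = nested_distance_groups(distances, metadata, ["site", "subject", "month"])
for label, values in sorted(groups.items()):
    print(label, values)
```

Each resulting list could then be plotted or compared as one box, which avoids the repeated filter-and-rerun steps at the cost of handling the grouping outside the existing scripts.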
Hello!
Interesting discussion. I also have a dataset to analyze, consisting of sequential samples from patients during the course of an infection (6 patients, 4 sequential samples for each of them). The dataset was generated using 454 sequencing of bacterial 16S rRNA and processed with the QIIME pipeline. I have already performed some ecological analyses, but now I would like to specifically determine which OTUs vary during the course of the disease.
My initial idea was to use the group_significance.py script, but I saw it no longer supports paired samples.
Following Greg's post, I looked into the baySeq package, based on the paper by Hardcastle and Kelly on Bayesian analysis of paired high-throughput data. However, I am wondering whether this analysis can be used on sequential samples or only on paired samples. Or does anyone have a better suggestion for analyzing sequential data?
Thanks! Sébastien