scikit-bio-cookbook
add recipe to test for association between some phenotype and phylogeny
For example, we could look for associations between the geographic distribution of viruses and phylogeny. This could be done by building two matrices for the same set of viral strains: the Euclidean distances between the locations where each strain is found, and the tip-to-tip distances between the strains in a phylogenetic tree. The two matrices would then be compared with a Mantel test.
Alternatively, in the macro-world, maybe something like beak length in some group of birds (but alternative suggestions would be great!). We could do this by computing the pairwise differences in beak length between the birds as a matrix, and comparing that to the tip-to-tip distances from the tree using a Mantel test.
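To make the beak-length idea concrete, here is a minimal sketch of a Mantel test using only the Python standard library, so that every step is visible in the notebook. This is an illustration and not scikit-bio's implementation (a release version of scikit-bio provides a `mantel` function in `skbio.stats.distance`, which a real recipe would presumably use). The function names, the beak lengths, and the tip-to-tip distances below are all made up for the example; in a real recipe the second matrix would come from the tree.

```python
import math
import random


def condensed(matrix):
    """Return the entries above the diagonal of a square matrix as a flat list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel_sketch(x, y, permutations=999, seed=0):
    """Two-sided Mantel test: correlate two distance matrices, then estimate a
    p-value by permuting the rows and columns of x together."""
    n = len(x)
    rng = random.Random(seed)
    y_flat = condensed(y)
    observed = pearson(condensed(x), y_flat)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[x[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if abs(pearson(condensed(permuted), y_flat)) >= abs(observed):
            hits += 1
    # Count the observed statistic itself as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical beak lengths (mm) for five birds, in the same order as the tips.
beak_lengths = [10.1, 10.4, 12.9, 13.3, 17.0]
beak_dists = [[abs(a - b) for b in beak_lengths] for a in beak_lengths]

# Hypothetical tip-to-tip distances for the same five birds.
tip_dists = [
    [0.0, 0.2, 0.9, 0.9, 1.6],
    [0.2, 0.0, 0.9, 0.9, 1.6],
    [0.9, 0.9, 0.0, 0.3, 1.6],
    [0.9, 0.9, 0.3, 0.0, 1.6],
    [1.6, 1.6, 1.6, 1.6, 0.0],
]

r, p = mantel_sketch(beak_dists, tip_dists)
print("Mantel r = %.3f, p = %.3f" % (r, p))
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations, which is why an ordinary correlation test on the flattened matrices would not be appropriate here.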
For geographic distance, should use Vincenty distance. This looks awesome, I'd like to toy with this one. Precursor though: fetching records from EBI?
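For reference, a rough sketch of the Vincenty inverse formula on the WGS-84 ellipsoid, fully implemented in plain Python so it could live in the notebook. The function names and the sampling coordinates are made up for illustration, and the handling of edge cases (near-antipodal points in particular) is deliberately simple; treat it as a starting point and not as a vetted geodesy routine.

```python
import math

# WGS-84 ellipsoid parameters.
WGS84_A = 6378137.0              # semi-major axis in metres
WGS84_F = 1 / 298.257223563      # flattening
WGS84_B = (1 - WGS84_F) * WGS84_A  # semi-minor axis in metres


def vincenty_distance_m(lat1, lon1, lat2, lon2, max_iter=200, tol=1e-12):
    """Ellipsoidal distance in metres between two points given in degrees."""
    if lat1 == lat2 and lon1 == lon2:
        return 0.0
    a, b, f = WGS84_A, WGS84_B, WGS84_F
    # Reduced latitudes.
    u1 = math.atan((1 - f) * math.tan(math.radians(lat1)))
    u2 = math.atan((1 - f) * math.tan(math.radians(lat2)))
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)
    big_l = math.radians(lon2 - lon1)
    lam = big_l
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(cos_u2 * sin_lam,
                               cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam)
        if sin_sigma == 0.0:
            raise ValueError("degenerate geometry; not handled in this sketch")
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos2_alpha = 1 - sin_alpha ** 2
        # cos2_alpha is zero for lines along the equator.
        cos_2sm = (cos_sigma - 2 * sin_u1 * sin_u2 / cos2_alpha
                   if cos2_alpha != 0.0 else 0.0)
        c = f / 16 * cos2_alpha * (4 + f * (4 - 3 * cos2_alpha))
        lam_new = big_l + (1 - c) * f * sin_alpha * (
            sigma + c * sin_sigma * (
                cos_2sm + c * cos_sigma * (-1 + 2 * cos_2sm ** 2)))
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    else:
        raise ValueError("Vincenty iteration did not converge")
    usq = cos2_alpha * (a ** 2 - b ** 2) / b ** 2
    big_a = 1 + usq / 16384 * (4096 + usq * (-768 + usq * (320 - 175 * usq)))
    big_b = usq / 1024 * (256 + usq * (-128 + usq * (74 - 47 * usq)))
    delta_sigma = big_b * sin_sigma * (
        cos_2sm + big_b / 4 * (
            cos_sigma * (-1 + 2 * cos_2sm ** 2)
            - big_b / 6 * cos_2sm * (-3 + 4 * sin_sigma ** 2)
            * (-3 + 4 * cos_2sm ** 2)))
    return b * big_a * (sigma - delta_sigma)


def geographic_distance_matrix(coords):
    """Symmetric matrix of pairwise distances for a list of (lat, lon) pairs."""
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = vincenty_distance_m(*coords[i], *coords[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical sampling locations (latitude, longitude) for three strains.
sites = [(40.0, -105.3), (32.7, -117.2), (51.5, -0.1)]
for row in geographic_distance_matrix(sites):
    print(["%.1f km" % (d / 1000) for d in row])
```

The resulting matrix has the same shape as a tip-to-tip distance matrix, so once the rows are ordered to match the tips of the tree the two can be compared directly with a Mantel test.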
Or, ideally, those and biom tables from the EMP database…?
On Sep 24, 2014, at 9:31 AM, Daniel McDonald <[email protected]> wrote:
For geographic distance, should use Vincenty distance (http://en.wikipedia.org/wiki/Vincenty's_formulae). This looks awesome, I'd like to toy with this one. Precursor though: fetching records from EBI?
All good ideas. Note that all code used in the cookbooks needs to be in a release version of skbio (or some other package that we know and love, such as biom, pandas, statsmodels, ...), or fully implemented in the notebook. We don't currently have EBI support in skbio, so that may be a limiting factor, though if records could be retrieved using the EBI REST interface and requests, that could be really cool.
On Wed, Sep 24, 2014 at 8:36 AM, Rob Knight [email protected] wrote:
Or, ideally, those and biom tables from the EMP database...?
Right, with the EMP dataset we have control over both ends of the pipe so can make it easy (though of course it might be hard to make it easy...)
On Sep 24, 2014, at 9:41 AM, "Greg Caporaso" <[email protected]> wrote:
All good ideas. Note that all code used in the cookbooks needs to be in a release version of skbio (or some other package that we know and love, such as biom, pandas, statsmodels, ...), or fully implemented in the notebook. We don't currently have EBI support in skbio, so that may be a limiting factor, though if records could be retrieved using the EBI REST interface and requests, that could be really cool.