
recipe for visualizing within vs between distances

Open jairideout opened this issue 10 years ago • 2 comments

@gregcaporaso and I were chatting (offline and in https://github.com/biocore/scikit-bio/issues/764) about adding a recipe showing how to visualize "within" vs "between" distances using scikit-bio (DistanceMatrix), pandas, and seaborn's boxplots. This recipe would basically show how to reproduce QIIME's make_distance_boxplots.py script.

@gregcaporaso suggested using the existing 88 Soils dataset that's already included with the cookbook to discretize pH and plot within/between distance boxplots.

This recipe would also be handy because it'll show how to use seaborn's boxplots with scikit-bio data so that we can deprecate skbio.draw.boxplots (https://github.com/biocore/scikit-bio/issues/764). Finally, it may inspire future additions to the DistanceMatrix API for extracting within/between distances.
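A rough sketch of the computation such a recipe might walk through, assuming nothing about the eventual recipe or about QIIME's implementation: discretize pH into bins, then label every unique pair of samples as a "within" or "between" comparison. The helper names (`discretize_ph`, `within_between`), the bin edges, and the toy data are all hypothetical and are not taken from 88 Soils. Plain Python is used so the logic is easy to follow.

```python
from itertools import combinations


def discretize_ph(ph, edges=(5.0, 6.0, 7.0)):
    """Return a coarse pH bin label (bin edges are illustrative assumptions)."""
    for edge in edges:
        if ph < edge:
            return "<%g" % edge
    return ">=%g" % edges[-1]


def within_between(ids, matrix, groups):
    """Label each unique pair of samples as a 'within' or 'between' comparison.

    ids    -- sample IDs, in the same order as the rows/columns of matrix
    matrix -- symmetric square distance matrix (nested lists or a 2-D array)
    groups -- dict mapping sample ID to its group label
    """
    records = []
    # combinations() visits each unordered pair once and skips the diagonal.
    for i, j in combinations(range(len(ids)), 2):
        same = groups[ids[i]] == groups[ids[j]]
        records.append({
            "sample_a": ids[i],
            "sample_b": ids[j],
            "comparison": "within" if same else "between",
            "distance": matrix[i][j],
        })
    return records


# Toy data, made up for illustration only.
ids = ["s1", "s2", "s3", "s4"]
ph = {"s1": 4.5, "s2": 4.8, "s3": 7.2, "s4": 7.9}
matrix = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.6],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.6, 0.2, 0.0],
]
groups = {sid: discretize_ph(value) for sid, value in ph.items()}
records = within_between(ids, matrix, groups)
```

With scikit-bio, the `ids` and `data` attributes of a `DistanceMatrix` could presumably be passed in place of the toy `ids` and `matrix`. The records can then be loaded with `pandas.DataFrame(records)` and plotted with something like `seaborn.boxplot(x="comparison", y="distance", data=df)`.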

jairideout, Nov 19 '14 19:11

@jairideout, I was working on just this the other day. It uses pandas, seaborn, and DistanceMatrix from skbio. The code is very messy, as I was only using it for visualization and data munging. I apologize that I don't have time at the moment to write up a recipe, but anyone should feel free to use the code I wrote. There is at least an example of what the within and between boxplots look like a little way down the notebook page.

Here is the notebook

johnchase, Nov 19 '14 20:11

Awesome, thanks @johnchase!

jairideout, Nov 20 '14 14:11