scikit-bio-cookbook
add recipe on using normalized mutual information for RNA secondary structure prediction
This could take as input an alignment of functional RNA sequences and output a matrix of normalized mutual information scores for all pairs of positions in the alignment. That matrix could be plotted as a heatmap, where "hot" diagonals would indicate regions of the sequence that may be base pairing. We can then use this recipe as a place to point readers to more complex methods for doing this.
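To make the idea concrete, here is a minimal pure-Python sketch of the scoring step, not a proposed scikit-bio API. The function names `column_entropy` and `normalized_mi_matrix` are hypothetical, the alignment is assumed to be a plain list of equal-length strings, gaps are treated as ordinary symbols, and MI is normalized by the joint entropy, which is only one of several normalizations in use.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    n = len(column)
    return -sum((c / n) * log2(c / n) for c in Counter(column).values())


def normalized_mi_matrix(alignment):
    """Return an L x L list of lists of normalized mutual information scores.

    `alignment` is a list of equal-length aligned sequences (strings).
    The score for positions i and j is MI(i, j) / H(i, j); it is set to
    0.0 when the joint entropy is zero (both columns are invariant).
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same length")
    columns = list(zip(*alignment))
    entropies = [column_entropy(col) for col in columns]
    n_pos = len(columns)
    nmi = [[0.0] * n_pos for _ in range(n_pos)]
    for i in range(n_pos):
        for j in range(i, n_pos):
            # Joint entropy is the entropy of the paired symbols.
            joint = column_entropy(list(zip(columns[i], columns[j])))
            mi = entropies[i] + entropies[j] - joint
            score = mi / joint if joint > 0 else 0.0
            nmi[i][j] = nmi[j][i] = score  # the matrix is symmetric
    return nmi


# Toy alignment: positions 0 and 3 covary like a base pair,
# positions 1 and 2 are invariant.
toy = ["GAAC", "CAAG", "AAAU", "UAAA"]
scores = normalized_mi_matrix(toy)
```

In the toy alignment, the covarying pair (0, 3) gets the maximum score and any pair involving an invariant column scores zero, which is the signal the heatmap would highlight.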
:+1:
Would it be possible to store the matrix as either a DissimilarityMatrix or DistanceMatrix? Once https://github.com/biocore/scikit-bio/issues/684 is complete, we'll be able to easily create heatmaps with these classes.
I think a DistanceMatrix would work for this, though semantically they're not distances (large values indicate correlation, not dissimilarity).
Ah, good point. I guess we can figure out the best way to do this when the recipe is ready, since the distance matrix heatmap functionality may not be in an skbio release yet. Maybe just a simple plotting function would suffice to avoid confusion with differences in semantics.
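As a rough illustration of the "simple plotting function" option, a sketch using matplotlib directly might look like the following. The name `plot_score_heatmap` is hypothetical, it takes any square matrix of scores (a list of lists or a NumPy array), and the fixed 0 to 1 color range assumes the scores are normalized.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt


def plot_score_heatmap(scores, ax=None, cmap="viridis"):
    """Draw a square score matrix as a heatmap and return the Axes."""
    if ax is None:
        _, ax = plt.subplots()
    # Assumes scores lie in [0, 1], so colors are comparable across plots.
    image = ax.imshow(scores, cmap=cmap, vmin=0.0, vmax=1.0,
                      interpolation="nearest")
    ax.set_xlabel("Alignment position")
    ax.set_ylabel("Alignment position")
    ax.figure.colorbar(image, ax=ax, label="Normalized mutual information")
    return ax


ax = plot_score_heatmap([[1.0, 0.2], [0.2, 1.0]])
ax.figure.savefig("nmi_heatmap.png")
```

Labeling the colorbar as a score rather than a distance keeps the semantics explicit, which sidesteps the confusion raised above.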