Question: dump sequences of hetmers
Hi,
Thanks for developing smudgeplot, I've found it to be very useful. I would like to do a deeper analysis of the results, which requires the sequences of each A/B hetmer (or a representative subsample) that smudgeplot finds. Is there an option or switch that will dump this output? Apologies if it has already been asked, I searched the issues but could not find anything.
Jared
Hi Jared,
Thanks for the kind words, really appreciated.
This is a feature we are working on with @thegenemyers.
If you have a smaller dataset you would like to try it on and do not want to wait for the newer version, we had this implemented in some older versions (v0.2.5 was the last one). There was a bit of a bug in the search, though: we did not search both the canonical and non-canonical representations of each k-mer, which means that the set of k-mer pairs is not guaranteed to be complete, or to always represent unique k-mer pairs (empirically, we know it is a relatively sensible proxy, though).
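To illustrate the canonical/non-canonical issue, here is a minimal sketch in plain Python. It is not smudgeplot's code; the function names (`reverse_complement`, `canonical`, `differs_by_one`) and the toy 5-mers are made up for this example, and it assumes "canonical" means the lexicographically smaller of a k-mer and its reverse complement. It shows two k-mers that differ by a single base, yet whose canonical forms differ by two, so a search that only compares canonical strings would miss the pair.

```python
# Hypothetical helper names; this is an illustration, not smudgeplot's implementation.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Return the reverse complement of a DNA k-mer (uppercase ACGT only)."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Assumed convention: the lexicographically smaller of the two strands."""
    return min(kmer, reverse_complement(kmer))


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def differs_by_one(a: str, b: str, both_orientations: bool = True) -> bool:
    """True if a and b differ by exactly one base.

    With both_orientations=True, b is also tried as its reverse complement,
    which is the step the comment above describes as missing.
    """
    if hamming(a, b) == 1:
        return True
    return both_orientations and hamming(a, reverse_complement(b)) == 1


# Toy pair: GATAC and GTTAC differ only at the second base.
x, y = "GATAC", "GTTAC"
cx, cy = canonical(x), canonical(y)  # "GATAC" and "GTAAC"

print(hamming(x, y))                                  # 1
print(hamming(cx, cy))                                # 2
print(differs_by_one(cx, cy, both_orientations=False))  # False: pair missed
print(differs_by_one(cx, cy, both_orientations=True))   # True: pair found
```

The substitution flips which strand is canonical for the second k-mer, which is why comparing stored canonical strings alone is not enough.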
Thanks for the fast reply. The genome I am working on is quite large, but I may try v0.2.5 anyway. Please do let me know when this is re-implemented and I will try the new version too.
Jared
Hi @jts, we do have a working prototype ... I mean a kind-of-working prototype (the interface is not pretty, but it generates lists of k-mers from individual smudges). Is that still of interest to you? If so, I will try to write up some documentation.