Question: dump sequences of hetmers

Open jts opened this issue 11 months ago • 3 comments

Hi,

Thanks for developing smudgeplot, I've found it to be very useful. I would like to do a deeper analysis of the results, which requires the sequences of each A/B hetmer (or a representative subsample) that smudgeplot finds. Is there an option or switch that will dump this output? Apologies if it has already been asked, I searched the issues but could not find anything.

Jared

jts avatar Feb 03 '25 21:02 jts

Hi Jared,

Thanks for the kind words, really appreciated.

This is a feature we are working on with @thegenemyers.

If you have a smaller dataset you would like to try it with and do not want to wait for the newer version, we had this implemented in some older versions (v0.2.5 was the last one). There was a bit of a bug in the search, however: we did not search both the canonical and non-canonical representations of each k-mer, which means that the set of k-mer pairs is not guaranteed to be complete, or to always represent unique k-mer pairs (empirically, we know it is a relatively sensible proxy, though).
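To illustrate why searching only one representation can miss pairs, here is a minimal Python sketch. It is not smudgeplot's code: the helper names are hypothetical, and it assumes a k-mer pair means two k-mers exactly one nucleotide apart and that the canonical form is the lexicographically smaller of a k-mer and its reverse complement.

```python
# Illustrative sketch only; not taken from smudgeplot.
# Assumption: canonical form = lexicographic minimum of a k-mer and its
# reverse complement; a "pair" = two k-mers differing at exactly one base.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(kmer: str) -> str:
    """Return the reverse complement of a DNA k-mer."""
    return kmer.translate(COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Return the canonical representation of a k-mer."""
    return min(kmer, revcomp(kmer))


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def one_mismatch_apart(a: str, b: str, both_strands: bool) -> bool:
    """Check whether a and b form a pair, optionally also trying b's reverse complement."""
    if hamming(a, b) == 1:
        return True
    return both_strands and hamming(a, revcomp(b)) == 1


# "CAT" and "AAT" differ by one base on the same strand, but their
# canonical forms ("ATG" and "AAT") differ by two.
a, b = canonical("CAT"), canonical("AAT")
found_canonical_only = one_mismatch_apart(a, b, both_strands=False)
found_both_strands = one_mismatch_apart(a, b, both_strands=True)
print(a, b, found_canonical_only, found_both_strands)
```

A search that compares only the stored canonical strings misses this pair, while also trying the reverse complement recovers it.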

KamilSJaron avatar Feb 03 '25 22:02 KamilSJaron

Thanks for the fast reply. The genome I am working on is quite large, but I may try v0.2.5 anyway. Please do let me know when this is re-implemented and I will try the new version too.

Jared

jts avatar Feb 04 '25 15:02 jts

Hi @jts, we do have a working prototype ... I mean a kind of working prototype (the interface is not pretty, but it generates lists of k-mers from individual smudges). Is that still of interest to you? If so, I will try to write up some documentation.

KamilSJaron avatar Mar 11 '25 20:03 KamilSJaron