
weird choice of "best assembly"

Open jflot opened this issue 3 years ago • 3 comments

[image] For this assembly, although the score does increase with each increase in k, the choice made seems particularly poor. Maybe the way the score is calculated is not suited to this type of situation?

jflot, Jan 26 '22 00:01

Unicycler chooses the 'best' assembly using contig count and dead end count, so I see why it made the choice it did. I don't, however, understand what's going on with SPAdes! I.e. why do k-mers 63+ result in such small assemblies?
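
As a rough illustration of how a score built only from contig count and dead end count can favour a tiny graph, here is a minimal sketch. The formula, the `GraphStats` type and all the numbers are assumptions made up for illustration; they are not taken from this thread and may not match what Unicycler does internally.

```python
from dataclasses import dataclass


@dataclass
class GraphStats:
    """Hypothetical per-k summary of an assembly graph."""
    k: int
    contigs: int
    dead_ends: int
    total_length: int  # not used by the score, which is the point


def score(stats: GraphStats) -> float:
    # Assumed scoring rule: fewer contigs and fewer dead ends give a
    # higher score. Total assembly length plays no role here.
    if stats.contigs == 0:
        return 0.0
    return 1.0 / (stats.contigs * (stats.dead_ends + 1) ** 2)


# Invented numbers: a plausible full-size graph at low k and a
# collapsed graph at high k.
candidates = [
    GraphStats(k=27, contigs=500, dead_ends=40, total_length=5_000_000),
    GraphStats(k=95, contigs=20, dead_ends=2, total_length=300_000),
]

best = max(candidates, key=score)
print(f"best k = {best.k}, total length = {best.total_length}")
```

With these invented inputs the small high-k graph wins, because nothing in the score penalises a loss of assembled sequence.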

If this is with the current version of Unicycler (v0.5.0), the raw SPAdes graphs should be in the output (prefixed with 001). They might shed some light on this. My hunch is that there is something weird/wrong with this read set - contamination maybe? But I don't know!
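
To compare the per-k graphs quickly, something like the following sketch could summarise each GFA file. It assumes the graphs are plain GFA v1 text with `S` and `L` lines; the output directory name `unicycler_out` and the `001_*.gfa` pattern are guesses based on the "prefixed with 001" description above, and the dead end definition used here (a segment end with no link) may differ from Unicycler's own.

```python
from pathlib import Path


def summarise_gfa(lines):
    """Return (segment_count, total_length, dead_end_count) for GFA v1 lines."""
    lengths = {}
    used_ends = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 3:
            name, seq = fields[1], fields[2]
            if seq != "*":
                lengths[name] = len(seq)
            else:
                # Sequence omitted: fall back to the LN:i: tag if present.
                ln = [f for f in fields[3:] if f.startswith("LN:i:")]
                lengths[name] = int(ln[0][5:]) if ln else 0
        elif fields[0] == "L" and len(fields) >= 5:
            src, src_orient, dst, dst_orient = fields[1:5]
            # A link leaves the end of a forward segment (or the start of
            # a reversed one) and enters the start of a forward segment
            # (or the end of a reversed one).
            used_ends.add((src, "end" if src_orient == "+" else "start"))
            used_ends.add((dst, "start" if dst_orient == "+" else "end"))
    dead_ends = sum(
        (name, side) not in used_ends
        for name in lengths
        for side in ("start", "end")
    )
    return len(lengths), sum(lengths.values()), dead_ends


if __name__ == "__main__":
    # Hypothetical output directory and file pattern.
    for path in sorted(Path("unicycler_out").glob("001_*.gfa")):
        with open(path) as handle:
            segments, total, dead = summarise_gfa(handle)
        print(f"{path.name}\tsegments={segments}\tlength={total}\tdead_ends={dead}")
```

A sharp drop in total length between two consecutive k values would show where the graph starts to fall apart.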

Ryan

rrwick, Jan 28 '22 04:01

I am facing the same issue. The final assembly selected by Unicycler is quite small (~300K).

Is there a way to use other metrics to select the best assembly?

tamascogustavo, Oct 18 '22 14:10

> I don't, however, understand what's going on with SPAdes! I.e. why do k-mers 63+ result in such small assemblies?

The answer is quite simple. The input reads are likely only 75 bp long. So, once k goes past about half of the read length, the graph is just a pile of barely connected reads, which are removed by the graph simplification algorithms.
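
The arithmetic behind this can be sketched as follows. A read of length L contains L - k + 1 k-mers, so k-mer coverage is roughly base coverage times (L - k + 1) / L. The 75 bp read length is the guess made above, and the 50x base coverage is an invented figure for illustration only.

```python
def kmers_per_read(read_length: int, k: int) -> int:
    """Number of k-mers in one read; zero when k exceeds the read length."""
    return max(read_length - k + 1, 0)


def kmer_coverage(base_coverage: float, read_length: int, k: int) -> float:
    """Approximate k-mer coverage for a given base coverage and read length."""
    return base_coverage * kmers_per_read(read_length, k) / read_length


READ_LENGTH = 75       # assumed, per the guess above
BASE_COVERAGE = 50.0   # invented for illustration

for k in (27, 47, 63, 75, 95):
    n = kmers_per_read(READ_LENGTH, k)
    cov = kmer_coverage(BASE_COVERAGE, READ_LENGTH, k)
    print(f"k={k:3d}  k-mers per read={n:2d}  k-mer coverage={cov:5.1f}")
```

At k = 63 each 75 bp read contributes only 13 k-mers, and above k = 75 it contributes none, so the graph at high k has little to hold it together.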

asl, Oct 18 '22 14:10