add scaffolder metrics
That's a good point.
A good metric that Ray could produce to start with would be the number of pairs (including mates) with:
1. both ends within the same contig;
2. one end on one contig and the other end on another contig;
3. one end on a contig and the other end not mapped;
4. both ends not mapped.
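To make the four categories concrete, here is a minimal sketch of how such a tally could be computed. It is not Ray's code: the function names, the category labels, and the input format (one tuple per pair holding the contig name each end maps to, or `None` when that end is unmapped) are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical category labels, in the same order as the list above.
CATEGORIES = ("same_contig", "different_contigs", "one_unmapped", "both_unmapped")


def classify_pair(contig_a: Optional[str], contig_b: Optional[str]) -> str:
    """Classify one pair from the contig each end maps to (None = not mapped)."""
    if contig_a is None and contig_b is None:
        return "both_unmapped"
    if contig_a is None or contig_b is None:
        return "one_unmapped"
    if contig_a == contig_b:
        return "same_contig"
    return "different_contigs"


def pair_metrics(pairs: Iterable[Tuple[Optional[str], Optional[str]]]) -> Dict[str, int]:
    """Count how many pairs fall into each of the four categories."""
    counts = {name: 0 for name in CATEGORIES}
    for contig_a, contig_b in pairs:
        counts[classify_pair(contig_a, contig_b)] += 1
    return counts


if __name__ == "__main__":
    # Made-up placements, only to exercise the function.
    example = [("c1", "c1"), ("c1", "c2"), ("c1", None), (None, "c3"), (None, None)]
    for name, count in pair_metrics(example).items():
        print(name, count)
```

A large count in the last two categories relative to the first two would support the idea that many pairs never get placed.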
You suggest that a sizable share of the pairs (including mates) fall into categories 3 and 4 (one end not mapped, or both ends not mapped) when using a k-mer length of 61-91. That's likely.
I think this is probably because mate pairs usually also include an adapter, which consumes precious space in the sequences and leaves fewer k-mers of that length per read.
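A rough way to see the effect is to count the k-mers a read can still contribute once adapter bases are taken out. The sketch below assumes, as a simplification, that the adapter simply shortens the read by a fixed number of bases; the function name and the read and adapter lengths are hypothetical values chosen for illustration, not measurements.

```python
def usable_kmers(read_length: int, adapter_length: int, k: int) -> int:
    """Number of k-mers left in a read after removing adapter bases.

    A sequence of n bases contains n - k + 1 k-mers when n >= k, else none.
    """
    remaining = read_length - adapter_length
    return max(0, remaining - k + 1)


if __name__ == "__main__":
    # Hypothetical 100-base read with a 30-base adapter.
    for k in (31, 61, 91):
        print(k, usable_kmers(100, 0, k), usable_kmers(100, 30, k))
```

Under these assumed lengths, a read that would yield 40 k-mers at k = 61 yields only 10 once the adapter is accounted for, and none at k = 91.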
For the time being, I believe that "use another scaffolder" is your best bet.
Speaking of scaffolders, I will soon (hopefully) fix the speed issue that repeated k-mers cause when scaffolding large genomes [1].