two small potential enhancements

Open yjx1217 opened this issue 7 years ago • 2 comments

Hello,

I was wondering if ragout can implement the following two potential enhancements:

  1. be able to sort the output contigs/scaffolds based on the chromosome/scaffold order (and maybe also orientation) of the input reference assembly. I think in the current implementation, the chromosomes/scaffolds of the input reference assembly are re-sorted by ragout. For example, when the chromosomes of the input reference are named "chrI chrII chrIII chrIV chrV chrVI chrVII chrVIII chrIX chrX chrXI chrXII chrXIII chrXIV chrXV chrXVI", ragout currently orders chrIX before chrV. So users have to manually adjust the contig/scaffold ordering of the ragout output for the best comparison with the input reference assembly.
  2. provide an option for the user to decide how many 'N's to use for representing assembly gaps. I guess ragout currently uses a stretch of 15 Ns to represent assembly gaps. It might be better if the user could have direct control of this.

I think both of them should be relatively easy to implement. Thanks for your consideration!
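To illustrate the first request: plain string sorting places chrIX before chrV, whereas ranking names by their position in the input reference keeps the reference order. The snippet below is only a minimal sketch of that idea; `sort_by_reference_order` is a hypothetical helper, not part of Ragout.

```python
# Hypothetical helper, NOT Ragout code: order names by their position
# in the input reference instead of by plain string comparison.
def sort_by_reference_order(scaffold_names, reference_order):
    rank = {name: i for i, name in enumerate(reference_order)}
    # Names absent from the reference are placed last; sorted() is stable,
    # so they keep their relative order.
    return sorted(scaffold_names, key=lambda name: rank.get(name, len(rank)))


reference_order = ["chrI", "chrII", "chrIII", "chrIV", "chrV",
                   "chrVI", "chrVII", "chrVIII", "chrIX", "chrX"]

# Plain string sorting: "chrIX" ends up ahead of "chrV".
lexicographic = sorted(reference_order)

# Ranking by reference position restores the original order.
by_reference = sort_by_reference_order(lexicographic, reference_order)
```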

Best, Jia-Xing

yjx1217 • Apr 21 '17 12:04

Hi Jia-Xing,

Thank you, and sorry for the late response. The first suggestion definitely looks like a good idea, I will add this to the next version. For the second suggestion - the gap sizes are currently estimated using the information from the reference, so their size is not fixed. However, there is a minimum number of Ns that Ragout generates (11 by default), which is why you may see many gaps of that size. If you want to change this parameter, you may modify the "min_scaffold_gap" setting in the "ragout/shared/config.py" file.
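As a rough illustration of the behaviour described above (an estimated gap size with a lower bound), here is a minimal sketch. It assumes the minimum is applied as a simple floor on the estimate; `MIN_SCAFFOLD_GAP` only mirrors the "min_scaffold_gap" setting by name, and `gap_fill` is a hypothetical function, not Ragout's actual code.

```python
# Illustrative only: the constant mirrors the "min_scaffold_gap" default (11)
# mentioned above; gap_fill is a hypothetical function, NOT Ragout's code.
MIN_SCAFFOLD_GAP = 11


def gap_fill(estimated_gap, min_gap=MIN_SCAFFOLD_GAP):
    """Return the run of Ns for a gap, never shorter than min_gap."""
    return "N" * max(estimated_gap, min_gap)


short_gap = gap_fill(3)    # estimate below the minimum: padded up to min_gap
long_gap = gap_fill(250)   # estimate above the minimum: used as-is
```

Under this assumption, every gap estimated at fewer than 11 bases would appear as the same minimum-length stretch of Ns, which would explain seeing many gaps of identical length.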

Best, Mikhail

fenderglass • Jun 07 '17 21:06

Hi Mikhail,

Thank you very much for the explanation! It is very helpful!

Best, Jia-Xing

yjx1217 • Jun 08 '17 07:06