Ragout
two small potential enhancements
Hello,
I was wondering if ragout can implement the following two potential enhancements:
- be able to sort the output contigs/scaffolds based on the chromosome/scaffold order (and maybe also orientation) of the input reference assembly. I think in the current implementation, the chromosomes/scaffolds of the input reference assembly are re-sorted by Ragout. For example, when the chromosomes of the input reference are named "chrI chrII chrIII chrIV chrV chrVI chrVII chrVIII chrIX chrX chrXI chrXII chrXIII chrXIV chrXV chrXVI", Ragout orders chrIX before chrV. So users have to manually adjust the contig/scaffold ordering of the Ragout output for the best comparison with the input reference assembly.
- provide an option for the user to decide how many 'N's to use for representing assembly gaps. I guess Ragout currently uses a stretch of 15 Ns to represent assembly gaps. It might be better if the user could have direct control of this.
I think both of them should be relatively easy to implement. Thanks for your consideration!
Best, Jia-Xing
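To illustrate the first request, here is a minimal Python sketch. It assumes the chrIX-before-chrV ordering comes from plain string sorting of the names, which matches the behaviour described above but is not confirmed from the Ragout source. The function name `order_by_reference` is hypothetical and not part of Ragout.

```python
# Hypothetical helper, not part of Ragout: reorder scaffold names so they
# follow the order in which the chromosomes appear in the reference.
def order_by_reference(scaffold_names, reference_order):
    # Map each reference name to its position in the reference.
    rank = {name: i for i, name in enumerate(reference_order)}
    # Names missing from the reference are placed last, sorted by name.
    return sorted(scaffold_names, key=lambda n: (rank.get(n, len(rank)), n))


reference_order = [
    "chrI", "chrII", "chrIII", "chrIV", "chrV", "chrVI", "chrVII", "chrVIII",
    "chrIX", "chrX", "chrXI", "chrXII", "chrXIII", "chrXIV", "chrXV", "chrXVI",
]

# Plain string sorting (assumed cause of the issue) puts chrIX before chrV.
lexicographic = sorted(reference_order)

# Sorting by reference position restores the original order.
restored = order_by_reference(lexicographic, reference_order)
```

The same key could be applied to the records of the output FASTA file as a post-processing step until the option is available.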
Hi Jia-Xing,
Thank you, and sorry for the late response. The first suggestion definitely looks like a good idea; I will add it to the next version. As for the second suggestion: the gap sizes are currently estimated using information from the reference, so their size is not fixed. However, there is a minimum number of Ns that Ragout generates (11 by default), which is why you may see many gaps of that length. If you want to change this parameter, you can modify the "min_scaffold_gap" setting in the "ragout/shared/config.py" file.
Best, Mikhail
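For readers following along, the effect of such a minimum can be sketched in Python as below. This is only an illustration of the behaviour described above, not Ragout's actual code: the setting name and the default of 11 come from the reply, while `MIN_SCAFFOLD_GAP` and `gap_fill` are hypothetical names.

```python
# Illustrative only: mirrors the described behaviour, not Ragout's code.
MIN_SCAFFOLD_GAP = 11  # default minimum mentioned in the reply above


def gap_fill(estimated_gap, min_gap=MIN_SCAFFOLD_GAP):
    # Use the reference-based estimate, but never emit fewer than min_gap Ns.
    return "N" * max(estimated_gap, min_gap)


short_gap = gap_fill(3)    # estimate below the minimum, padded to 11 Ns
long_gap = gap_fill(250)   # estimate above the minimum, kept as is
```

Under this reading, raising the minimum lengthens only the gaps whose estimate falls below it; larger estimated gaps stay unchanged.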
Hi Mikhail,
Thank you very much for the explanation! It is very helpful!
Best, Jia-Xing