pyScaf: Fewer BUSCO genes after scaffolding.

Hi,

I would like to give some feedback on scaffolding my assembly (Sanger technology) with PacBio reads (30x coverage) using pyScaf.

pyScaf is fast and, at first sight, gives interesting results. I went from 2,059 scaffolds to 1,344 scaffolds, which was encouraging. I then ran BUSCO on both assemblies and got the following results:

Complete BUSCO genes dropped from 95.6% for my assembly (before pyScaf) to 78.7% after pyScaf. Missing genes rose from 37 before scaffolding to 284 after pyScaf.

I launched pyScaf with these parameters: pyScaf.py -f Scaffolds.fasta --identity 0.80 -o Scaffolds.pyScaf.fasta -t 10 --log pyScaf_run.log --longreads all_raw_reads.Pacbio.fasta

Maybe I need to change them? Do you have any advice for me?

Feb 13 '18 12:02 a-velt

Hi, this is probably the same problem as the one mentioned in issue #3:

Additionally, there might be some over-scaffolding: many contigs that appeared to have a large overlap were linked directly (without any check of whether the contigs actually overlapped).

In this example (.tsv output of a long read scaffolding run), a 2.4 Mb scaffold and a 3.3 Mb scaffold are merged into a 3.3 Mb scaffold. 2.4 Mb of non-redundant sequence is lost in the process.

scaffold00018 3324699 2 scaffold31_size2472606 scaffold20_size3324684 1 0 -3065490 0
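To make the loss in that row concrete, here is a minimal Python sketch that estimates how many bases disappear in a merge. The column layout is an assumption inferred from the row itself, not confirmed against pyScaf: scaffold name, scaffold size, number of contigs, then the contig names, their orientations and the gap sizes. The helper names `parse_scaffold_row` and `lost_bases` are hypothetical, and the input lengths are read from the `_size<N>` suffix of the contig names, which only works for names of that form.

```python
import re


def parse_scaffold_row(row):
    """Split one whitespace-separated row into its ASSUMED fields:
    name, size, contig count, contig names, orientations, gap sizes."""
    tokens = row.split()
    name, size, n = tokens[0], int(tokens[1]), int(tokens[2])
    contigs = tokens[3:3 + n]
    orientations = [int(t) for t in tokens[3 + n:3 + 2 * n]]
    gaps = [int(t) for t in tokens[3 + 2 * n:3 + 3 * n]]
    return name, size, contigs, orientations, gaps


def lost_bases(row):
    """Total input length minus output scaffold length.

    Input lengths come from the "_size<N>" suffix of each contig name
    (an assumption about the naming scheme). Gap padding is ignored.
    """
    _, size, contigs, _, _ = parse_scaffold_row(row)
    input_total = sum(int(re.search(r"_size(\d+)$", c).group(1)) for c in contigs)
    return input_total - size


row = ("scaffold00018 3324699 2 scaffold31_size2472606 "
       "scaffold20_size3324684 1 0 -3065490 0")
print(lost_bases(row))  # 2472591
```

Running this over every row of the .tsv and sorting by the result should point to the merges that remove the most sequence, which are the ones worth checking against the BUSCO genes that went missing.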

Jun 09 '18 23:06 hgdarras

Hi!

Yes, I found the problem! I used OPERA to scaffold my Sanger assembly with PacBio reads and saw that OPERA merged some contigs, which caused this problem with BUSCO. Since OPERA generates a file with the scaffolding information, I wrote a script to perform "manual" scaffolding without merging my contigs, and it works perfectly! The BUSCO results are very good after that. If anyone encounters this kind of problem with OPERA, contact me and I will provide my script.

Thank you, Amandine
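For readers who want to try the same idea, here is a minimal sketch of scaffolding without merging. It is not Amandine's script and it does not read OPERA's output file. It assumes the contig order, orientation and estimated gap have already been extracted into a plain Python list, and the names `join_without_merging`, `layout` and `min_gap` are hypothetical. Contigs are joined with runs of N, and a predicted overlap (negative gap) is replaced by a short N spacer instead of collapsing sequence.

```python
# Translation table for complementing DNA (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def join_without_merging(contigs, layout, min_gap=100):
    """Build one scaffold from ordered contigs, never merging overlaps.

    contigs: dict mapping contig name -> sequence
    layout:  list of (contig_name, is_reverse, gap_after) tuples;
             gap_after of the last contig is ignored
    min_gap: spacer length used when the estimated gap is smaller,
             including negative gaps (predicted overlaps)
    """
    parts = []
    for i, (name, is_reverse, gap_after) in enumerate(layout):
        seq = contigs[name]
        parts.append(reverse_complement(seq) if is_reverse else seq)
        if i < len(layout) - 1:
            # Keep both contigs whole; separate them with Ns.
            parts.append("N" * max(gap_after, min_gap))
    return "".join(parts)


# Toy example: a predicted 3 bp overlap becomes a 2 bp spacer.
toy_contigs = {"a": "ACGT", "b": "AAGG"}
toy_layout = [("a", False, -3), ("b", True, 0)]
print(join_without_merging(toy_contigs, toy_layout, min_gap=2))  # ACGTNNCCTT
```

Because no contig sequence is ever removed, every gene present in the input assembly is still present in the output, which is consistent with the BUSCO score recovering.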

Jun 11 '18 07:06 a-velt

Hi Amandine @a-velt

I am facing the same problem now. Could you share your script with me?

Thanks in advance, Guangshuo

Jan 24 '22 21:01 liguangshuo
