3d-dna

FINAL.fasta is twice the size of my initial contigs

Open wuxiaopei0509 opened this issue 5 years ago • 8 comments

Hi, I used the juicer and 3d-dna pipeline to scaffold my initial assembly. However, the resulting FINAL.fasta is about twice the size of my initial genome, and the number of scaffolds is also about twice that of my initial genome. What's the problem? Thank you!

wuxiaopei0509, Dec 20 '19 13:12

Hello, I also encountered this problem. Did you solve it? If so, can you tell me how? Thanks

DongnaMa, Jun 03 '20 11:06

Hello DongnaMa and wuxiaopei0509,

This is not a sufficiently detailed description of a problem for me to be able to meaningfully comment. Please submit the exact command as well as the out and err streams.

Thank you, Olga

dudcha, Jun 03 '20 14:06

Hi Olga, thank you for your prompt reply. I first ran the command: juicer.sh -g THS -s MboI -z reference/THS.polish.fasta -y reference/THS.polish_MboI.txt -p reference/THS.polish.chrom.size -D /home/stu_madongna/software/juicer-master -t 80, and got the merged_nodups.txt file. Next, I ran the command: run-asm-pipeline.sh -r 2 THS.polish.fasta aligned/merged_nodups.txt, and the resulting FINAL.fasta is about twice the size of my initial genome. I also tried the parameter -r 0. The FINAL.fasta was then about the same size as the initial genome, but nothing seemed to have changed: the initial genome has 2131 contigs, the FINAL.fasta has 2132 contigs, and the N50 has not changed.
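The contig counts and N50 values quoted above can be checked with a short script. The following is only a minimal sketch in plain Python, not part of juicer or 3d-dna; the function names `read_lengths`, `n50`, and `summarize` are made up for illustration, and the script assumes uncompressed FASTA input.

```python
def read_lengths(path):
    """Return the length of every sequence in an uncompressed FASTA file."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0  # empty input


def summarize(path):
    """Return (number of sequences, total length, N50) for one FASTA file."""
    lengths = read_lengths(path)
    return len(lengths), sum(lengths), n50(lengths)
```

Calling `summarize` on the input assembly and on the FINAL.fasta, and printing the two tuples side by side, shows whether the sequence count, total length, or N50 actually differ between the two files.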

DongnaMa, Jun 04 '20 01:06

Thanks, I think I have found my problem: I had not installed the parallel software.

DongnaMa, Jun 04 '20 01:06

Hey DongnaMa,

At some point we stopped supporting the non-parallel version, so it may cause problems, though if there are no error messages I imagine that is not the cause. When you say the fasta is twice as large, the only interpretation I can think of is that a lot of Ns were added to a small genome while the actual total number of sequenced bases stayed the same; at least that is what should happen when running in haploid mode. Regardless, looking at the final fasta is not a good strategy. The recommended way is to look at the .hic files. Check out the genome assembly cookbook at dnazoo.org/methods for an overview and some recommendations. Also consider checking aidenlab.org/forum.html for relevant discussions with users.

Hope this helps, Olga
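One rough way to test the interpretation given in this reply (same sequenced bases, extra Ns) is to count N and non-N characters in both FASTA files. The sketch below is an illustration only, not a 3d-dna tool; the function names `base_composition` and `same_sequenced_bases` are hypothetical, and it assumes uncompressed FASTA files in which gaps are written as N or n.

```python
def base_composition(path):
    """Count sequences, total length, gap (N/n) characters and non-gap bases."""
    n_seqs = total = gaps = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                n_seqs += 1
            else:
                total += len(line)
                gaps += line.count("N") + line.count("n")
    return {"sequences": n_seqs, "total": total, "gaps": gaps, "bases": total - gaps}


def same_sequenced_bases(before, after):
    """Return True if both FASTA files contain the same number of non-N bases."""
    return base_composition(before)["bases"] == base_composition(after)["bases"]
```

If the non-N base count of the FINAL.fasta matches that of the input assembly, the size difference comes from gap characters alone. If it is roughly double, actual sequence has been duplicated and the run itself needs a closer look.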

dudcha, Jun 04 '20 05:06

Hi Olga, I reran the pipeline with the parallel version in haploid mode. The result is still the same, with no errors. My initial genome size is 900 Mb, with 2131 contigs and a contig N50 of 7 Mb. Could this input assembly affect how the 3d-dna pipeline runs? DongnaMa

DongnaMa, Jun 04 '20 07:06

DongnaMa, Please see the previous message. -Olga

dudcha, Jun 04 '20 07:06

Hi DongnaMa, I have the same question. Have you figured it out? Would you mind sharing your method with us? Thanks so much.

GitHub-Lujianjun, Sep 27 '21 06:09