Racon for low heterozygote diploid genome

Open ghost opened this issue 6 years ago • 11 comments

Hello, is Racon expected to work for such a genome? I ran it using paired-end Illumina reads mapped with bwa-mem2:

racon -t 48 illumina.fq mapped.sam raw.fasta > racon.polished.fasta

The resulting FASTA has 3 fewer contigs. However, the number of het sites increased. Before Racon: 1150321. After Racon: 1208714.

Any idea why this might be so? My genome is expected to have a heterozygosity of 2-3%. I realise Racon might not be suitable for such a case. Is Racon able to handle diploid sites?

Thank you. EDIT: I realise it might be a duplicate of issue #135

ghost avatar Aug 24 '19 22:08 ghost

Hello, are the 3 missing contigs small? Sometimes due to lack of coverage some smaller contigs drop out after polishing, but you can retain them with option -u (--include-unpolished). Which assembler did you use to get the raw assembly? How are you getting the information about Het sites? Can you explain how this issue is a duplicate of #135?

Best regards, Robert

rvaser avatar Aug 26 '19 07:08 rvaser
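To see which contigs dropped out and how long they were, something along these lines could be used. This is only a sketch: the function name `missing_contigs` is made up for illustration, and it assumes the first whitespace-delimited token of each FASTA header is the contig name in both files.

```shell
#!/bin/sh
# List contigs that are in the raw assembly but absent after polishing,
# together with their lengths (tab-separated: name, length).
# Assumption: the first token of each header line is the contig name.
# Example call: missing_contigs raw.fasta racon.polished.fasta
missing_contigs() {
    raw=$1
    polished=$2
    awk '
        # First file (polished): remember every contig name.
        FNR == NR { if (/^>/) { sub(/^>/, "", $1); seen[$1] = 1 } next }
        # Second file (raw): start a new contig on each header line.
        /^>/ { sub(/^>/, "", $1); name = $1; order[++n] = name; len[name] = 0; next }
        # Sequence lines: add to the length of the current contig.
        { len[name] += length($0) }
        END {
            for (i = 1; i <= n; i++)
                if (!(order[i] in seen)) print order[i] "\t" len[order[i]]
        }
    ' "$polished" "$raw"
}
```

If the listed contigs turn out to be short, rerunning Racon with -u should keep them in the output unpolished.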

Oh, I am really sorry, it was issue #134. Yes, it seems they are small ones. Here is more information about the assembly:

The assembly was obtained using Flye with ONT (60x coverage) and PacBio (140x coverage) reads, then scaffolded with a 3C scaffolder (instaGRAAL). For the het sites, I mapped reads with bwa-mem2 and called variants with DeepVariant. I think the sequencing quality of the Illumina reads I used is all right (see the plot: Rplot_illuminaBQ_ARCancestor).

ghost avatar Aug 26 '19 09:08 ghost
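For reference, one way a het-site count like the ones quoted above could be reproduced from a variant caller's output is sketched below. The thread does not say how the count was actually obtained, so this is an assumption: the function name `count_het_sites` is hypothetical, the input is taken to be an uncompressed single-sample VCF, and every record whose GT field has two different called alleles is counted, regardless of the FILTER column.

```shell
#!/bin/sh
# Count heterozygous genotype calls in an uncompressed single-sample VCF.
# Assumptions: sample genotypes are in column 10, and a site counts as het
# when GT has exactly two called alleles that differ (e.g. 0/1, 1|2).
# Example call: count_het_sites calls.vcf   (calls.vcf is a placeholder name)
count_het_sites() {
    awk -F '\t' '
        /^#/ { next }                          # skip meta and header lines
        {
            gt = ""
            split($9, fmt, ":")                # FORMAT keys
            split($10, smp, ":")               # values for the first sample
            for (i in fmt) if (fmt[i] == "GT") gt = smp[i]
            n = split(gt, a, "[/|]")           # unphased (/) or phased (|)
            if (n == 2 && a[1] != "." && a[2] != "." && a[1] != a[2]) het++
        }
        END { print het + 0 }
    ' "$1"
}
```

Running the same count on calls made against the raw and the polished assembly, with identical mapping and calling settings, keeps the before/after comparison like for like.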

Can you please copy the commands you used in running Racon with Illumina data (both mapping and Racon step)?

rvaser avatar Aug 26 '19 11:08 rvaser

./bwa-mem2 mem -t 48 -p raw.fasta illumina.fq > mapped.sam
racon -t 48 illumina.fq mapped.sam raw.fasta > racon.polished.fasta

ghost avatar Aug 26 '19 11:08 ghost

Looks alright. Not sure what to tell you here :/ You can try with minimap2 -ax sr and check whether it is better or not.

rvaser avatar Aug 26 '19 11:08 rvaser

okay I will do that. I will report to you (I am in a bottleneck situation on the cluster, so it might take a few days).

ghost avatar Aug 26 '19 11:08 ghost

Hello, so I did the polishing with minimap2 and the same thing is happening. Before: 1435144. After: 1847183.

ghost avatar Aug 28 '19 10:08 ghost

Not sure what to tell you here.

rvaser avatar Aug 28 '19 10:08 rvaser

Well, it's a notoriously hard-to-assemble genome I am dealing with. So I am not so surprised that "regular" tools give unexpected results. If I find the cause I will let you know.

ghost avatar Aug 28 '19 10:08 ghost

Did you by any chance try Pilon in the same setting?

rvaser avatar Aug 28 '19 10:08 rvaser

That's the next step

ghost avatar Aug 28 '19 10:08 ghost