Inquiry Regarding Genome Assembly with ONT Data: Quality Comparisons and Recommendations

Open Bank-tidy opened this issue 6 months ago • 3 comments

Dear Developer,

I hope this message finds you well. First, I would like to thank you sincerely for developing this remarkable software, which has been an invaluable tool in my research. I have a question about some results that I hope you can help me understand.

I recently ran two assemblies, each combining HiFi data with Oxford Nanopore Technologies (ONT) data: one used all available ONT reads, and the other used only ONT reads longer than 100 kb. The resulting assemblies were then aligned to a reference genome and named all_ont_genome and 100k_ont_genome, respectively.
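For reference, the length filter can be sketched as below. This is only an illustration, assuming uncompressed 4-line FASTQ records and a strict "longer than 100 kb" cutoff; the function names `read_fastq` and `filter_longer_than` are hypothetical and are not part of hifiasm or of the exact commands I ran.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from 4-line FASTQ input (assumption: no wrapped sequences)."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it, "").rstrip("\n")
        next(it, "")  # '+' separator line, ignored
        qual = next(it, "").rstrip("\n")
        yield header, seq, qual


def filter_longer_than(lines: Iterable[str], threshold: int = 100_000) -> Iterator[Record]:
    """Keep only reads whose sequence is strictly longer than `threshold` bases."""
    for header, seq, qual in read_fastq(lines):
        if len(seq) > threshold:
            yield header, seq, qual
```

In practice the input would be an open file handle for the ONT FASTQ, and the kept records would be written back out in the same 4-line layout before being passed to the assembler.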

Upon analysis, I observed that the all_ont_genome had a contig N50 of 60 Mb with 8 gaps, whereas the 100k_ont_genome had a contig N50 of 67 Mb with only 3 gaps. This suggests that the 100k_ont_genome is the better assembly. I find this puzzling, because the ONT data used for the all_ont_genome also includes every read over 100 kb, yet that assembly is more fragmented.
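To be explicit about the metric, I am assuming the usual definition of N50: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the total assembly length. A minimal sketch of that calculation follows; the contig lengths in the usage note are made up for illustration and are not taken from either assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")
```

For example, `n50([2, 3, 4, 5, 6])` returns 5: the total is 20, and the two longest contigs (6 + 5 = 11) are the first to reach half of it.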

Could you kindly provide some insights into this observation? Additionally, what would be your recommended approach for assembly in such a scenario?

Thank you for your time and assistance. I eagerly await your response.

Best regards!

Bank-tidy, Dec 19 '23 08:12