Potential Chimeric Contig in HapHiC Scaffolding

Open Huyuxi08 opened this issue 6 months ago • 7 comments

Thank you so much for developing this great tool!

We used hifiasm to assemble a simple diploid genome using ONT reads, and then applied HapHiC for scaffolding. In our hifiasm assembly, some individual contigs already reached chromosome-scale length.

I ran HapHiC with two rounds of contig correction enabled, using the following command:

haphic pipeline Schq.asm.bp.p_ctg.fa HiC.filtered.bam 9 --threads 10 --processes 9 --max_inflation 20 --correct_nrounds 2

However, HapHiC did not appear to break any chimeric contigs during the process.

In the final Hi-C contact map, I noticed a “cross-shaped blank area” within a single contig. I'm wondering — could this be caused by a chimeric contig? Or is it possibly a centromeric region with low Hi-C signal?

Do you have any suggestions or thoughts on this?

Image

Huyuxi08 avatar May 18 '25 19:05 Huyuxi08

> I'm wondering: could this be caused by a chimeric contig? Or is it possibly a centromeric region with low Hi-C signal?

Actually, it's neither. The issue likely stems from repetitive sequences near the centromere, which cause Hi-C reads to align to multiple locations during mapping. Such reads typically receive a MAPQ of 0, so their signals were removed by the MAPQ >= 1 threshold during BAM filtering.
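To check whether a blank area in the contact map lines up with multi-mapped reads, one option is to look at the share of MAPQ 0 alignments along the contig in the unfiltered BAM. Below is a minimal sketch, not part of HapHiC: the function name `mapq_zero_fraction`, the helper `make_record`, the bin size, and the sample records are all made up for illustration. It relies only on the standard SAM column layout (FLAG, RNAME, POS, and MAPQ in columns 2 to 5) and on the fact that aligners such as BWA-MEM usually assign MAPQ 0 to reads that align equally well to several places.

```python
from collections import defaultdict


def mapq_zero_fraction(sam_lines, bin_size=100_000):
    """Return {(contig, bin_start): fraction of mapped alignments with MAPQ 0}."""
    total = defaultdict(int)
    zero = defaultdict(int)
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag, contig = int(fields[1]), fields[2]
        pos, mapq = int(fields[3]), int(fields[4])
        if flag & 0x4:  # 0x4 = read unmapped
            continue
        # POS is 1-based; convert to the 0-based start of its bin.
        bin_start = (pos - 1) // bin_size * bin_size
        total[(contig, bin_start)] += 1
        if mapq == 0:
            zero[(contig, bin_start)] += 1
    return {key: zero[key] / total[key] for key in total}


def make_record(name, contig, pos, mapq):
    """Build a minimal SAM record with the 11 mandatory columns (hypothetical data)."""
    return "\t".join(
        [name, "0", contig, str(pos), str(mapq), "50M", "*", "0", "0", "*", "*"]
    )


# Made-up records: the second bin mimics a repeat-rich region.
example_sam = [
    "@SQ\tSN:ctg1\tLN:300000",
    make_record("r1", "ctg1", 1_000, 60),
    make_record("r2", "ctg1", 2_000, 30),
    make_record("r3", "ctg1", 150_001, 0),
    make_record("r4", "ctg1", 160_000, 0),
    make_record("r5", "ctg1", 170_000, 40),
]

fractions = mapq_zero_fraction(example_sam)
for (contig, bin_start), frac in sorted(fractions.items()):
    print(contig, bin_start, round(frac, 3))
```

With real data, the lines could come from `samtools view` run on the unfiltered BAM and read from standard input. Bins where most alignments have MAPQ 0 would be consistent with the explanation above, although they do not by themselves prove that the assembly is correct there.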

zengxiaofei avatar May 19 '25 01:05 zengxiaofei

Hi,

Yes, you're right. I did apply MAPQ filtering when processing the BAM file, using the script you provided:

filter_bam HiC.bam 1 --nm 3 --threads 14 | samtools view - -b -@ 14 -o HiC.filtered.bam

Given this situation, do you have any suggestions on how to better handle these repetitive regions near the centromere? I'm wondering if adjusting the filtering threshold or using a different strategy might help.

Thanks again for your help!

Huyuxi08 avatar May 19 '25 07:05 Huyuxi08

When the genome assembler (hifiasm, for example) produces contigs that span these regions, no special handling is usually required, because the missing signal is expected. You would need to pay special attention only when a centromeric region is represented by multiple fragmented contigs that lack Hi-C signal.
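One way to tell the two cases apart is to count how many contigs make up each scaffold in the AGP file written by the scaffolder. The sketch below is only an illustration under stated assumptions: it expects tab-separated AGP 2.x records, treats component types `N` and `U` as gaps, and uses made-up scaffold and contig names. The function name `contigs_per_scaffold` is hypothetical and not part of HapHiC.

```python
from collections import Counter


def contigs_per_scaffold(agp_lines):
    """Count sequence components (contigs) per scaffold in AGP records."""
    counts = Counter()
    for line in agp_lines:
        # Skip blank lines and AGP comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        scaffold, component_type = fields[0], fields[4]
        if component_type in ("N", "U"):  # gap lines carry no contig
            continue
        counts[scaffold] += 1
    return dict(counts)


# Made-up AGP records: group1 is a single contig, group2 joins two contigs.
example_agp = [
    "group1\t1\t1000\t1\tW\tctg1\t1\t1000\t+",
    "group2\t1\t500\t1\tW\tctg2\t1\t500\t+",
    "group2\t501\t600\t2\tU\t100\tscaffold\tyes\tproximity_ligation",
    "group2\t601\t900\t3\tW\tctg3\t1\t300\t-",
]

scaffold_counts = contigs_per_scaffold(example_agp)
for scaffold, n_contigs in sorted(scaffold_counts.items()):
    print(scaffold, n_contigs)
```

A scaffold with a count of 1 consists of a single contig, so a blank cross in its contact map lies inside contiguous assembled sequence. A scaffold built from many short contigs around the blank region would be the case that deserves a closer look.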

zengxiaofei avatar May 19 '25 07:05 zengxiaofei

Thanks a lot!

Just to confirm — do you mean that this assembly can already be considered a valid final scaffolding result?

Also, I was wondering: if I run a tool like TGS-GapCloser to perform gap filling, would that potentially improve the centromeric region or is it unlikely to make much difference there?

Huyuxi08 avatar May 19 '25 07:05 Huyuxi08

I'm not entirely clear about the status of your genome assembly, especially in these centromeric regions. I'm uncertain whether the blue boxes in your Juicebox plot represent single contigs or scaffolds composed of multiple contigs. Additionally, the color range settings in your Juicebox plot aren't optimal, making many details difficult to discern. Therefore, I can't guarantee the complete reliability of your results.

What I can say is:

  • The missing signals you're observing result from mapping difficulties inherent to repetitive sequences - this is very common. It neither indicates assembly errors nor reflects true chromatin interaction strength.

  • Personally, I'm skeptical about the performance of current gap-closing tools in complex repetitive regions. When assembly software doesn't join two contigs, there's either conflicting path information or lack of read support. I'm not sure how gap-closing tools can resolve these issues without incorporating new sequencing data.

zengxiaofei avatar May 19 '25 08:05 zengxiaofei

Thank you very much for your detailed explanation!

The blue boxes in the Juicebox plot represent scaffolds derived from single contigs, not multiple merged ones. This is because the top 9 contigs from the hifiasm assembly already reached chromosome-scale lengths.

I've also re-adjusted the color range settings in Juicebox to improve the contrast and make the details more visible. I hope this helps with better interpretation of the Hi-C map.

Image

Huyuxi08 avatar May 19 '25 08:05 Huyuxi08

It looks good.

zengxiaofei avatar May 19 '25 09:05 zengxiaofei