yahs
Unexpected Zero Read Pairs and Empty Output with yahs and the Juicer Preprocessing Pipeline
Hello Developers,
I have been utilizing yahs for Hi-C data processing followed by Juicer's preprocessing pipeline. My command for running yahs was:
yahs /NGS/Fungi/Rc/juicer/references/R0301HifiHicOnt.asm.hic.hap2.p_ctg.fa R0301onthap2.sort.bam -e GATC
During the execution of yahs, it recognized and logged the following information:
[I::find_re_from_seqs] number restriction enzyme cutting sites found in sequences: 336082
[I::find_re_from_seqs] restriction enzyme cutting sites density: 0.008152
[I::main] dump hic links (BAM) to binary file yahs.out.bin
[I::dump_links_from_bam_file] 1 million records processed, 0 read pairs
[I::dump_links_from_bam_file] 2 million records processed, 0 read pairs
However, even after processing millions of records from the BAM file, there were no read pairs detected. This was confirmed by the message:
0 read pairs processed
Subsequently, I proceeded with the Juicer preprocessing step using the following command:
juicer pre -a -o out_JBAT yahs.out.bin yahs.out_scaffolds_final.agp /NGS/Fungi/Rc/juicer/references/R0301HifiHicOnt.asm.hic.hap2.p_ctg.fa.fai
Upon completion, the output file out_JBAT.txt contained no data, i.e., its size was effectively 0 bytes.
My question is: given the absence of read pairs in the yahs output, is it normal for the Juicer preprocessing step to generate an empty out_JBAT.txt file? Could the lack of detected read pairs indicate an issue with either the alignment in the BAM file (R0301onthap2.sort.bam) or how yahs is handling the data?
It seems unusual that no valid interactions would be identified, especially considering the large number of records processed. I would appreciate any insights into what might cause such an outcome and suggestions on how to troubleshoot this issue.
Thank you for your attention and assistance.
Best regards, Han jiangna
Hello @Hanjiangna,
Sorry for the delayed reply. This is usually caused by a malformatted BAM file. How did you generate your BAM file? If you can show me the header lines of your BAM file and a few lines of records, I can probably tell the reason.
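For anyone who wants to inspect the header before posting it, here is a minimal sketch in Python. It assumes the header text has already been dumped to plain SAM text (for example with `samtools view -H`), and the function name `summarize_sam_header` is hypothetical, not part of yahs or samtools. It only reports the declared sort order, the number of reference sequences, and the programs recorded in `@PG` lines.

```python
def summarize_sam_header(header_text):
    """Summarize a SAM header given as plain text (hypothetical helper)."""
    summary = {"sort_order": None, "n_sequences": 0, "programs": []}
    for line in header_text.splitlines():
        if not line.startswith("@"):
            continue  # skip anything that is not a header line
        fields = line.split("\t")
        record_type = fields[0]
        # Header fields look like TAG:VALUE; split on the first colon only.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if record_type == "@HD":
            summary["sort_order"] = tags.get("SO")  # e.g. coordinate, queryname
        elif record_type == "@SQ":
            summary["n_sequences"] += 1
        elif record_type == "@PG":
            summary["programs"].append(tags.get("PN") or tags.get("ID"))
    return summary


if __name__ == "__main__":
    # Made-up header for illustration only, not the header from this issue.
    example = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:ctg1\tLN:1000\n@PG\tID:bwa\tPN:bwa\n"
    print(summarize_sam_header(example))
```

The sort order and the `@PG` entries are usually the quickest hints about how the file was produced.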
Best, Chenxi
Hello Developer
Sorry for the late response, as I was occupied with various exams. Below is a screenshot of the header section and a few lines of the record entries.
Best regards,
Han jiangna
Hi Jiangna,
I can see two problems regarding the BAM file you showed.
- The SAM flags say the three read pairs were all properly mapped (the 2nd column), but none of them are really paired. They all have different read names (the 1st column). Two paired reads should be grouped together if sorted by read names.
- For all three read pairs, the two reads were mapped to the exact same position (as indicated by the 4th, 7th, 8th and 9th columns), which does not look right.
There is probably something wrong with your read mapping.
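The two checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how yahs itself reads the file: it takes SAM text records (for example from `samtools view`), assumes the file is unsorted or sorted by read name so that mates are adjacent, and ignores the complication of secondary and supplementary alignments. The function name `check_pairing` and the example records are made up.

```python
def check_pairing(sam_lines):
    """Count adjacent mates and suspicious records in SAM text (hypothetical helper)."""
    records = [
        line.rstrip("\n").split("\t")
        for line in sam_lines
        if line.strip() and not line.startswith("@")  # skip blanks and header lines
    ]
    stats = {
        "records": len(records),
        "adjacent_pairs": 0,            # two consecutive records with the same read name
        "proper_flag_without_mate": 0,  # flag 0x2 set, but no mate next to the record
        "mate_same_position": 0,        # RNEXT is "=" (or RNAME) and PNEXT equals POS
    }
    i = 0
    while i < len(records):
        cur = records[i]
        nxt = records[i + 1] if i + 1 < len(records) else None
        if nxt is not None and nxt[0] == cur[0]:
            stats["adjacent_pairs"] += 1
            group = [cur, nxt]
            i += 2
        else:
            if int(cur[1]) & 0x2:  # 0x2 = "properly paired" bit of the SAM flag
                stats["proper_flag_without_mate"] += 1
            group = [cur]
            i += 1
        for rec in group:
            rname, pos, rnext, pnext = rec[2], rec[3], rec[6], rec[7]
            if rnext in ("=", rname) and pnext == pos:
                stats["mate_same_position"] += 1
    return stats


if __name__ == "__main__":
    # Made-up records that mimic the pattern described above.
    bad = [
        "readA\t99\tctg1\t100\t60\t4M\t=\t100\t0\tACGT\tIIII",
        "readB\t99\tctg1\t250\t60\t4M\t=\t250\t0\tACGT\tIIII",
        "readC\t99\tctg1\t400\t60\t4M\t=\t400\t0\tACGT\tIIII",
    ]
    print(check_pairing(bad))
```

If `adjacent_pairs` stays at zero while `proper_flag_without_mate` grows, the records show the same pattern as in the screenshot.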
Best, Chenxi
Hello Developer,
Thanks for your reply. I will check the read mapping step.
Best wishes!
Han jiangna