HiC-Pro

[bowtie_pairing] Error 1

Open linshengnan09 opened this issue 2 years ago • 28 comments

Hello! I installed HiC-Pro with conda, but the step "Pairing of R1 and R2 tags ..." always fails. Can you tell me how to solve this problem? Thank you! The log:

Run HiC-Pro 3.1.0

Tue Sep 21 23:35:28 CST 2021
Bowtie2 alignment step1 ...
Logs: logs/sample1/mapping_step1.log
Logs: logs/sample2/mapping_step1.log

Wed Sep 22 19:25:19 CST 2021
Bowtie2 alignment step2 ...
Logs: logs/sample1/mapping_step2.log
Logs: logs/sample2/mapping_step2.log

Thu Sep 23 01:51:50 CST 2021
Combine R1/R2 alignment files ...
Logs: logs/sample1/mapping_combine.log
Logs: logs/sample2/mapping_combine.log

Thu Sep 23 02:58:25 CST 2021
Mapping statistics for R1 and R2 tags ...
Logs: logs/sample1/mapping_stats.log
Logs: logs/sample2/mapping_stats.log

Thu Sep 23 04:23:58 CST 2021
Pairing of R1 and R2 tags ...
Logs: logs/sample1/mergeSAM.log
make: *** [bowtie_pairing] Error 1

The mergeSAM.log:

~/miniconda3/envs/hic-pro_env/bin/python ~/02_software/hic-pro/HiC-Pro-3.1.0/scripts/mergeSAM.py -q 10 -t -v -f bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam -r bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam -o bowtie_results/bwt2/sample1/XB_merged_canu_asm.fasta.bwt2pairs.bam
[E::idx_find_and_load] Could not retrieve index file for 'bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam'
[E::idx_find_and_load] Could not retrieve index file for 'bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam'

mergeBAM.py

forward= bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam

reverse= bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam

output= bowtie_results/bwt2/sample1/XB_merged_canu_asm.fasta.bwt2pairs.bam

min mapq= 10

report_single= False

report_multi= False

verbose= True

Merging forward and reverse tags ...

Forward and reverse reads not paired. Check that BAM files have the same read names and are sorted.

linshengnan09 avatar Sep 23 '21 01:09 linshengnan09

Hi, it seems that your R1 and R2 files are not paired. Could you please show me the first lines of your fastq files so I can double-check the read names? Thanks

nservant avatar Sep 23 '21 07:09 nservant

R1:
@A00511:346:HLNVMDSXY:1:1101:1054:1000 1:N:0:CAAGTCTA+GCCTTAAT
ANTCCCGGAAAGTGCTGAGGTTTGGGCCCCTGAGACGAGAGACGTCAGGATAGACTGGGTTAGCCCCGGTTGGTTTTCAATTTATGAATCATCCTTCAAGTTTG
+
F#FFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFF

R2:
@A00511:346:HLNVMDSXY:1:1101:1054:1000 2:N:0:CAAGTCTA+GCCTTAAT
GACTTTGACGTTCGAGCGTGGCATTGGTATGTGGACTGGTGGTATGTTGGTTTGCGGTTTGGTTTGGAAAGATCGATCAAACGCCAGACGGAAGGCATGACTTG
+
FFFFF::FFFFFFFFFF,FFFFFFFFFFFFFFFFF,FF,FFF,FFFFFFFFF:,FFF,FFFFFFFFFF:FFF:FFFFFFFFFFFFFFFFFFF,:FFF:FFF:FF

linshengnan09 avatar Sep 23 '21 07:09 linshengnan09

I'm wondering if the "1:N:0" on one side and the "2:N:0" on the other side could explain the issue. But I thought that had been solved in a previous version ... Before checking that, are you 100% sure that you have exactly the same reads in both R1/R2 files? N

nservant avatar Sep 23 '21 08:09 nservant
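Whether the "1:N:0" / "2:N:0" fields matter can be checked on the two headers posted above. The following is a minimal sketch, assuming that only the part of the header before the first whitespace is used as the read name; the helper `read_id` is hypothetical and is not part of HiC-Pro.

```python
def read_id(fastq_header: str) -> str:
    """Return the read name from a FASTQ header line.

    Assumption: the name is everything after the leading '@' and before
    the first whitespace; a trailing '/1' or '/2' mate suffix is dropped.
    """
    name = fastq_header.strip().lstrip("@").split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


# The two headers shown in the thread.
r1 = "@A00511:346:HLNVMDSXY:1:1101:1054:1000 1:N:0:CAAGTCTA+GCCTTAAT"
r2 = "@A00511:346:HLNVMDSXY:1:1101:1054:1000 2:N:0:CAAGTCTA+GCCTTAAT"

# Compare the mates once the comment field is ignored.
print(read_id(r1) == read_id(r2))
```

Under that assumption the two mates reduce to the same identifier, so the comment field alone would not explain the pairing error.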

I remember running the same data before without any problem. Later I ran it again and it reported an error, so I installed the latest version, but the same error was reported. There was also no problem with the data when I ran 3d-dna.

linshengnan09 avatar Sep 23 '21 08:09 linshengnan09

Would you mind sharing these two files with me, please (by private message)?
bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam
bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam

Or maybe just a few thousand reads. Thanks

nservant avatar Sep 23 '21 08:09 nservant

How can I share these two files with you? The files are too big. Shall I split them up and send them by mail?

linshengnan09 avatar Sep 23 '21 09:09 linshengnan09

Yes, or use a file-sharing service such as WeTransfer, for instance.

nservant avatar Sep 23 '21 09:09 nservant

ok, Could you give me your email address?

linshengnan09 avatar Sep 23 '21 13:09 linshengnan09

OK, I have sent the files to you via WeTransfer, please check.

linshengnan09 avatar Sep 23 '21 15:09 linshengnan09

got them. I'll come back to you N

nservant avatar Sep 23 '21 15:09 nservant

Sorry, I cannot open the files. These are truncated SAM files, and I can't convert them to BAM files. Could you send the complete files?

nservant avatar Sep 23 '21 16:09 nservant

Sorry, the size of the complete file is 30G, so I converted the file to BAM again and sent it to you.

linshengnan09 avatar Sep 24 '21 01:09 linshengnan09

OK, so there is indeed an issue: the order of the reads differs between the two files. So either there is something wrong with the fastq files, or something went wrong at a sorting step ...

nservant avatar Sep 24 '21 10:09 nservant

The error occurs at line 187 in the R1 file:

>>samtools view XB_R2_merged_canu_asm.fasta.bwt2merged.bam.bam | awk '{print $1}' | head -190 | tail
A00511:346:HLNVMDSXY:1:1101:31882:1000
A00511:346:HLNVMDSXY:1:1101:31937:1000
A00511:346:HLNVMDSXY:1:1101:32081:1000
A00511:346:HLNVMDSXY:1:1101:32136:1000
A00511:346:HLNVMDSXY:1:1101:32316:1000
A00511:346:HLNVMDSXY:1:1101:32515:1000
A00511:346:HLNVMDSXY:1:1101:1063:1016
A00511:346:HLNVMDSXY:1:1101:1118:1016
A00511:346:HLNVMDSXY:1:1101:1262:1016
A00511:346:HLNVMDSXY:1:1101:1768:1016
(/data/users/nservant/projects_analysis/kdi_home/conda/hic-pro-3.1.0) 
nservant@u900-bdd-1-203n-6985:/data/tmp/nservant/hic-pro$
>>samtools view XB_R1_merged_canu_asm.fasta.bwt2merged.bam.bam | awk '{print $1}' | head -190 | tail
A00511:346:HLNVMDSXY:1:1101:31882:1000
A00511:346:HLNVMDSXY:1:1101:31937:1000
A00511:346:HLNVMDSXY:1:1101:32081:1000
A00511:346:HLNVMDSXY:1:1101:32136:1000
A00511:346:HLNVMDSXY:1:1101:32316:1000
A00511:346:HLNVMDSXY:1:1101:32515:1000
A00511:346:HLNVMDSXY:1:1101:32786:1000
A00511:346:HLNVMDSXY:1:1101:1118:1016
A00511:346:HLNVMDSXY:1:1101:1985:1016
A00511:346:HLNVMDSXY:1:1101:2040:1016

You can see that the read order starts to change ...

nservant avatar Sep 24 '21 10:09 nservant
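The comparison above was done by eye on `head -190 | tail`. As an illustration only, the same check can be sketched in Python over two lists of read names; the function `first_mismatch` is a hypothetical helper, and the lists are the ten names (positions 181 to 190) printed in the comment above.

```python
from itertools import zip_longest


def first_mismatch(names_a, names_b, start=1):
    """Return (position, name_a, name_b) for the first differing pair, or None.

    `start` is the position of the first element, so the result can be
    reported as a line number in the original files.
    """
    pairs = zip_longest(names_a, names_b)
    for pos, (a, b) in enumerate(pairs, start=start):
        if a != b:
            return pos, a, b
    return None


prefix = "A00511:346:HLNVMDSXY:1:1101:"
# Read names 181-190 of each file, as shown in the thread.
r2_names = [prefix + s for s in (
    "31882:1000", "31937:1000", "32081:1000", "32136:1000", "32316:1000",
    "32515:1000", "1063:1016", "1118:1016", "1262:1016", "1768:1016",
)]
r1_names = [prefix + s for s in (
    "31882:1000", "31937:1000", "32081:1000", "32136:1000", "32316:1000",
    "32515:1000", "32786:1000", "1118:1016", "1985:1016", "2040:1016",
)]

print(first_mismatch(r1_names, r2_names, start=181))
```

On these ten names the first difference is at position 187, which matches the line number reported above. For full files, the names could be streamed from `samtools view` output instead of being held in lists.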

It must be the sorting step: the BAM files cannot be sorted. But how can I solve this problem? I changed samtools from version 1.12 to 1.11, and that still did not solve it.

linshengnan09 avatar Sep 24 '21 12:09 linshengnan09

I do not think it is linked to the samtools version, but rather to the RAM you are using to sort the data (or the disk space, if the sort command is swapping). Do you have any messages related to sort in the log file for the step that merges the two mapping steps?

nservant avatar Sep 24 '21 13:09 nservant

I checked the mapping_combine.log:

~/01_bin/samtools merge -@ 36 -n -f bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam bowtie_results/bwt2_global/sample1/XB_R2_merged_canu_asm.fasta.bwt2glob.bam bowtie_results/bwt2_local/sample1/XB_R2_merged_canu_asm.fasta.bwt2glob.unmap_bwt2loc.bam
~/01_bin/samtools merge -@ 36 -n -f bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam bowtie_results/bwt2_global/sample1/XB_R1_merged_canu_asm.fasta.bwt2glob.bam bowtie_results/bwt2_local/sample1/XB_R1_merged_canu_asm.fasta.bwt2glob.unmap_bwt2loc.bam
~/01_bin/samtools sort -@ 36 -m 21M -n -T tmp/XB_R1_merged_canu_asm.fasta -o bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.sorted.bam bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam
~/01_bin/samtools sort -@ 36 -m 21M -n -T tmp/XB_R2_merged_canu_asm.fasta -o bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.sorted.bam bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam
[bam_sort_core] merging from 6120 files and 36 in-memory blocks...
[E::hts_open_format] Failed to open file "tmp/XB_R1_merged_canu_asm.fasta.1018.bam" : Too many open files
samtools sort: fail to open "tmp/XB_R1_merged_canu_asm.fasta.1018.bam": Too many open files
[bam_sort_core] merging from 6120 files and 36 in-memory blocks...
[E::hts_open_format] Failed to open file "tmp/XB_R2_merged_canu_asm.fasta.1018.bam" : Too many open files
samtools sort: fail to open "tmp/XB_R2_merged_canu_asm.fasta.1018.bam": Too many open files

linshengnan09 avatar Sep 24 '21 13:09 linshengnan09

Yes, well done! So the samtools sort failed. If you look at your command, -@ 36 -m 21M means that it only has 21 MB to sort the file, which is too little. So it has to swap to disk a lot and generates too many tmp files.

This memory parameter is SORT_RAM in your configuration file. By default, it is set to 1000M, so I guess you changed it to 21. Please try increasing this RAM parameter.

nservant avatar Sep 24 '21 13:09 nservant
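The log reports a merge from 6120 temporary files and fails while opening the file numbered 1018. One plausible reading, which is an assumption and not something stated in the thread, is that the process reached its per-process limit on open file descriptors. A rough sketch for comparing a temporary-file count against that limit follows; the function name and the `reserved` allowance are hypothetical, and the `resource` module is only available on Unix-like systems.

```python
import resource


def exceeds_fd_limit(n_tmp_files: int, soft_limit: int, reserved: int = 16) -> bool:
    """True if opening all temporary files at once would pass the limit.

    `reserved` is a guessed allowance for descriptors already in use
    (stdin/stdout/stderr, the output BAM, ...); it is an assumption.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return n_tmp_files + reserved > soft_limit


# Current limits for this process (soft limit is the one that applies).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")

# 6120 is the temporary-file count from the mapping_combine.log above.
print(exceeds_fd_limit(6120, soft))
```

If the count is above the limit, giving the sort more memory per thread reduces the number of temporary files, which is the direction suggested in the reply above.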

Is the parameter in the bowtie_combine.sh script?

bowtie_combine.sh:
## Set a default for legacy config files that do not have SORT_RAM set
if [[ "${SORT_RAM}" == "" ]]; then
    SORT_RAM="768"
fi

linshengnan09 avatar Sep 24 '21 13:09 linshengnan09

Yes, but you have it in the config-hicpro.txt file. You do not need to modify the code.

nservant avatar Sep 24 '21 13:09 nservant

ok, I found that this parameter setting is missing in my config-hicpro.txt file

linshengnan09 avatar Sep 24 '21 13:09 linshengnan09

Ah ok, maybe that's an old config file. But that's strange, because as you pointed out, in this case, it should be fixed to 768 ... so I do not really understand why you have 21 in your log

nservant avatar Sep 24 '21 14:09 nservant

I set SORT_RAM to 1000M in the config-hicpro.txt, but it didn't work. The mapping_combine.log:

~/01_bin/samtools merge -@ 36 -n -f bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam bowtie_results/bwt2_global/sample1/XB_R1_merged_canu_asm.fasta.bwt2glob.bam bowtie_results/bwt2_local/sample1/XB_R1_merged_canu_asm.fasta.bwt2glob.unmap_bwt2loc.bam
~/01_bin/samtools merge -@ 36 -n -f bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam bowtie_results/bwt2_global/sample1/XB_R2_merged_canu_asm.fasta.bwt2glob.bam bowtie_results/bwt2_local/sample1/XB_R2_merged_canu_asm.fasta.bwt2glob.unmap_bwt2loc.bam
~/01_bin/samtools sort -@ 36 -m 27M -n -T tmp/XB_R1_merged_canu_asm.fasta -o bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.sorted.bam bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam
~/01_bin/samtools sort -@ 36 -m 27M -n -T tmp/XB_R2_merged_canu_asm.fasta -o bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.sorted.bam bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam
[bam_sort_core] merging from 4752 files and 36 in-memory blocks...
[E::hts_open_format] Failed to open file "tmp/XB_R1_merged_canu_asm.fasta.1018.bam" : Too many open files
samtools sort: fail to open "tmp/XB_R1_merged_canu_asm.fasta.1018.bam": Too many open files
[bam_sort_core] merging from 4752 files and 36 in-memory blocks...
[E::hts_open_format] Failed to open file "tmp/XB_R2_merged_canu_asm.fasta.1018.bam" : Too many open files
samtools sort: fail to open "tmp/XB_R2_merged_canu_asm.fasta.1018.bam": Too many open files

linshengnan09 avatar Sep 25 '21 06:09 linshengnan09

Hi, there is something wrong: if you look at your new logs, you are still using -m 27M. N

nservant avatar Sep 27 '21 11:09 nservant

Sorry, I know what's going on! Actually, the SORT_RAM parameter is divided by the number of CPUs. For instance, using 1000M with 4 CPUs means that samtools sort is run with 250M of RAM. So it makes sense ... you have 1000M / 36 CPUs = 27M of RAM.

I would suggest decreasing the number of CPUs, to 8 for instance (this is enough!), or increasing the SORT_RAM parameter again. Best

nservant avatar Sep 27 '21 12:09 nservant
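The arithmetic described above can be written down as a short sketch. It assumes integer division, which agrees with the `-m 27M` seen in the log for 1000M and 36 CPUs; the earlier `-m 21M` is also consistent with the legacy default of 768 divided by 36. The function names are hypothetical and are not taken from the HiC-Pro scripts.

```python
def per_thread_sort_mem(sort_ram_mb: int, n_cpu: int) -> int:
    """Memory in MB given to `samtools sort -m`, assuming SORT_RAM // CPUs."""
    return sort_ram_mb // n_cpu


def sort_ram_for(target_per_thread_mb: int, n_cpu: int) -> int:
    """SORT_RAM in MB needed to reach a given per-thread amount."""
    return target_per_thread_mb * n_cpu


print(per_thread_sort_mem(768, 36))   # legacy default, 36 CPUs
print(per_thread_sort_mem(1000, 36))  # SORT_RAM = 1000M, 36 CPUs
print(per_thread_sort_mem(1000, 4))   # the example given in the reply
print(per_thread_sort_mem(1000, 8))   # the suggested 8 CPUs
print(sort_ram_for(768, 36))          # SORT_RAM that keeps 768M per thread
```

Either lever works in this model: fewer CPUs raises the per-thread share, and a larger SORT_RAM does the same at a fixed CPU count, provided the machine actually has that much memory.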

Hi, I also have the same problem. But the combine.log only has two lines:

$ more result/logs/rep1/mapping_combine.log
/usr/local/anaconda/bin/samtools merge -@ 2 -n -f bowtie_results/bwt2/rep1/SRR4015027_pass_2_hg19.bwt2merged.bam bowtie_results/bwt2_global/rep1/SRR4015027_pass_2_hg19.bwt2glob.bam bowtie_results/bwt2_local/rep1/SRR4015027_pass_2_hg19.bwt2glob.unmap_bwt2loc.bam
/usr/local/anaconda/bin/samtools merge -@ 2 -n -f bowtie_results/bwt2/rep1/SRR4015027_pass_1_hg19.bwt2merged.bam bowtie_results/bwt2_global/rep1/SRR4015027_pass_1_hg19.bwt2glob.bam bowtie_results/bwt2_local/rep1/SRR4015027_pass_1_hg19.bwt2glob.unmap_bwt2loc.bam

Would you please help me to debug?

seasky002002 avatar Jan 28 '22 16:01 seasky002002

Hello, I have received your email. Wishing you a happy life and smooth work!

linshengnan09 avatar Jan 28 '22 16:01 linshengnan09

And here is the main log:

Thu Jan 27 20:11:41 CST 2022
Combine R1/R2 alignment files ...
Logs: logs/rep1/mapping_combine.log
make: *** [/usr/local/bin/HiC-Pro_2.11.4/bin/../scripts//Makefile:115: bowtie_combine] Error 129
Hangup

seasky002002 avatar Jan 28 '22 16:01 seasky002002
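This second report shows a different exit status (129, with "Hangup") from the original "Error 1". By shell convention, an exit status above 128 means the command was terminated by signal number (status - 128). The sketch below decodes such a status; the function name is hypothetical, and reading this run as having received a hangup signal (for example because the terminal session closed) is an assumption drawn only from the log line.

```python
import signal


def describe_exit_status(code: int) -> str:
    """Interpret a shell-style exit status (values above 128 mean 128 + signal)."""
    if code > 128:
        try:
            return f"terminated by {signal.Signals(code - 128).name}"
        except ValueError:
            # No signal with that number is known on this platform.
            return f"terminated by unknown signal {code - 128}"
    return f"exited with status {code}"


print(describe_exit_status(129))  # status from the second report
print(describe_exit_status(1))    # status from the original report
```

If that reading is right, the cause would lie in how the job was launched, not in the sort memory discussed earlier in the thread.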