
Possible bug for SRA data processing: "Number of $PAIR1_EXT files is different from $PAIR2_EXT [$r1files vs $r2files]..."

Open joreynajr opened this issue 3 years ago • 3 comments

I managed to solve my issue, but I wanted to raise it because it may affect anyone using SRA data, which uses _1 and _2 to designate read 1 and read 2, respectively. In my case, I set PAIR1_EXT = "_1" and PAIR2_EXT = "_2" in the HiC-Pro configuration. However, if your samples happen to be named data_folder/sample_1/fastq_1.fastq.gz and data_folder/sample_1/fastq_2.fastq.gz, HiC-Pro thinks there are two R1 files and one R2 file, because the "_1" in the directory name sample_1 also matches PAIR1_EXT, and it fails to run. There are simple workarounds on my end, but this error is tricky and not easily Google-able, so I wanted to bring it up. The issue occurs on lines 329-330 of the HiC-Pro main script (HiC-Pro/bin/HiC-Pro); the code is pasted below for reference:

r1files=$(find -L $RAW_DIR -mindepth 2 -maxdepth 2 -name "*.fastq" -o -name "*.fastq.gz" -o -name "*.fq.gz" -o -name "*.fq" | grep "$PAIR1_EXT" | wc -l) #!

r2files=$(find -L $RAW_DIR -mindepth 2 -maxdepth 2 -name "*.fastq" -o -name "*.fastq.gz" -o -name "*.fq.gz" -o -name "*.fq" | grep "$PAIR2_EXT" | wc -l) #!

On lines 331-332 you also get this error message, which is somewhat helpful: "Number of $PAIR1_EXT files is different from $PAIR2_EXT [$r1files vs $r2files]. Please, note that the paired-end files are detected using the PAIR1_EXT/PAIR2_EXT parameters. Be sure that there is no conflict with files/dir names." Overall, I opened this issue in case others run into similar problems, in the hope that it helps them out.
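To make the mismatch concrete, here is a minimal sketch that reproduces the counting problem in a throwaway directory and compares it with a stricter count. It is not HiC-Pro's code or an official fix. The directory name `rawdata` and the helper functions `list_fastq`, `count_full_path`, and `count_basename` are hypothetical names chosen for illustration, and the stricter pattern assumes the pair tag sits directly before the file extension.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical layout reproducing the report: the sample directory
# name itself contains "_1".
workdir=$(mktemp -d)
cd "$workdir"
RAW_DIR="rawdata"
mkdir -p "$RAW_DIR/sample_1"
touch "$RAW_DIR/sample_1/fastq_1.fastq.gz" "$RAW_DIR/sample_1/fastq_2.fastq.gz"

PAIR1_EXT="_1"
PAIR2_EXT="_2"

# List candidate FASTQ files exactly two levels below RAW_DIR.
list_fastq() {
    find -L "$RAW_DIR" -mindepth 2 -maxdepth 2 \
        \( -name "*.fastq" -o -name "*.fastq.gz" -o -name "*.fq.gz" -o -name "*.fq" \)
}

# Count as in the quoted lines: grep sees the whole path, so the
# directory name "sample_1" also matches "_1".
count_full_path() {
    list_fastq | grep -c -- "$1" || true
}

# Stricter count (an assumption, not HiC-Pro behaviour): drop the
# directory part, then require the tag right before the extension.
count_basename() {
    list_fastq | sed 's|.*/||' | grep -c -E -- "$1"'\.(fastq|fq)(\.gz)?$' || true
}

r1_naive=$(count_full_path "$PAIR1_EXT")
r2_naive=$(count_full_path "$PAIR2_EXT")
r1_strict=$(count_basename "$PAIR1_EXT")
r2_strict=$(count_basename "$PAIR2_EXT")

echo "full path: $r1_naive vs $r2_naive"    # 2 vs 1, the reported mismatch
echo "basename:  $r1_strict vs $r2_strict"  # 1 vs 1

# Clean up the throwaway directory.
cd / && rm -rf "$workdir"
```

Without touching HiC-Pro itself, the same idea suggests the workaround: keep the PAIR1_EXT/PAIR2_EXT strings out of the sample directory names, for example by renaming sample_1 to sample1.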

Joaquin

joreynajr avatar Apr 13 '21 01:04 joreynajr

Thanks @joreynajr. Indeed, this is a common issue. Which HiC-Pro version did you use? I thought I had improved that in the latest version... Best

nservant avatar Apr 13 '21 08:04 nservant

Hi @nservant, I am using the HiC-Pro Singularity image, which runs HiC-Pro 2.11.4; maybe that is the problem? I wanted to avoid installing HiC-Pro, but I can work with my workaround for now.

Thanks, Joaquin

joreynajr avatar Apr 13 '21 18:04 joreynajr

This is a recent version. So even though v3.0.0 has been released, I'm not sure it will solve the issue. I'll keep this issue open to see if we can improve that in the future. Thanks

nservant avatar Apr 13 '21 21:04 nservant