
Warning label: different read numbers in pack

Open bamorim-bio opened this issue 3 years ago • 8 comments

Hi, I was hoping you could clarify a warning that I got:

fastp --in1 /Users/beatrizamorim/Desktop/mtDNA/data/DEMI115_R1.fastq.gz \
  --in2 /Users/beatrizamorim/Desktop/mtDNA/data/DEMI115_R2.fastq.gz \
  --detect_adapter_for_pe \
  -c \
  -p \
  --qualified_quality_phred 30 \
  --dedup \
  --out1 DEMI115_R1.qc.fq.gz \
  --out2 DEMI115_R2.qc.fq.gz \
  --unpaired1 singletons.DEMI115.qc.fq.gz \
  --unpaired2 singletons.DEMI115.qc.fq.gz \
  --failed_out failed.DEMI115.qc.fq.gz \
  --json fastp.DEMI115.json \
  --html fastp.DEMI115.html \
  --thread 4

Detecting adapter sequence for read1...
No adapter detected for read1

Detecting adapter sequence for read2...

Illumina TruSeq Adapter Read 2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT

WARNNIG: different read numbers of the 932 pack
Read1 pack size: 1000
Read2 pack size: 91

This is the first time I have gotten an error like this; none of the samples before this one had any problem. The program got stuck on this error.

Does this mean there is something wrong with my reads? Should I exclude this sample altogether?

I am using version 0.23.1.

bamorim-bio avatar Nov 20 '21 16:11 bamorim-bio

Hi @bamorim-bio I have the same warning. Did you figure this out?

georgia-katsoula avatar Dec 12 '21 12:12 georgia-katsoula

Can you consistently reproduce this issue when you rerun the command?

sfchen avatar Dec 12 '21 13:12 sfchen

Thank you for the quick response. Yes, I get this error for one of my samples (I reran it 3 times) and the process gets stuck. My command looks like this (it is part of a Snakemake file):

fastp \
          -i {input.fq1} \
          -o {output.trimmed_1} \
          -I {input.fq2} \
          -O {output.trimmed_2} \
          --unpaired1 {output.unpaired_1} \
          --unpaired2 {output.unpaired_2} \
          --failed_out {output.failed} \
          --detect_adapter_for_pe \
          --overrepresentation_analysis \
          --qualified_quality_phred 30 \
          --html {output.report_html} \
          --json {output.report_json} 2>&1 > {log}

Output:

Detecting adapter sequence for read1...
No adapter detected for read1

Detecting adapter sequence for read2...
CTCATTTACACCAACCACCCAACTATCTATAAACCTAGCCATGGCCATCCCCTTATGAGC


WARNNIG: different read numbers of the 22739 pack
Read1 pack size: 173
Read2 pack size: 1000

georgia-katsoula avatar Dec 12 '21 14:12 georgia-katsoula

> Hi @bamorim-bio I have the same warning. Did you figure this out?

Hi! I did. I figured out that this error occurred with samples that had a different number of sequences in read 1 and read 2. For example, sample A had 1.7M read 1 sequences and 1M read 2 sequences. I had to find a way to fix this because the error persisted in my downstream analysis: even when I found other software like fastp that could run with these unequal read counts, I also got errors when aligning with BWA.
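The mismatch described above can be checked directly before running fastp. A minimal sketch, assuming gzipped FASTQ files with the standard four lines per record (no wrapped sequence lines); `count_reads` and `check_pair` are hypothetical helper names, not part of fastp or BBTools:

```shell
# Count FASTQ records in a gzipped file.
# Assumption: each record is exactly 4 lines.
count_reads() {
    local lines
    lines=$(gzip -dc "$1" | wc -l | tr -d ' ')
    echo $(( lines / 4 ))
}

# Print both counts; return 0 if the mates have the same number of
# records, non-zero otherwise.
check_pair() {
    local n1 n2
    n1=$(count_reads "$1")
    n2=$(count_reads "$2")
    echo "R1: $n1 reads, R2: $n2 reads"
    [ "$n1" -eq "$n2" ]
}
```

For example, `check_pair SampleA_R1.fastq.gz SampleA_R2.fastq.gz` prints both counts and returns a non-zero status when they differ.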

What ended up working for me was repairing the reads with BBTools:

repair.sh -Xmx14g in1=SampleA_R1.fastq.gz in2=SampleA_R2.fastq.gz out1=SampleA_R1_repaired.fastq.gz out2=SampleA_R2_repaired.fastq.gz outs=/SampleA_single.fastq.gz repair

Afterwards, fastp worked fine!
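Since repair.sh re-pairs reads by name, one way to confirm that the repaired files are in sync is to compare the read IDs of both mates in order. A rough sketch, assuming four-line gzipped FASTQ records whose headers carry the read ID as the first whitespace-delimited token, optionally followed by a `/1` or `/2` suffix; `read_ids` and `pairs_in_sync` are hypothetical helper names:

```shell
# Print one read ID per record: the first token of each header line,
# with a trailing /1 or /2 mate suffix removed.
read_ids() {
    gzip -dc "$1" | awk 'NR % 4 == 1 { id = $1; sub(/\/[12]$/, "", id); print id }'
}

# Return 0 if both files list the same read IDs in the same order.
pairs_in_sync() {
    local ids1 ids2 rc
    ids1=$(mktemp) || return 2
    ids2=$(mktemp) || return 2
    read_ids "$1" > "$ids1"
    read_ids "$2" > "$ids2"
    rc=0
    cmp -s "$ids1" "$ids2" || rc=$?
    rm -f "$ids1" "$ids2"
    return "$rc"
}
```

For example, `pairs_in_sync SampleA_R1_repaired.fastq.gz SampleA_R2_repaired.fastq.gz` returns 0 only when every record in one file has its mate at the same position in the other.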

bamorim-bio avatar Dec 12 '21 14:12 bamorim-bio

Thank you so much @bamorim-bio for taking the time! I will try this out. :)

georgia-katsoula avatar Dec 12 '21 14:12 georgia-katsoula

> Thank you so much @bamorim-bio for taking the time! I will try this out. :)

Let me know if you need help or if that didn't work for you!

I also forgot to mention that I had seen the reads had different numbers of sequences while doing QC with FastQC!

my email is [email protected] :)

bamorim-bio avatar Dec 12 '21 16:12 bamorim-bio

Hi, I got this same problem: fastp keeps running in the background and never exits, with process status "S". Info in the log file: WARNNIG: different read numbers of the 30908 pack Read1 pack size: 224 Read2 pack size: 1000

LvLH avatar Jun 27 '22 07:06 LvLH

I think the fastp code should be altered, if possible, to catch the cause of this error and make the program exit gracefully instead of hanging indefinitely.

fastp -i /home/input/sample_1.fq.gz -q 20 -l 50 -o qc/sample_1.fq.gz -I /home/input/sample_2.fq.gz -O qc/sample_2.fq.gz --json qc/fastp.json --html qc/fastp.html --disable_adapter_trimming --failed_out qc/fail.fq.gz

Result after 15 hrs (I forgot and left it to run overnight): WARNNIG: different read numbers of the 4614 pack Read1 pack size: 169 Read2 pack size: 1000
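Until fastp handles this case itself, one possible workaround for the hang is to put a wall-clock limit around the call. A sketch, assuming GNU coreutils `timeout` is installed (it is not available by default on macOS); `run_with_limit` is a hypothetical helper and the limit is an arbitrary choice:

```shell
# Run a command under a wall-clock limit using GNU coreutils `timeout`.
# Usage: run_with_limit <duration> <command> [args...]
run_with_limit() {
    local limit status
    limit="$1"
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    # GNU timeout exits with status 124 when the limit is reached.
    if [ "$status" -eq 124 ]; then
        echo "command timed out after $limit: $*" >&2
    fi
    return "$status"
}
```

Calling it as `run_with_limit 2h fastp ...` would end a stuck run after two hours with exit status 124, so a pipeline can flag the sample instead of waiting overnight. This only stops the hang; the unequal read counts in the input still need to be fixed.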

jessicarowell avatar Sep 09 '22 13:09 jessicarowell

Hi @bamorim-bio, thanks for posting the BBTools fix. I still get the "different read numbers" warning, but fastp now runs to the end!

vinisalazar avatar Feb 14 '23 04:02 vinisalazar

@sfchen thank you for providing fastp. It's amazing.

Would you have an estimate as to when you could incorporate these fixes and release a new version? I'm sure it would benefit many users.

Best, V

vinisalazar avatar Feb 14 '23 04:02 vinisalazar